WO2019030776A1 - Robotic device driven by artificial intelligence (ai) capable of correlating historic events with that of present events for indexing of imagery captured in camcorder device and retrieval - Google Patents

Robotic device driven by artificial intelligence (ai) capable of correlating historic events with that of present events for indexing of imagery captured in camcorder device and retrieval

Info

Publication number
WO2019030776A1
Authority
WO
WIPO (PCT)
Prior art date
Application number
PCT/IN2018/050521
Other languages
French (fr)
Inventor
Kumar ESWARAN
Original Assignee
Eswaran Kumar
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Eswaran Kumar filed Critical Eswaran Kumar
Publication of WO2019030776A1 publication Critical patent/WO2019030776A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/73Querying
    • G06F16/732Query formulation
    • G06F16/7328Query by example, e.g. a complete video frame or video sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/71Indexing; Data structures therefor; Storage structures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/004Artificial life, i.e. computing arrangements simulating life
    • G06N3/008Artificial life, i.e. computing arrangements simulating life based on physical entities controlled by simulated intelligence so as to replicate intelligent life forms, e.g. based on robots replicating pets or humans in their appearance or behaviour



Abstract

A system and method is provided which enables a robotic device to perform intelligent learning and memory activity: by learning past events recorded in video, it can recall and play back any given event, much as a human being does. This enables quick recall of a previous event connected with a present event, and indexing of it. The Video-Audio (VA) system was able to find the closest frame and then play back the movie starting from that point. The memory recall and play-back had high levels of accuracy and was successful on a very large number of test images. The whole training and testing was done within 10-15 minutes on a laptop computer. The invention thus assists in accentuating the learning and processing of digital imagery, using Artificial Intelligence (AI) tools, with accuracy, for indexing a succession of images (with audio) recorded by devices such as a camcorder.

Description

Robotic Device driven by Artificial Intelligence (AI) capable of correlating historic events with that of present events for Indexing of imagery captured in Camcorder device and retrieval
FIELD OF INVENTION:-
This invention relates to the field of digital data processing of imagery. Still further, this invention relates to the field of Artificial Intelligence (AI) technology for a robotic device operated by a computer program, which can quickly recall a previous event connected with a present event and index it to a previous scene which was captured by an online camera and recorded. Furthermore, this invention relates to systems which have the ability to predict the course of events which may immediately follow a given scene (event), much like a human being does by recalling previously occurred historical events.
BACKGROUND OF INVENTION:-
It is well known that in the field of imagery and image processing, including Video and Audio (VA) data, the sheer voluminous content of data, colloquially known as frames, poses a great challenge for analysis of the data and obtaining meaningful inferences from the same.
Earlier approaches by the present inventor in this area of VA data analytics include a method of separation of d-dimensional data by finding hyperplanes that separate each data point from every other. It included computation using non-iterative algorithms that perform this task, which were described in those attempts. These systems also described how a classification system can then be developed by using the algorithms, and the various methods involved in performing classification tasks using a suitable architecture of processing elements, determined by the algorithms, were delineated.
But it can be seen that rapidity in the action of capturing a past event and correlating it to a present event requires an Artificial Intelligence (AI) based system which can work like a very quick memory device, recalling a previous event from memory and playing out the entire movie sequence from then on.
Further, an approximately equivalent scene from past footage, not necessarily exactly equal to the present scene, is enough for the AI based system and algorithm to identify, recall, and correlate to the present sequence in the VA data.
OBJECT OF THE INVENTION:-
The object of the present invention is to use Artificial Intelligence (AI) technology so that a robotic device operated by a computer program can be trained to quickly recall a previous event connected with a present event and index it.
It is a further object of the invention to provide systems which predict the course of events that may immediately follow, much like a human being does by recalling previously occurring historical events.
Still further, it is an object of the invention to accentuate digital data processing of imagery, using Artificial Intelligence (AI) tools, with reasonably good accuracy, for indexing of imagery and Video and Audio (VA) data captured using camcorders and similar recording devices.
BRIEF DESCRIPTION OF THE INVENTION
Any given video sequence, and the several frames of its constituent data, are stored in a database which is accessible to a computer or to a Video-Audio (VA) memory device, hereinafter called the VA-System.
The task performed by the VA-System is the following:
(i) Learning of all the frames in the video (or a large sample of frames in the video)
(ii) The learning is done in such a manner that, given another image frame or a short audio clip, the VA-System will be able to:
(a) find the closest matching frame, or the frame which contains the closest audio sound
(b) find out the 'time stamp' of the closest matching frame, say (tr) in Table A; after performing (a) and (b) the VA-System should be able to:
(iii) play out, i.e. recall, the subsequent events after this time tr; i.e. it should be able to perform retrieval of the subsequent frames taken at any given instance of time.
Thus it can be observed that the VA-System is an AI system which works like a very quick memory device that can recall a previous event from memory and play out the entire movie sequence from then on. The key input in this case is an initial scene which approximates (but need not be exactly equal to) some scene in the video.
DESCRIPTION OF THE INVENTION IN DETAIL:-
The approach for solution of the problem is basically concerned with images, each frame of the image being considered as a d-dimensional vector as shown in the second column of the Table depicted below. (The procedure for audio recall is similar: it will be c-dimensional data which we need to analyze in the Table.)
[Table A: reproduced as an image in the original document]
If the input is a Video containing L frames of data (say L=100,000 frames), each of the frames could be either the original image or a dimension-reduced image of the original frame. Then the dimension of such a frame is d, as explained by the table above.
It is then assumed that we are given a scene (image) which approximates some image in some frame of the Video. The Query for which an answer is sought through this exercise is (stated briefly): what is the time stamp tr of this frame in the Video?
Before we answer the Query, we state as follows: this method uses the separation-of-points-by-planes algorithm, which has been explained in detail in the previous patent in all completeness and hence will not be repeated here.
The Video data corresponding to the image in a single frame is considered as a d-dimensional "point" in an abstract d-dimensional space.
Similarly, if the Audio data is considered, then the audio data corresponding to a single frame will be considered as a c-dimensional "point" in an abstract c-dimensional space.
However, in the explanation below we will only be describing the method as applied to the d-dimensional image data. The audio part of the frames can be analyzed similarly, by applying the same technique to the c-dimensional data points.
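As an illustrative sketch only (not part of the patent text), the representation of a frame as a d-dimensional point can be coded as follows. The grayscale input, the block-averaging resize, and the 30x30 target size are assumptions consistent with the examples given later:

```python
import numpy as np

def frame_to_point(frame, size=30):
    """Reduce a grayscale frame to size x size by block averaging and
    flatten it into a d = size*size dimensional point."""
    h, w = frame.shape
    # crop so the frame tiles evenly into size x size blocks
    frame = frame[: h - h % size, : w - w % size]
    bh, bw = frame.shape[0] // size, frame.shape[1] // size
    blocks = frame.reshape(size, bh, size, bw)
    return blocks.mean(axis=(1, 3)).ravel()  # shape (900,) when size=30

# a synthetic 240x320 grayscale frame stands in for a real camcorder frame
rng = np.random.default_rng(0)
point = frame_to_point(rng.random((240, 320)))
print(point.shape)  # (900,)
```

Block averaging is used only to keep the sketch dependency-free; any image-resizing routine producing a 30x30 result would serve the same purpose.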
In order to answer the Query, the following steps are executed:
STEP 1: All the labelled images (i.e. all the L=100,000 points) are separated in d-dimensional space (for purposes of illustration we assume d=900) by hyper-planes, using our Algorithm for the separation of points by hyper-planes. Say, for example, that the result obtained from the Algorithm is that all the 100,000 points are separated by (say) q=28 planes. The Algorithm will find all the 900 coefficients of each of the q hyper-planes (this involves 25,200 coefficients). The orientation vector (OV) of each of the 100,000 points then needs to be found. Each OV is a q-dimensional Hamming vector, i.e. 28 bits (storing all these Hamming vectors involves 28x10^5 bits, i.e. 350 Kilo Bytes of memory space).
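A hedged sketch of the orientation-vector computation in STEP 1 follows. The separation algorithm that actually chooses the q hyper-planes is described in the inventor's earlier patent and is not reproduced here; random planes stand in for it, so all names, shapes, and data below are illustrative assumptions:

```python
import numpy as np

def orientation_vectors(points, planes, offsets):
    """Compute the q-bit orientation vector of each point: bit j is 1
    when the point lies on the positive side of hyper-plane j."""
    # points: (n, d); planes: (q, d); offsets: (q,)
    return (points @ planes.T + offsets > 0).astype(np.uint8)

rng = np.random.default_rng(1)
pts = rng.random((1000, 900))            # 1000 frames as 900-dim points
planes = rng.standard_normal((28, 900))  # q = 28 stand-in hyper-planes
offs = rng.standard_normal(28)
ovs = orientation_vectors(pts, planes, offs)
print(ovs.shape)  # (1000, 28)
```

Each row of `ovs` is the 28-bit Hamming vector of one frame; at one bit per plane per point, the 100,000-frame case in the text needs 2.8 million bits, matching the 350 KB figure.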
STEP 2: Now, as per the Query, we are given a scene (image); this can be represented as a point Z in d-dimensional space. We are then expected to find at which point of the Video a similar scene occurs, find out the time when it occurred, and find its time stamp, say tr. We proceed as follows: since we are given the point Z, we find its OV with respect to the q existing hyper-planes discovered in STEP 1. Then, by taking dot products of this Orientation Vector (OV) with the OVs of the other points, we find those points which are "planar" neighbours of point Z. (By definition, "planar neighbours" of Z are those points which are separated from Z by the fewest number of partition hyper-planes.)
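The planar-neighbour search of STEP 2 can be sketched as follows. Here neighbours are ranked by Hamming distance between orientation vectors, which directly implements the definition above (fewest separating partition hyper-planes); the dot-product formulation in the text is a closely related ranking. The threshold of 5 planes and all data are illustrative assumptions:

```python
import numpy as np

def planar_neighbours(z_ov, ovs, max_planes=5):
    """Return indices of points separated from Z by at most
    `max_planes` partition hyper-planes (Hamming distance on OVs)."""
    dist = np.count_nonzero(ovs != z_ov, axis=1)
    return np.flatnonzero(dist <= max_planes)

rng = np.random.default_rng(2)
ovs = rng.integers(0, 2, (1000, 28), dtype=np.uint8)  # stored frame OVs
z = ovs[42].copy()
z[:2] ^= 1  # flip two bits: frame 42 is separated from Z by 2 planes
print(42 in planar_neighbours(z, ovs))  # True
```

Because only bit comparisons are involved, this lookup stays fast even for the 100,000-frame case discussed in STEP 1.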
STEP 3: Say, for example, it is discovered from the previous step that there are 11 points within a vicinity of 5 planes with respect to the point Z. Now say that, out of these 11 points, the image which has the highest dot product has a time stamp labelled tr; then this frame contains the scene closest to Z, and thus this frame is recovered. (Alternatively, we may have a situation where 7 of the points have their labels (time stamps) closest to tr, 2 images have their time stamps labelled close to tu, and 2 images have their time stamps close to tv. One can then come to the reasonable conclusion that the frame with the time stamp tr is the required frame.)
STEP 4: Having recalled the frame labelled tr, this frame is provided as output, and all the k frames after this frame, with labels equal to or higher than tr, are played out.
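STEPs 3 and 4 together amount to a majority vote over the neighbours' time stamps followed by playing out every frame from the recovered stamp onward. A minimal sketch, in which the vote tolerance `tol` and all data are assumptions:

```python
import numpy as np

def recall_playback(neigh_stamps, all_stamps, tol=1.0):
    """STEP 3/4 sketch: pick the time stamp most of the planar
    neighbours cluster around, then play out every frame from there."""
    # majority vote: the stamp with the most neighbours within `tol` seconds
    votes = [np.sum(np.abs(neigh_stamps - t) <= tol) for t in neigh_stamps]
    t_r = neigh_stamps[int(np.argmax(votes))]
    return t_r, np.flatnonzero(all_stamps >= t_r)  # indices to play out

stamps = np.arange(0, 10, 0.5)               # frame time stamps (seconds)
neigh = np.array([3.0, 3.5, 3.0, 8.0, 1.0])  # stamps of 5 planar neighbours
t_r, playout = recall_playback(neigh, stamps)
print(t_r)  # 3.0
```

The two stray stamps (8.0 and 1.0) are outvoted by the cluster around 3.0, mirroring the 7-2-2 reasoning in STEP 3.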
Thus one can successfully recall a given scene from the Video knowing only one image.
Enumeration of the Results obtained and Inferencing:-
We briefly give the results of two typical examples which were actually worked out by using our algorithms.
Example 1: We considered a 52-minute video clip of a cartoon movie. The movie consists of approximately 75,000 images. However, only 3 frames per second were sampled for training, which then involved a total of 9360 images. Then the size of each image was reduced to 30x30 pixels, so that every image can be thought of as a point in 900-dimension space. All these 9360 points were separated by hyper-planes; it was found that only 20 hyper-planes could separate each of the points. Then, for the testing phase, given a typical frame from the movie which is not in the training set, the VA-System was able to find the closest frame and then play back the movie starting from that point. The memory recall and play-back was very accurate and was successful for a very large number of test images to significant levels. It must be mentioned that the whole training, testing, and validation of the results was done within a time frame of 10-12 minutes on a laptop computer. The recall and play-back time was typically 0.02 seconds for a single image.
Example 2: An animated Akbar-Birbal cartoon movie of duration 11 minutes 12.5 seconds was then taken; each frame was reduced to 30x30 pixels (i.e. each frame could be considered as a point in 900-dimension space). The total number of points (frames) taken was 6725 (that is, 10 frames were taken per second). Out of these, 5380 frames were used for training and the remaining 1345 for testing; there were thus 5380 train points, and the balance of 1345 were taken as test points. It was found that all the train points were separated by 17 hyper-planes; the total time taken for separation (training) was 6.4 minutes (384 seconds); the total time taken for testing was 0.6 minutes for all 1345 test points, i.e. about 0.02 seconds for each test point; and the overall accuracy was 92%.
The above results were executed on a laptop with a 2 GB configuration.
For any typical frame, the following figure can be taken as the correlated result:
Figure imgf000008_0001
EXAMPLE 3: Intelligent Video Indexing
In this example we explicitly demonstrate our method and the results of the program, which is capable of correlating historic events with present events for indexing of imagery captured in a camcorder device and retrieval.
Problem Statement: Detecting an "Event" (scene) from a video and discovering at which particular time it had occurred.
Solution:
Step 1: We have considered a video and converted it into frames, and considered each frame as a sample point (scene).
Step 2: We trained these sample points using the algorithm Separation of Points by Planes.
Step 3: If a frame (that is not fed as a training point) is given to the system, it detects where exactly it is in the video.
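The hold-out evaluation implied by Step 3 can be sketched as below. For self-containment, a plain nearest-neighbour search stands in for the patented separation-of-points-by-planes algorithm, and the synthetic data, 10 frames-per-second stamping, and 1-second tolerance are assumptions:

```python
import numpy as np

def detection_accuracy(train, train_t, test, test_t, tol=1.0):
    """Hold-out evaluation sketch: each test frame is located in the
    video via its nearest training point; a detection counts as
    correct when the recovered time stamp is within `tol` seconds."""
    hits = 0
    for x, t_true in zip(test, test_t):
        d = np.linalg.norm(train - x, axis=1)  # nearest-neighbour stand-in
        if abs(train_t[int(np.argmin(d))] - t_true) <= tol:
            hits += 1
    return hits / len(test)

rng = np.random.default_rng(3)
frames = rng.random((200, 900))    # 200 frames as 900-dim points
stamps = np.arange(200) / 10.0     # 10 frames per second
# held-out queries: lightly perturbed copies of every 10th frame
test = frames[::10] + 0.01 * rng.standard_normal((20, 900))
acc = detection_accuracy(frames, stamps, test, stamps[::10])
print(acc)  # 1.0 on this synthetic data
```

On real held-out frames the reported figures (97.5% here, 92% in Example 2) would be computed by the same correct-versus-total counting.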
Figure imgf000009_0001
Example dataset (a Video from YouTube):
A video (duration: 16 mins 11 seconds) is taken from YouTube,
Figure imgf000009_0002
which is further divided into 10 frames per second (total 9712 frames), and each frame is considered as a point in n-dimensional space. Here, each frame is reduced in size and converted to 30*30 pixel data; thus each frame is considered as a point in a 900-dimension space. Each of these 9712 points was separated by hyper-planes, and the Orientation Vector of each point (frame) was found and stored. This process is called training the AI System to "Learn" the Video.
After "learning", given a test point, the AI System then detects where it belongs in the video with an accuracy of 97.5%.
The time taken for training the data was 1 min 36 sec, and the scene detection took 0.01 seconds for each input example.
In the paragraphs below we give three examples of detecting a given input frame in the video after the System has "learnt" all the 9712 frames given to it for training.
Below are examples of the original and reduced frames which are fed for training.
Figure imgf000010_0001
Figure imgf000011_0001
Figure imgf000012_0001
Figure imgf000013_0001

Claims

C LAIMS
What is claimed is
1) A Robotic Device driven by Artificial Intelligence (AI) capable of correlating historic events with that of present events for Indexing of imagery captured in a Camcorder device and retrieval, the said Robotic Device comprising:
a Video Audio System, incorporated with a memory device, into which any given video sequence and the several frames of its constituent data are stored in a database which is accessible to a computer;
the said Video Audio System (VA), in turn, learning all the frames in the given video (or a large sample of frames in the video);
further, the learning being done in such a manner that, given another image frame or a short audio clip, the VA-System will be able to find the closest matching frame, or the frame which contains the closest audio sound;
and also to find out the 'time stamp' of the closest matching frame; after performing the above-mentioned operations, the VA-System plays out (i.e. recalls) the subsequent events after this said time and is capable of performing the retrieval of subsequent frames taken at any given instance of time.
2) The said VA-System of the Robotic Device operated by the AI system, as claimed in claim 1, is capable of working like a very quick memory device which can recall a previous event from memory and play out the entire movie sequence from then on.
3) A method for indexing of images captured in a Camcorder device and retrieval, the said method comprising the steps of: storing video footage in a Video Audio Device (VA) incorporated with a memory device and accessible by a computer; the said VA device being capable of learning all frames in the given video, or a large sample of its constituent frames; the said VA device further finding the closest matching frame, or the frame which contains the closest audio sound, and identifying the time stamp of the closest matching frame; and, upon performing the above-mentioned operations, the VA-System playing out (i.e. recalling) the subsequent events after this said time and being capable of performing the retrieval of subsequent video & audio frames taken from any given instance of time.
PCT/IN2018/050521 2017-08-09 2018-08-09 Robotic device driven by artificial intelligence (ai) capable of correlating historic events with that of present events for indexing of imagery captured in camcorder device and retrieval WO2019030776A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IN201741028240 2017-08-09
IN201741028240 2017-08-09

Publications (1)

Publication Number Publication Date
WO2019030776A1 true WO2019030776A1 (en) 2019-02-14

Family

ID=65272013

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IN2018/050521 WO2019030776A1 (en) 2017-08-09 2018-08-09 Robotic device driven by artificial intelligence (ai) capable of correlating historic events with that of present events for indexing of imagery captured in camcorder device and retrieval

Country Status (1)

Country Link
WO (1) WO2019030776A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100303440A1 (en) * 2009-05-27 2010-12-02 Hulu Llc Method and apparatus for simultaneously playing a media program and an arbitrarily chosen seek preview frame
US8818037B2 (en) * 2012-10-01 2014-08-26 Microsoft Corporation Video scene detection
US20150256746A1 (en) * 2014-03-04 2015-09-10 Gopro, Inc. Automatic generation of video from spherical content using audio/visual analysis



Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18843013

Country of ref document: EP

Kind code of ref document: A1