CN104778224A - Target object social relation identification method based on video semantics - Google Patents

Target object social relation identification method based on video semantics

Info

Publication number
CN104778224A
CN104778224A (application CN201510137760.1A; granted as CN104778224B)
Authority
CN
China
Prior art keywords
person
node
semantic
scene
semantic information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510137760.1A
Other languages
Chinese (zh)
Other versions
CN104778224B (en)
Inventor
陈志
高翔
岳文静
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fan Liyang
Li Bo
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Posts and Telecommunications
Priority to CN201510137760.1A (granted as CN104778224B)
Publication of CN104778224A
Application granted
Publication of CN104778224B
Status: Expired - Fee Related
Anticipated expiration


Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a target object social relation identification method based on video semantics. The method comprises the following steps: first, the video data input by a user is preprocessed to obtain shot image-frame sequences, and a key frame is extracted from each sequence; feature vectors of the key frames are analyzed with an SVM learning model, and the resulting shot semantics are stored in the shot nodes of an XML file; according to the time node of each shot and the semantic node corresponding to each person, the shot nodes that share identical person semantic nodes are grouped together; each group of classified shot nodes is stored into a scene node of the XML file in increasing order of the value of the node named time, so that shot semantic sequences representing scenes are constructed in turn; finally, the semantic information and social relations of the persons in each scene are stored in scene semantic matrices, one matrix per scene, and the person semantic information and social relations in all scene semantic matrices are merged, by taking unions, into one large matrix representing the semantics of the whole video.

Description

A target object social relation identification method based on video semantics
Technical field
The present invention relates to a method for identifying social relations: by performing semantic analysis on a video, it identifies the social relations implied therein. The invention belongs to the intersecting technical fields of image processing, social networks, and software engineering.
Background technology
Social media refers to the websites and technologies that allow people to write, share, evaluate, discuss, and communicate with one another. It is the set of tools and platforms people use to share suggestions, opinions, experiences, and viewpoints with each other; at present it mainly comprises social networking sites, microblogs, WeChat, blogs, and the like. Social media is a kind of cloud service: cloud computing technology is widely applied, and social media is in essence a Web application based on cloud computing.
Social networks and social networking services: as the images people project of themselves on the network gradually became more complete, social networking services appeared. The main representative products abroad are Facebook and Twitter; the main domestic representatives are Renren, Kaixin001, Sina Weibo, and others. The current hot technologies in social networking combine cloud computing, e-commerce, and affective-computing techniques. A social relation refers to an interpersonal relation formed in social interaction, and also to the relation between friends in a social network. According to the frequency of contact and interaction between users, social relations fall into two broad classes: strong ties and weak ties.
The structure of a video can generally be divided into four levels, from high to low: video sequence, scene, shot, and frame. A video sequence usually refers to an independent video file or a video clip and is composed of several scenes. Each scene comprises one or more semantically related shots, which may be consecutive or separated. Each shot contains a number of consecutive image frames. Video semantic extraction can be decomposed into segmenting the video into shots and performing image semantic extraction on the segmented shots. Block-based shot segmentation can group content-related frames into shots and then choose key frames within each shot to represent it; image semantic extraction is one of the key steps of video semantic extraction and mainly comprises the detection and classification of target objects and the extraction of their semantics.
An SVM (support vector machine) is a supervised learning model, in essence a classifier, commonly used for pattern recognition, classification, and regression analysis. It shows many distinctive advantages in solving small-sample, nonlinear, and high-dimensional pattern recognition problems. The linearly separable case is analyzed directly; for the linearly inseparable case, a nonlinear mapping converts the samples of the low-dimensional, linearly inseparable input space into a high-dimensional feature space where they become linearly separable, which makes it possible to apply linear algorithms in the high-dimensional feature space to analyze the nonlinear characteristics of the samples.
XML, as an extensible markup language, is an important software technology. It can manage information in a flexible, structured way, describing the structural relations of the information itself through hierarchies of different nodes. Dom4j is an open-source XML parsing package produced by dom4j.org; with dom4j, a user can read and write every node of an XML file.
The present invention uses video processing, SVM, XML, and related technologies to solve the problem of identifying the social relations of target objects.
Summary of the invention
Technical problem: the object of the present invention is to provide a target object social relation identification method based on video semantics. Social media resources contain rich semantic information, and video is an important source of social semantics; at present, however, identification mainly relies on people recognizing and labeling the relevant social semantics, and effective techniques that mine, through software analysis of semantic information, the social relations implied by the target objects in a video are lacking. The object of the invention is to solve this problem.
Technical scheme: the present invention first preprocesses the video data, segmenting it into a series of shots and obtaining the semantic sets of the target objects in the shots; then, according to the content and temporal relations of the shots, related shots are composed into semantic sequences that form specific scenes; finally, the social relations between target objects are mined by analyzing the semantic information of the scenes.
The target object social relation identification method based on video semantics proposed by the present invention comprises the following steps:
Step 1) First, the video data input by the user is preprocessed. The concrete processing flow is as follows:
Step 1.1) The video data is segmented with a block-based comparison method to obtain the shots of the video. The block-based comparison method divides the image of each frame of the video data into a user-specified number of region blocks and partitions the video into different shots by comparing the similarity of the region blocks between consecutive frames; here a frame, i.e. one frame image, is the smallest unit of video data, and a shot is a group of consecutive frames in the video. The features and concrete criteria used for the similarity of the region blocks are specified by the user; the region blocks between consecutive frames of the same shot are similar.
Step 1.2) From each shot, the frame in the middle of the shot's frame sequence is extracted as the key frame; this key frame represents the shot in subsequent processing.
Step 2) The semantic sets of the target objects specified by the user are extracted from all key frames, converted into key-value-pair form, and saved in a file in XML format. The target objects comprise two classes, background objects and foreground objects: foreground objects are person objects, and background objects are the place where the persons are located and the time information. The semantic set is the set of semantic information extracted for the target objects in the video, comprising the semantics of background, time, dialogue, person, color, shape, and texture. The XML file comprises three layers of nested nodes: the first layer is the scene node, marked with the <scene> tag, where a scene refers to a group of shot sequences composed according to the semantic information of the shots and the temporal relations between them; the second layer is the shot node, marked with the <short> tag; the third layer is the concrete semantic node, marked with the <key> tag. The concrete processing flow for extracting the semantic set of the user-specified target objects in each key frame is as follows:
Step 2.1) Target-object detection and classification are performed on the key frame, extracting all target objects the key frame contains; at the same time, the dialogue information between persons in the key frame and the time point at which the key frame appears in the playback of the video are recorded.
Step 2.2) The visual features of all foreground and background objects of the key frame are extracted to form the corresponding feature vectors. The visual features of background objects comprise color and texture; the visual features of foreground objects comprise color, texture, and shape.
Step 2.3) The feature vectors of the target objects in the key frame are learned with an SVM, extracting the semantic information of the foreground and background objects. The semantic information of a foreground object is the semantic information of its visual and behavioral appearance, comprising color, shape, texture, person, and dialogue; the semantic information obtained for a background object is the semantic information of the environment it is in, comprising background and time. The SVM is a supervised learning model.
Step 2.4) The semantic information of the foreground and background objects of the key frame thus obtained is saved, in key-value-pair form, under the shot node of the XML file.
Step 3) The time under each shot node obtained in step 2) and the semantic nodes corresponding to the persons are analyzed, and the shot nodes that have identical person semantic nodes are grouped into one group of shot nodes. A person semantic node is the key-value pair in a <key> node under a <short> node of the XML file whose name attribute is person.
Step 4) The data of each classified group of shot nodes is saved under a scene node of the XML file in increasing order of the value of the node named time, constructing shot semantic sequences one after another, each representing one scene.
Step 5) Each scene node in the XML file is analyzed in turn, all the semantic information it contains is analyzed, and the semantic information of the persons and of the relations between persons is obtained. This information is saved, scene by scene, into matrices: the elements of each row or column of a matrix store the semantic information of one person and of the relations between that person and the other persons, and the row and column numbers of each person in the matrix are specified by the user. The scene semantic information comprises the semantic information of the persons and of the social relations between persons. The concrete processing flow for saving the social relations and the semantic information of the persons into a matrix is as follows:
Step 5.1) The semantic nodes in all shot nodes under a scene node of the XML file are extracted, obtaining all the semantic information of the scene.
Step 5.2) The semantic information of the persons is found among all the semantic information obtained for the scene, and a matrix is built accordingly. Apart from the diagonal elements, when the row number of a row element equals the column number of a column element, the row element and the column element represent the social relations of the same person; the diagonal elements, whose row and column numbers are equal, store the semantic information of the corresponding person.
Step 5.3) The elements of the matrix corresponding to the scene are assigned: from all the semantic information obtained in step 5.1), the social relations between persons and the semantic information of the persons are extracted; the social relations and the person semantic information are then saved in turn in HashMap sets, and the sets are assigned to the elements at the corresponding positions of the matrix. A HashMap set is a data collection used to store key-value pairs.
Step 6) A matrix representing the semantic information of the whole video is obtained from all the matrices representing scene semantics. This matrix stores the semantic information and social relations of all persons in the video; the elements of each row or column of the matrix store the semantic information of one person and of the relations between that person and the other persons, and the row and column numbers of each person in the matrix are specified by the user. The concrete flow is as follows:
Step 6.1) From all the scene semantic matrices, all persons and the person semantic information set (a HashMap) corresponding to each person are extracted; the semantic information sets of the same person are united in turn, merged, and saved into one HashMap set, which is then saved into the corresponding diagonal element of the matrix.
Step 6.2) From all the scene semantic matrices, the social relations between persons are extracted; according to each person's row or column number in the matrix, the social relation sets of the same person are united in turn, merged, and saved into one HashMap set, which is then saved at that person's position in the matrix.
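Steps 1.1) and 1.2) can be sketched as follows. This is a toy illustration under strong assumptions: each frame is already reduced to a list of per-region-block features (here, mean intensities), and similarity is a simple absolute difference with an invented tolerance and threshold — the actual features and criteria are, as the text says, specified by the user.

```python
def similar_fraction(blocks_a, blocks_b, tol=10.0):
    # Fraction of corresponding region blocks whose feature difference is within tol.
    hits = sum(1 for a, b in zip(blocks_a, blocks_b) if abs(a - b) <= tol)
    return hits / len(blocks_a)

def segment_shots(frames, threshold=0.5, tol=10.0):
    # Start a new shot whenever too few region blocks stay similar between
    # consecutive frames (the block-based comparison of step 1.1).
    shots, current = [], [frames[0]]
    for prev, cur in zip(frames, frames[1:]):
        if similar_fraction(prev, cur, tol) >= threshold:
            current.append(cur)
        else:
            shots.append(current)
            current = [cur]
    shots.append(current)
    return shots

def key_frame(shot):
    # The frame in the middle of the shot's frame sequence (step 1.2).
    return shot[len(shot) // 2]

# Eight frames, each reduced to three per-block mean intensities; the jump
# from 10 to 200 marks a shot boundary:
shots = segment_shots([[10, 10, 10]] * 3 + [[200, 200, 200]] * 5)
```

With this input, the sequence splits into two shots of three and five frames, and each shot's middle frame becomes its key frame.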
Beneficial effects: the present invention first stores the content of the video in a structured form, which makes it convenient for a computer to recognize and analyze the video semantics, so that the social relations implied in the video can be effectively inferred, broadening the ways of mining social relations. Specifically, the method of the present invention has the following beneficial effects:
(1) The present invention uses XML technology to convert the content of the video into a structured data format, which is convenient for preserving the inherent semantic information contained in the video and provides a structural basis for the subsequent parsing of the semantic information and social relations contained in the video.
(2) The present invention extracts video semantics from multiple angles and can obtain rich semantic information, providing an ample and accurate content basis for the subsequent analysis of the social relations contained in the video.
Brief description of the drawings
Fig. 1 is the flow chart of the method of the present invention;
Fig. 2 is the structural diagram of the scene semantic information matrix of the present invention.
Embodiment
The specific implementation of the present invention is described in more detail below in conjunction with Fig. 1.
One. Extracting the semantics of the target objects in the shots
First, target-object detection and classification are performed on each shot: foreground and background objects are extracted from the shot, together with their visual and behavioral features; the time and the persons' dialogue information are recorded; and the semantics analyzed from the feature vectors of the target objects are saved, in key-value-pair form, in the <key> nodes under the <short> node of the XML file, as follows:
When the visual and behavioral features are extracted, the three aspects of color, texture, and shape are taken as the starting point; in the XML file, the semantic information obtained from these three aspects is represented by the values of <key> nodes named color, shape, and texture respectively. Time is used to distinguish different shots and also provides clues for the construction of the semantic sequences below; it is represented by a <key> node named time. The target objects here mainly refer to persons, represented by a <key> node named object. The dialogue in a shot is represented by a <key> node named dialogue, whose obj_name attribute indicates which target object the feature node belongs to.
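A hypothetical sketch of one shot under the three-layer structure just described (the concrete listings referenced by "as follows" appear only as figures in the original patent; every value below is invented for illustration):

```xml
<video>
  <scene>
    <short>
      <key name="time">00:01:12</key>
      <key name="object">A</key>
      <key name="color" obj_name="A">red</key>
      <key name="shape" obj_name="A">upright figure</key>
      <key name="texture" obj_name="A">smooth</key>
      <key name="dialogue" obj_name="A">Let's see a film together this evening!</key>
    </short>
  </scene>
</video>
```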
In the process of converting shots into the <short> nodes of the XML file, the SVM model is trained on the visual and behavioral feature vectors inside the shot samples to classify the target objects; the SVM model is then applied to the feature vectors of the shots outside the sample set, and, with the same conversion as above, the semantic information of each shot is saved under a <short> node of the XML file.
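The role of the SVM here can be illustrated with a minimal linear SVM trained by sub-gradient descent on the hinge loss. The feature vectors and labels are invented stand-ins for the color/texture/shape vectors of step 2.2), and a real system would use a mature SVM library rather than this sketch:

```python
def train_linear_svm(samples, labels, epochs=200, lr=0.01, lam=0.01):
    # Sub-gradient descent on the L2-regularized hinge loss; labels are +1/-1.
    dim = len(samples[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:   # inside the margin: hinge term is active
                w = [wi - lr * (lam * wi - y * xi) for wi, xi in zip(w, x)]
                b += lr * y
            else:            # outside the margin: only the regularizer acts
                w = [wi * (1 - lr * lam) for wi in w]
    return w, b

def predict(w, b, x):
    # Sign of the decision function: +1 (e.g. foreground) or -1 (background).
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1

# Invented 2-D stand-ins for foreground (+1) vs. background (-1) feature vectors:
samples = [[2.0, 2.0], [3.0, 1.0], [2.0, 3.0], [-2.0, -1.0], [-3.0, -2.0], [-1.0, -3.0]]
labels = [1, 1, 1, -1, -1, -1]
w, b = train_linear_svm(samples, labels)
```

After training on the labeled sample shots, the same `predict` is applied to the feature vectors of the remaining shots, mirroring the train-then-apply flow described above.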
Two. Constructing scenes from shot sequences
Each scene is composed of a group of associated shots, and the shot sequences are constructed according to the temporal order and the connections between target objects. Since all shots have been converted into <short> nodes in the corresponding XML file above, constructing a scene amounts to parsing this group of <short> nodes with dom4j: the <key> nodes inside the <short> nodes whose name attribute is object are matched, all <short> nodes with the same value are found, and they are saved, in increasing order of time, under a <scene> node of the XML file; this group of shot sequences represents one scene of the video. As follows:
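The grouping just described (the patent names dom4j, a Java library) can be sketched with Python's `xml.etree.ElementTree` as a stand-in; the shot data and the simplification that each <short> names a single person are assumptions of this sketch:

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

SHOTS = """<video>
  <short><key name="time">1</key><key name="object">A</key></short>
  <short><key name="time">3</key><key name="object">B</key></short>
  <short><key name="time">2</key><key name="object">A</key></short>
</video>"""

def build_scenes(xml_text):
    # Group <short> nodes by the value of their 'object' key and emit one
    # <scene> per group, with the shots ordered by their 'time' key.
    root = ET.fromstring(xml_text)
    groups = defaultdict(list)
    for short in root.findall("short"):
        keys = {k.get("name"): k.text for k in short.findall("key")}
        groups[keys["object"]].append((int(keys["time"]), short))
    out = ET.Element("video")
    for person in sorted(groups):
        scene = ET.SubElement(out, "scene")
        for _, short in sorted(groups[person], key=lambda pair: pair[0]):
            scene.append(short)
    return out

scenes = build_scenes(SHOTS)
```

On this input, the two shots of person A (times 1 and 2) end up in one <scene> in time order, and the shot of person B forms a second <scene>.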
Three. Analyzing the scene semantics
Scene semantics is built with a matrix, in fact an n-by-n square matrix, where n is the number of all foreground objects, i.e. persons, contained in the video. The scenes of the video have been represented above by the <scene> nodes of the XML file; on this basis dom4j is used to parse the XML file, and the parsed content is saved into the constructed matrices. Each element of a matrix is a HashMap set that stores the key-value-pair data under the <short> nodes of the corresponding scene in the XML file; the <scene> node information of the XML file is parsed in a loop, and all the shot semantic information of each scene is saved into the corresponding HashMaps.
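The scene semantic matrix can be sketched as an n-by-n array whose cells are key-value sets (the text's HashMap; a plain dict in this Python stand-in). The keys and values below are invented for illustration:

```python
def empty_scene_matrix(persons):
    # n-by-n matrix of independent key-value sets: diagonal cells hold a
    # person's own semantics, off-diagonal cells the relation semantics
    # between the row person and the column person.
    n = len(persons)
    return [[{} for _ in range(n)] for _ in range(n)]

persons = ["A", "B"]            # row/column assignment chosen by the user
m = empty_scene_matrix(persons)
m[0][0]["expression"] = "smiling"          # A's own semantics (diagonal)
m[0][1]["dialogue"] = "movie invitation"   # relation semantics A -> B
```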
Four. The scene semantic matrices
Each scene node in the XML file is analyzed in turn, all the semantic information it contains is analyzed, and the semantic information of the persons and of the relations between persons is obtained. The semantic information of each scene is saved into a matrix, one scene at a time. The scene semantic information comprises the semantic information of the persons and of the social relations between persons. The concrete processing flow for saving the social relations and the semantic information of the persons into a matrix is as follows:
(1) The semantic nodes in all shot nodes under a scene node of the XML file are analyzed, obtaining all the semantic information.
(2) The semantic information of the persons is found from the semantic information obtained, and a matrix is built accordingly. Apart from the diagonal elements, the two groups of elements in the matrix with equal row and column numbers represent the social relations of the same person; the diagonal elements store the semantic information of the corresponding persons.
(3) The elements of the matrix are assigned: all the semantic information obtained is analyzed, the semantic information of the persons and of the social relations between persons is obtained, and the social relations and the semantic information of the corresponding persons are then saved in turn with HashMap sets. A HashMap set is a data collection used to store key-value pairs. Finally, the sets are assigned in turn to the elements at the corresponding positions of the matrix.
Five. The video semantic matrix
A matrix representing the semantic information of the whole video is obtained from all the matrices representing scene semantics. This matrix stores the semantic information and social relations of all persons in the video. The elements of each row or column of the matrix store the semantic information of one person and of the relations between that person and the other persons, and the row and column numbers of each person in the matrix are specified by the user. The concrete flow is as follows:
(1) From all the scene semantic matrices, all persons and the person semantic information set (a HashMap) corresponding to each person are extracted. The semantic information sets of the same person are united in turn and merged into one large set, which is then saved into the corresponding diagonal element of the matrix.
(2) From all the scene semantic matrices, the social relation sets (HashMaps) between persons are extracted. Grouped by person, the social relation sets of the same person are united in turn and merged into one large set. According to each person's row or column number in the matrix, the large set of each person's social relations is saved at the corresponding position in the matrix.
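The union-merging in steps (1) and (2) above can be sketched as follows; `dict.update` stands in for the HashMap union, and the choice that the later scene wins on clashing keys is an assumption the text does not settle:

```python
def merge_scene_cells(cells):
    # Union of the per-scene key-value sets for one matrix position; on a
    # key clash the value from the later scene overwrites the earlier one.
    merged = {}
    for cell in cells:
        merged.update(cell)
    return merged

# Two scenes contribute different keys to the same matrix position:
video_cell = merge_scene_cells([{"place": "bus stop"}, {"time": "evening"}])
```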
The embodiment is further elaborated below with a concrete case.
Suppose there is a video showing four people waiting for a bus at a bus stop; the four persons are denoted A, B, C, and D respectively, and the video comprises only two scenes.
(1) Scene one: A smiles and says to B, "Let's see a film together this evening!"; B then smiles and answers, "Dear, anything is fine as long as you are happy."
(2) Scene two: C and D are each looking at their own phones, without any exchange between them.
This video has four target persons, so the semantics are stored with a 4-by-4 square matrix, with the four person targets represented by A, B, C, and D. The first row of the matrix stores the semantic information arising between A and the others, and so on for the rows below. The value of each matrix element stores the semantic information extracted from the video for the target person of its row and the target person of its column.
Scene one comprises a sequence of two shots:
1) A smiles and says to B: "Let's see a film together this evening!"
2) B smiles and answers: "Dear, anything is fine as long as you are happy."
Scene two comprises one shot: 1) C and D are looking at their phones and never exchange a word.
In scene one, the shots are converted into an XML file as follows:
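The XML listing itself appears only as a figure in the original patent; the following is a hypothetical reconstruction for scene one, with all node values invented:

```xml
<scene>
  <short>
    <key name="time">1</key>
    <key name="object">A</key>
    <key name="expression" obj_name="A">smiling</key>
    <key name="dialogue" obj_name="A">Let's see a film together this evening!</key>
  </short>
  <short>
    <key name="time">2</key>
    <key name="object">B</key>
    <key name="expression" obj_name="B">smiling</key>
    <key name="dialogue" obj_name="B">Dear, anything is fine as long as you are happy.</key>
  </short>
</scene>
```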
The structure of the scene semantic information matrix is shown in Fig. 2.
Analysis of the social relations:
(1) The relation between A and B is close.
(2) The relation between C and D is that of strangers.
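The worked example above can be sketched end-to-end: two scene matrices are filled and merged into the video matrix, and a relation is read off each cell. All dict keys and the "dialogue implies closeness" rule are illustrative assumptions, not part of the patent's text:

```python
PERSONS = ["A", "B", "C", "D"]        # row/column order chosen by the user
IDX = {p: i for i, p in enumerate(PERSONS)}

def empty(n):
    return [[{} for _ in range(n)] for _ in range(n)]

# Scene one: A and B smile at each other and exchange dialogue.
scene1 = empty(4)
scene1[IDX["A"]][IDX["B"]] = {"dialogue": "movie invitation", "expression": "smiling"}
scene1[IDX["B"]][IDX["A"]] = {"dialogue": "affectionate reply", "expression": "smiling"}

# Scene two: C and D look at their phones and never speak.
scene2 = empty(4)
scene2[IDX["C"]][IDX["D"]] = {"interaction": "none"}
scene2[IDX["D"]][IDX["C"]] = {"interaction": "none"}

# Video matrix: element-wise union of the scene matrices (step 6).
video = empty(4)
for scene in (scene1, scene2):
    for i in range(4):
        for j in range(4):
            video[i][j].update(scene[i][j])

def relation(p, q):
    # Crude illustrative rule: exchanged dialogue marks a close relation.
    cell = video[IDX[p]][IDX[q]]
    return "close" if "dialogue" in cell else "distant"
```

Reading the merged matrix back with this rule reproduces the conclusions above: A and B are close, while C and D are strangers to each other.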

Claims (1)

1., based on a destination object social networks recognition methods for video semanteme, it is characterized in that the step that the method comprises is:
Step 1) first pre-service is carried out to the video data of user's input, concrete treatment scheme is as follows:
Step 1.1) utilize block-based comparative approach to split video data, obtain the camera lens of this video data, described block-based comparative approach is the region unit image of each frame of video data being divided into user's specified quantity, different camera lenses is marked off by the similarity comparing the region unit between successive frame, wherein frame is a least unit i.e. two field picture of video data, camera lens is one group of continuous print frame sequence in video, the characteristic sum specific standards of the similarity of described region unit is specified by user, region unit between the successive frame of same camera lens has similarity,
Step 1.2) from each camera lens, extraction is in that frame in this camera lens frame sequence centre position as key frame successively, and this key frame represents this camera lens in subsequent treatment;
Step 2) extract the semanteme collection of the destination object that in all key frames, user specifies, form semanteme collection being converted to key-value pair is saved in the file of XML format; Described destination object comprises background object and foreground object two class, and foreground object is who object, and background object is place residing for personage, temporal information; Described semanteme collection is the set of the semantic information that in video, destination object extracts, and comprises the semanteme of background, time, dialogue, personage, color, shape, texture; The file of described XML format comprises 3 layers of nested node, ground floor is scene node, use <scene> labeled marker, described scene refers to the arrangement of mirrors header sequence according to the sequential relationship composition between the semantic information of camera lens and camera lens; The second layer is camera lens node, uses <short> labeled marker; Third layer is concrete semantic node, uses <key> labeled marker; The concrete treatment scheme extracting the semanteme collection of the destination object that in each key frame, user specifies is as follows:
Step 2.1) key frame is carried out to detection and the classification of destination object, extract all destination objects that this key frame packet contains, record the time point that dialog information in this key frame between personage and this key frame are arranged in the broadcasting of video simultaneously;
Step 2.2) extract the visual signature of all prospects of key frame and background object, form corresponding proper vector, the visual signature of described background object comprises color, texture; The visual signature of foreground object comprises color, texture, shape;
Step 2.3) with SVM, the proper vector of destination object in key frame is learnt, extract the semantic information of foreground object and background object; The semantic information of described foreground object is the semantic information of the visual behaviour performance of foreground object, comprises color, shape, texture, personage, dialogue; The semantic information that described background object is got is the environment semantic information residing for background object, comprises background, time, and described SVM is a kind of learning model having supervision;
Step 2.4) by the foreground object of key frame of acquisition and the semantic information of background object, be saved in the camera lens node of XML file according to the form of key-value pair under;
Step 3) analyzing step 2) node lower time of each camera lens of obtaining and the semantic node corresponding to personage, the camera lens node having the semantic node of identical personage is classified as an arrangement of mirrors head node; The semantic node of described personage be exactly in XML file under <short> node in <key> node name attribute be that key-value pair of personage;
Step 4) data of every arrangement of mirrors head node of having classified are called according to name the incremental order of the nodal value of time node is saved in the scene node of XML file under, construct camera lens semantic sequence successively, represent scene one by one;
Step 5) each scene node in analyzing XML file successively, analyze all semantic informations that it comprises, obtain the semantic information of relation between personage and personage, these information of each scene are saved in successively one by one in matrix, the element of every a line of these matrixes or each row stores the semantic information of relation between a personage and other personages and this personage, and the line number of each personage in a matrix or row number are specified by user; Described Scene Semantics information comprises the semantic information of social networks between personage and personage, and the concrete the treatment scheme wherein social networks of personage and the semantic information of personage being saved in a matrix is as follows:
Step 5.1) extract under XML file Scene node all camera lens nodes in semantic node, obtain the semantic information that this scene is all;
Step 5.2) from obtaining the semantic information finding out personage all semantic informations of scene, set up a matrix according to this, in matrix except diagonal entry, when line number and a column element of a row element arranges number identical, this row element and a column element represent the social networks of same personage, and cornerwise element preserves the semantic information of this personage; The line number of described cornerwise element and row are number identical;
Step 5.3) assignment is carried out to the element of scene homography, from step 5.1) all semantic informations of obtaining, social networks between extraction personage and the semantic information of personage, social networks and the semantic information of personage is preserved again successively, by aggregate assignment to the element of the correspondence position of matrix with set HashMap; Described set HashMap is a data acquisition being used for depositing key-value pair;
Step 6) obtain according to all matrixes representing Scene Semantics the matrix that represents video semanteme information; This matrix preserves semantic information and the social networks of all personages in video, the element of every a line of matrix or each row stores the semantic information of relation between a personage and other personages and this personage, the line number of each personage in a matrix or row number are specified by user, wherein each personage line number in a matrix or row number are specified by user, and idiographic flow is as follows:
Step 6.1) from the matrix of all Scene Semantics, extract all personages and each personage corresponding personage's semantic information set HashMap, successively union is got in these semantic information set, merge and be saved in a HashMap set, then the HashMap set after this merging is saved in the corresponding diagonal entry of matrix;
Step 6.2) From all the scene semantic matrices, extract the social relations between persons; according to each person's row or column number in the matrix, take the union of the relation sets belonging to the same person in turn, merge them into a single HashMap set, and store the merged HashMap set at that person's position in the matrix.
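The union-merge of Steps 6.1)–6.2) can be sketched as below. This is a minimal illustration under stated assumptions: every scene matrix is already n-by-n and aligned to the user-assigned global row/column numbers (real scenes contain only subsets of persons, so alignment is omitted here), and on a duplicate key the later scene's value wins, which is one possible convention the patent does not specify.

```python
def merge_scene_matrices(matrices, n):
    """Union-merge the per-cell key-value sets of all scene matrices
    into one n-by-n matrix representing the whole video's semantics."""
    video = [[{} for _ in range(n)] for _ in range(n)]
    for m in matrices:
        for i in range(n):
            for j in range(n):
                video[i][j].update(m[i][j])  # union of key-value pairs
    return video

# Two hypothetical 2x2 scene matrices for the same two persons.
scene1 = [[{"gender": "male"}, {"relation": "colleague"}],
          [{"relation": "colleague"}, {"gender": "female"}]]
scene2 = [[{"mood": "happy"}, {}],
          [{}, {"mood": "sad"}]]
video = merge_scene_matrices([scene1, scene2], 2)
```

After merging, each diagonal cell carries the union of that person's semantics across scenes (Step 6.1), and each off-diagonal cell carries the union of the pair's relations (Step 6.2).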
CN201510137760.1A 2015-03-26 2015-03-26 Target object social relation identification method based on video semantics Expired - Fee Related CN104778224B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510137760.1A CN104778224B (en) 2015-03-26 2015-03-26 Target object social relation identification method based on video semantics


Publications (2)

Publication Number Publication Date
CN104778224A true CN104778224A (en) 2015-07-15
CN104778224B CN104778224B (en) 2017-11-14

Family

ID=53619688

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510137760.1A Expired - Fee Related CN104778224B (en) 2015-03-26 2015-03-26 Target object social relation identification method based on video semantics

Country Status (1)

Country Link
CN (1) CN104778224B (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106294783A (en) * 2016-08-12 2017-01-04 乐视控股(北京)有限公司 A kind of video recommendation method and device
CN106778537A (en) * 2016-11-28 2017-05-31 中国科学院心理研究所 A kind of collection of animal social network structure and analysis system and its method based on image procossing
CN107909038A (en) * 2017-11-16 2018-04-13 北京邮电大学 A kind of social networks disaggregated model training method, device, electronic equipment and medium
CN107992598A (en) * 2017-12-13 2018-05-04 北京航空航天大学 A kind of method that colony's social networks excavation is carried out based on video data
CN109344285A (en) * 2018-09-11 2019-02-15 武汉魅瞳科技有限公司 A kind of video map construction and method for digging, equipment towards monitoring
CN109471959A (en) * 2018-06-15 2019-03-15 中山大学 Personage's social relationships discrimination method and system in image based on figure inference pattern
CN110070438A (en) * 2019-04-25 2019-07-30 上海掌门科技有限公司 A kind of credit score calculation method, equipment and storage medium
WO2019144840A1 (en) * 2018-01-25 2019-08-01 北京一览科技有限公司 Method and apparatus for acquiring video semantic information
CN110765314A (en) * 2019-10-21 2020-02-07 长沙品先信息技术有限公司 Video semantic structural extraction and labeling method
CN112132118A (en) * 2020-11-23 2020-12-25 北京世纪好未来教育科技有限公司 Character relation recognition method and device, electronic equipment and computer storage medium
CN114493905A (en) * 2020-11-13 2022-05-13 四川大学 Social relationship identification method based on multilevel feature fusion
CN117676187A (en) * 2023-04-18 2024-03-08 德联易控科技(北京)有限公司 Video data processing method and device, electronic equipment and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100332558A1 (en) * 2005-05-03 2010-12-30 Comcast Cable Communications, Llc Verification of Semantic Constraints in Multimedia Data and in its Announcement, Signaling and Interchange
CN102663095A (en) * 2012-04-11 2012-09-12 北京中科希望软件股份有限公司 Method and system for carrying out semantic description on audio and video contents


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
BAI L et al.: "Video semantic content analysis based on ontology", IEEE *
ZHU Huayu et al.: "MPEG-7-based video semantic description method", Journal of Nanjing University (Natural Science) *

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106294783A (en) * 2016-08-12 2017-01-04 乐视控股(北京)有限公司 A kind of video recommendation method and device
CN106778537A (en) * 2016-11-28 2017-05-31 中国科学院心理研究所 A kind of collection of animal social network structure and analysis system and its method based on image procossing
CN107909038B (en) * 2017-11-16 2022-01-28 北京邮电大学 Social relationship classification model training method and device, electronic equipment and medium
CN107909038A (en) * 2017-11-16 2018-04-13 北京邮电大学 A kind of social networks disaggregated model training method, device, electronic equipment and medium
CN107992598A (en) * 2017-12-13 2018-05-04 北京航空航天大学 A kind of method that colony's social networks excavation is carried out based on video data
CN107992598B (en) * 2017-12-13 2022-03-15 北京航空航天大学 Method for mining social relation of group based on video material
WO2019144840A1 (en) * 2018-01-25 2019-08-01 北京一览科技有限公司 Method and apparatus for acquiring video semantic information
CN109471959A (en) * 2018-06-15 2019-03-15 中山大学 Personage's social relationships discrimination method and system in image based on figure inference pattern
CN109471959B (en) * 2018-06-15 2022-06-14 中山大学 Figure reasoning model-based method and system for identifying social relationship of people in image
CN109344285A (en) * 2018-09-11 2019-02-15 武汉魅瞳科技有限公司 A kind of video map construction and method for digging, equipment towards monitoring
CN109344285B (en) * 2018-09-11 2020-08-07 武汉魅瞳科技有限公司 Monitoring-oriented video map construction and mining method and equipment
CN110070438A (en) * 2019-04-25 2019-07-30 上海掌门科技有限公司 A kind of credit score calculation method, equipment and storage medium
CN110765314A (en) * 2019-10-21 2020-02-07 长沙品先信息技术有限公司 Video semantic structural extraction and labeling method
CN114493905A (en) * 2020-11-13 2022-05-13 四川大学 Social relationship identification method based on multilevel feature fusion
CN114493905B (en) * 2020-11-13 2023-04-07 四川大学 Social relationship identification method based on multilevel feature fusion
CN112132118A (en) * 2020-11-23 2020-12-25 北京世纪好未来教育科技有限公司 Character relation recognition method and device, electronic equipment and computer storage medium
CN112132118B (en) * 2020-11-23 2021-03-12 北京世纪好未来教育科技有限公司 Character relation recognition method and device, electronic equipment and computer storage medium
CN117676187A (en) * 2023-04-18 2024-03-08 德联易控科技(北京)有限公司 Video data processing method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN104778224B (en) 2017-11-14

Similar Documents

Publication Publication Date Title
CN104778224A (en) Target object social relation identification method based on video semantics
CN112199375B (en) Cross-modal data processing method and device, storage medium and electronic device
Zhou et al. Contextual ensemble network for semantic segmentation
CN109344285B (en) Monitoring-oriented video map construction and mining method and equipment
Blasch et al. Wide-area motion imagery (WAMI) exploitation tools for enhanced situation awareness
CN110750656A (en) Multimedia detection method based on knowledge graph
CN113705218B (en) Event element gridding extraction method based on character embedding, storage medium and electronic device
CN112633431B (en) Tibetan-Chinese bilingual scene character recognition method based on CRNN and CTC
Ke et al. Video mask transfiner for high-quality video instance segmentation
Chen et al. A temporal attentive approach for video-based pedestrian attribute recognition
CN111177559A (en) Text travel service recommendation method and device, electronic equipment and storage medium
CN116955707A (en) Content tag determination method, device, equipment, medium and program product
Lv et al. Storyrolenet: Social network construction of role relationship in video
Wang et al. Deep multi-person kinship matching and recognition for family photos
CN113343941A (en) Zero sample action identification method and system based on mutual information similarity
CN113657272B (en) Micro video classification method and system based on missing data completion
CN113901228B (en) Cross-border national text classification method and device fusing domain knowledge graph
Luo et al. An optimization framework of video advertising: using deep learning algorithm based on global image information
Saleem et al. Stateful human-centered visual captioning system to aid video surveillance
CN110674265A (en) Unstructured information oriented feature discrimination and information recommendation system
CN112306985A (en) Digital retina multi-modal feature combined accurate retrieval method
CN114998809A (en) False news detection method and system based on ALBERT and multi-mode cycle fusion
Li et al. Short text sentiment analysis based on convolutional neural network
CN115168609A (en) Text matching method and device, computer equipment and storage medium
Qu et al. Video visual relation detection via 3d convolutional neural network

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
EXSB Decision made by sipo to initiate substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20181114

Address after: Room 929, 102 Chaoyang North Road, Chaoyang District, Beijing 100123

Co-patentee after: Li Bo

Patentee after: Fan Liyang

Address before: 210046 No. 9 Wenyuan Road, Yadong New Town, Qixia District, Nanjing, Jiangsu

Patentee before: Nanjing Post & Telecommunication Univ.

CF01 Termination of patent right due to non-payment of annual fee
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20171114

Termination date: 20190326