WO2024031933A1 - Social relationship analysis method, system and storage medium based on multi-modal data - Google Patents

Social relationship analysis method, system and storage medium based on multi-modal data Download PDF

Info

Publication number
WO2024031933A1
Authority
WO
WIPO (PCT)
Prior art keywords
social
text
features
image
modal
Prior art date
Application number
PCT/CN2023/072957
Other languages
English (en)
French (fr)
Inventor
陈思萌
赵建强
陈诚
彭闯
张辉
韩名羲
Original Assignee
厦门市美亚柏科信息股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 厦门市美亚柏科信息股份有限公司
Publication of WO2024031933A1

Links

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 - Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01 - Social networking
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/20 - Natural language analysis
    • G06F40/279 - Recognition of textual entities
    • G06F40/289 - Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 - Named entity recognition
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 - Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/761 - Proximity, similarity or dissimilarity measures
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/762 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 - Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80 - Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806 - Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features

Definitions

  • This application belongs to the technical field of social relationship analysis, and specifically relates to a social relationship analysis method, system and storage medium based on multi-modal data.
  • A social network is also a network of relationships between people; by analyzing the various kinds of information propagated in it, the social relationships between users can be discovered.
  • Methods for predicting social relationship links fall mainly into two categories: those based on similarity measures and those based on probabilistic graphical models.
  • Similarity-based social relationship mining predicts relationship links by computing the similarity between nodes: the higher the similarity, the more likely a link is to form.
  • Liben-Nowell et al. used an unsupervised learning method [Liben-Nowell D, Kleinberg J. The link-prediction problem for social networks[J]. Journal of the American Society for Information Science and Technology, 2007, 58(7): 1019-1031], using the similarity of the content and structure published by users and computing node-pair similarity based on the homophily principle of network nodes to perform link prediction of social relationships. Lichtenwalter et al. [Lichtenwalter R N, Chawla N V. Vertex collocation profiles: subgraph counting for link analysis and prediction[C]//Proceedings of the 21st International Conference on World Wide Web. 2012: 1019-1028] performed topological link analysis and prediction based on vertex collocation profiles (VCP), taking the local structural information around node pairs into account.
  • Social relationship mining based on probabilistic graphical models uses Bayesian graphical models to model the joint probability between nodes; however, directly applying such models makes it difficult to uncover the complex relationships hidden in social networks.
  • Wang et al. proposed a local probabilistic graphical model [Wang C, Satuluri V, Parthasarathy S. Local probabilistic models for link prediction[C]//Seventh IEEE International Conference on Data Mining (ICDM 2007). IEEE, 2007: 322-331], which discovers hidden social relationships by estimating the joint co-occurrence probability of network node pairs.
  • This application proposes a social relationship analysis method based on multi-modal data, which includes the following steps:
  • S1: extract a person's social text and social image information, convert them into text features and image features respectively, compute person intimacy statistics, and build a person social network graph based on the person intimacy;
  • S2: input the text features and image features into a transformer-based multi-modal fusion model to obtain fusion features;
  • S3: analyze the person social network graph with the Si-SCAN graph clustering algorithm to obtain social relationship clustering results; the Si-SCAN graph clustering algorithm is constructed by introducing person intimacy and fusion feature information on the basis of the SCAN algorithm.
  • The above solution analyzes social relationships in depth based on information from two modalities, text and image.
  • A multi-modal information fusion model learns the cross-modal interactions and generates multi-modal fused graph-node embedding representations.
  • Through graph clustering analysis, deep relationship analysis of the social network can be realized, and potential social correlations can be effectively discovered.
  • S2 includes: concatenating the text features and image features and inputting them into the encoder of the multi-modal fusion model.
  • Implementing the information interaction between modalities as a single-stream model reduces the amount of computation.
  • S2 also includes:
  • Construct the text-image feature pair Z0 = [CLS, E, SEP, Q], where E is the text feature of person i, Q is the image feature of person i, [CLS] is an identifier token, and [SEP] is a separator token;
  • construct a text position encoding and an image position encoding; add the text features to the text position encoding and the image features to the image position encoding respectively, and input the result into the encoder;
  • the output vector at the identifier position of the last encoder layer is taken as the fusion feature vector z.
  • The above solution combines the image and text modalities to design a transformer-based multi-modal fusion model, performs deep interaction and fusion of the two modalities' data, and generates the embedding representations of the nodes used in the subsequent graph clustering analysis.
  • With the complementary learning of multi-modal information, the accuracy of the subsequent graph clustering results can be improved.
  • S1 builds the person social network graph by using persons as its nodes; the nodes corresponding to two persons appearing in the same social image are connected by an undirected edge.
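  • A minimal sketch of this graph construction, assuming each social image has already been resolved to a list of person IDs; networkx is used purely for illustration (the library choice is not from the source):

```python
# Build the undirected person social network graph from group-photo co-occurrence
# and count pairwise co-occurrences p(v, w) used later as person intimacy.
from collections import Counter
from itertools import combinations

import networkx as nx

def build_social_graph(photos):
    """photos: iterable of lists of person IDs appearing in the same image."""
    graph = nx.Graph()
    co_occurrence = Counter()                    # p(v, w), keyed by sorted pair
    for persons in photos:
        for v, w in combinations(sorted(set(persons)), 2):
            graph.add_edge(v, w)                 # one undirected edge per shared photo
            co_occurrence[(v, w)] += 1
    return graph, co_occurrence

# Example with three group photos
g, p = build_social_graph([[1, 2, 3], [2, 3], [1, 4]])
print(sorted(g.edges()))    # edges implied by shared photos
print(p[(2, 3)])            # 2 co-occurrences; person intimacy would be alpha * 2
```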
  • The node similarity in the Si-SCAN graph clustering algorithm in S3 includes structural similarity, person intimacy and fusion feature similarity.
  • The structural similarity is the ratio of the number of common neighbors of the two nodes to the geometric mean, expressed as:
  • σ1(v, w) = |Γ(v) ∩ Γ(w)| / √(|Γ(v)| · |Γ(w)|)
  • where σ1(v, w) is the structural similarity between node v and node w, and Γ(v) and Γ(w) are the sets of neighbor nodes of nodes v and w respectively;
  • the person intimacy is expressed as σ2(v, w) = α · p(v, w), where σ2(v, w) is the person intimacy of node v and node w, p(v, w) is the number of co-occurrences of the two nodes in social images, and α is an adjustment coefficient;
  • the fusion feature similarity σ3(v, w) is computed from the fusion feature vectors zv and zw of nodes v and w;
  • the node similarity σ(v, w) of node v and node w is expressed as:
  • σ(v, w) = σ1(v, w) + σ2(v, w) + σ3(v, w).
  • S1 uses the BiLSTM-CRF model to extract text tag information from social texts; it uses a word vector model to convert the text tag information into text features.
  • S1 uses the FACENet model to extract image features.
  • this application proposes a social relationship analysis system based on multi-modal data, including:
  • the information extraction and feature conversion module is configured to extract people's social text and social image information, convert them into text features and image features respectively, and count people's closeness, and build a social network graph of people based on their closeness;
  • the multi-modal data fusion module is configured to input text features and image features into a transformer-based multi-modal fusion model to obtain fusion features;
  • the social relationship clustering module is configured to use the Si-SCAN graph clustering algorithm to analyze the personnel social network graph and obtain social relationship clustering results.
  • The Si-SCAN graph clustering algorithm is constructed by introducing person intimacy and fusion feature information on the basis of the SCAN algorithm.
  • the present application proposes a computer-readable storage medium for social relationship analysis based on multi-modal data.
  • One or more computer programs are stored thereon.
  • When the one or more computer programs are executed by a computer processor, any of the above methods is implemented.
  • This invention proposes a social relationship mining technology framework that combines multiple technical methods such as named entity recognition, face recognition, multi-modal fusion, and graph clustering.
  • The methods used by each module in this framework are extensible and replaceable, and can be flexibly applied to other relationship mining scenarios.
  • By designing a transformer-based multi-modal fusion model, information from different modalities is used to mine social relationships between people; features from different modalities complement and fuse with each other, making up for the missing information of a single modality, reducing information redundancy and enabling effective learning of multi-modal fused feature representations.
  • Figure 1 is a schematic flow chart of a social relationship analysis method based on multi-modal data in an embodiment
  • Figure 2 is a framework diagram of social relationship analysis based on multi-modality in another embodiment
  • Figure 3 is a model structure of the named entity recognition model BiLSTM plus conditional random field (CRF) in another embodiment
  • Figure 4 is a schematic structural diagram of a facial feature extraction model in another embodiment
  • Figure 5 is a schematic structural diagram of a transformer multi-modal fusion model in another embodiment
  • Figure 6 is a schematic structural diagram of a social relationship analysis system based on multi-modal data in another embodiment
  • FIG. 7 is a schematic structural diagram of a computer system suitable for implementing an electronic device according to an embodiment of the present application in another embodiment.
  • Figure 1 is a schematic flowchart of a social relationship analysis method based on multi-modal data in an embodiment, which specifically includes:
  • S1: extract a person's social text and social image information, convert them into text features and image features respectively, compute person intimacy statistics, and build a person social network graph based on the person intimacy;
  • S2: input the text features and image features into a transformer-based multi-modal fusion model to obtain fusion features;
  • S3: use the Si-SCAN graph clustering algorithm to analyze the person social network graph and obtain social relationship clustering results;
  • the Si-SCAN graph clustering algorithm introduces person intimacy and fusion feature information on the basis of the SCAN algorithm.
  • Figure 2 is a framework diagram of social relationship analysis based on multi-modality in an embodiment.
  • The scheme is mainly divided into three parts: (1) social information collection: named entity recognition and face recognition automatically extract the text and image information in the social network, and a word-vector model and a face recognition algorithm convert the two modalities into feature vectors; (2) graph node embedding representation: a transformer-based multi-modal fusion model interactively learns the information across modalities and outputs multi-modal fusion features as the graph-node embeddings; (3) graph clustering analysis: combining network structure, image information and multi-modal node features, the similarity measure Si-SCAN partitions the social network and mines the social relationships between persons.
  • the analysis process of the social relationship analysis method based on multi-modal data includes:
  • Text data labels: various kinds of social information related to persons are collected from social networks. For ease of representation, all persons in the collected data are numbered 1, ..., N, one person per number. First, all text content related to a user is extracted, and key text information is extracted from four aspects (education background, living area, career direction and hobbies), covering eight kinds of information that characterize the user: academic qualification, school, workplace, birthplace, industry, job position, skill tags and interest tags. A hypothetical profile container for these labels is sketched below.
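  • The following container for the eight label types is purely illustrative; the field names are assumptions, not identifiers from the source:

```python
# Hypothetical per-person record holding the eight kinds of key text information.
from dataclasses import dataclass, field
from typing import List

@dataclass
class PersonProfile:
    person_id: int                                         # persons are numbered 1..N
    education: List[str] = field(default_factory=list)     # academic qualification
    school: List[str] = field(default_factory=list)
    workplace: List[str] = field(default_factory=list)
    birthplace: List[str] = field(default_factory=list)
    industry: List[str] = field(default_factory=list)
    position: List[str] = field(default_factory=list)
    skill_tags: List[str] = field(default_factory=list)
    interest_tags: List[str] = field(default_factory=list)
```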
  • Named entity recognition (NER) technology is used to automatically identify entities such as person names, place names, organization names, times and dates in massive text.
  • Figure 3 is a schematic diagram of the model structure of the classic named entity recognition model, BiLSTM plus conditional random field (CRF), selected in this embodiment. Each character in the text is converted into a character feature c, using random initialization or a pre-trained model, and the NER model outputs the predicted label of each character.
  • BIO sequence labeling is used.
  • The extracted entities are matched to the preset text data labels; the entity information for each label type does not exceed 8 entries, and the [UNK] token is used for padding when there are fewer, giving 64 entity label entries in total. A word-vector model then converts the entity label information into 128-dimensional text features (the sketch below illustrates this padding and conversion).
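  • A minimal sketch of the padding and word-vector conversion described above, assuming a gensim-style word-vector lookup; the library choice and the zero-vector fallback for out-of-vocabulary labels are assumptions:

```python
# Pad the extracted entities to 8 per label type (64 in total) and convert them
# into a (64, 128) text feature matrix E_i with a word-vector model.
import numpy as np

LABEL_TYPES = ["education", "school", "workplace", "birthplace",
               "industry", "position", "skill_tags", "interest_tags"]
PER_TYPE, DIM = 8, 128

def entity_labels_to_features(entities_by_type, wv):
    """entities_by_type: dict label_type -> list of entity strings.
    wv: mapping from token to a 128-d vector (e.g. gensim KeyedVectors)."""
    tokens = []
    for t in LABEL_TYPES:
        ents = entities_by_type.get(t, [])[:PER_TYPE]
        tokens += ents + ["[UNK]"] * (PER_TYPE - len(ents))   # pad with [UNK]
    vecs = [wv[tok] if tok in wv else np.zeros(DIM) for tok in tokens]
    return np.stack(vecs)                                     # shape (64, 128)
```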
  • Extract facial features: person intimacy is obtained by counting how many times each pair of persons co-occurs across all group photos, and the group-photo information is converted into image features. After deduplicating all images, face recognition is used to extract the facial features of every person in the group photos; each person keeps at most 8 group photos, with random-vector padding when there are fewer.
  • Figure 4 is a schematic structural diagram of the facial feature extraction model selected in this embodiment.
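  • A hedged sketch of assembling the image feature Qi for one person, following the limits stated in the source (at most 8 photos, 128-dimensional face embeddings, random-vector padding, keeping the photos with the most people); the `embed_face` callable and the photo object's attributes stand in for a FaceNet-style pipeline and are assumptions:

```python
# Assemble Q_i: one 128-d face embedding per group photo, at most 8 photos.
import numpy as np

MAX_PHOTOS, DIM = 8, 128

def person_image_features(photos_of_person, embed_face, person_id):
    # keep the 8 photos with the most people when there are more than 8
    photos = sorted(photos_of_person, key=lambda ph: len(ph.persons),
                    reverse=True)[:MAX_PHOTOS]
    feats = [embed_face(ph.image, person_id) for ph in photos]
    while len(feats) < MAX_PHOTOS:                    # pad missing photos
        feats.append(np.random.randn(DIM).astype(np.float32))
    return np.stack(feats)                            # Q_i with shape (8, 128)
```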
  • This embodiment combines image and text modal information to design a transformer-based multi-modal fusion model, conducts in-depth interaction and fusion of the two modal data, and generates an embedded representation of the node.
  • To reduce computation, the text and image features are concatenated and then input into the transformer encoder, achieving information interaction between the modalities in a single-stream model.
  • Figure 5 is a schematic structural diagram of the transformer multi-modal fusion model in this embodiment. For any person i, a text-image feature pair Z0 = [CLS, E, SEP, Q] is constructed, where E is the text feature of person i, Q is the image feature of person i, [CLS] is an identifier token, and [SEP] is a separator token.
  • Position encodings for text and image are then constructed: the text position encoding embeds the input positions of the 8 entity-label sequences, and the image position encoding embeds each image's index in the image data set; randomly initialized vectors serve as the position encodings, with the same dimensions as the text and image features.
  • The text features are added to the text position encoding and the image features to the image position encoding, and the result is input into the transformer encoder for interactive learning across the text and image modalities.
  • Through iterative updates over multiple transformer layers, the correlation between facial features and text features in different scenes is learned.
  • The output vector of the d-th encoder layer is Zd = Transformer(Zd-1) + Zd-1, with d = 1, ..., 6.
  • the output vector of the last layer encoder identifier [CLS] position is selected as the multi-modal fusion feature representation z.
  • The hidden layer size is set to 768, the number of attention heads to 12, and the number of transformer layers to 6 (a sketch of this fusion encoder follows).
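  • A minimal sketch of the single-stream fusion encoder under the stated hyperparameters (768 hidden units, 12 heads, 6 layers); the linear projections from the 128-dimensional features to the hidden size are assumptions, and the extra per-layer residual in the source is folded into the standard PyTorch encoder here:

```python
# Single-stream fusion: project E and Q, concatenate [CLS, E, SEP, Q] with
# learned position encodings, encode, and take the [CLS] output as z.
import torch
import torch.nn as nn

class FusionEncoder(nn.Module):
    def __init__(self, feat_dim=128, hidden=768, heads=12, layers=6,
                 n_text=64, n_img=8):
        super().__init__()
        self.text_proj = nn.Linear(feat_dim, hidden)
        self.img_proj = nn.Linear(feat_dim, hidden)
        self.cls = nn.Parameter(torch.randn(1, 1, hidden))        # [CLS] token
        self.sep = nn.Parameter(torch.randn(1, 1, hidden))        # [SEP] token
        self.text_pos = nn.Parameter(torch.randn(1, n_text, hidden))
        self.img_pos = nn.Parameter(torch.randn(1, n_img, hidden))
        layer = nn.TransformerEncoderLayer(d_model=hidden, nhead=heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=layers)

    def forward(self, E, Q):              # E: (B, 64, 128), Q: (B, 8, 128)
        B = E.size(0)
        t = self.text_proj(E) + self.text_pos
        q = self.img_proj(Q) + self.img_pos
        z0 = torch.cat([self.cls.expand(B, -1, -1), t,
                        self.sep.expand(B, -1, -1), q], dim=1)
        out = self.encoder(z0)
        return out[:, 0]                  # z: fused feature at the [CLS] position
```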
  • The multi-modal feature fusion model is trained with the image-text matching (ITM) task: the input image-text pairs are randomly replaced, and the model is trained to predict whether the input image and text correspond, continuously optimizing the model.
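  • A hedged sketch of one ITM training step: image features are randomly swapped between persons in the batch, and a binary head on z predicts whether the pair still matches. The 0.5 corruption rate and the linear head are assumptions, and the optimizer is expected to cover both the encoder and the head:

```python
import torch
import torch.nn as nn

def itm_step(model, head, E, Q, optimizer):
    """model: FusionEncoder; head: nn.Linear(768, 1);
    E: (B, 64, 128) text features; Q: (B, 8, 128) image features."""
    B = E.size(0)
    swap = torch.rand(B) < 0.5                          # pairs to corrupt
    perm = torch.randperm(B)
    Q_in = torch.where(swap[:, None, None], Q[perm], Q)
    labels = (~swap).float()                            # 1 = matched pair
    z = model(E, Q_in)                                  # fused [CLS] feature
    loss = nn.functional.binary_cross_entropy_with_logits(
        head(z).squeeze(-1), labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```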
  • the graph clustering method is used to divide the social network graph and mine the social relationships between people.
  • This embodiment designs a Si-SCAN clustering method, in which node similarity includes structural similarity, personal intimacy and fusion feature similarity.
  • The structural similarity is the ratio of the number of common neighbors of the two nodes to the geometric mean:
  • σ1(v, w) = |Γ(v) ∩ Γ(w)| / √(|Γ(v)| · |Γ(w)|)
  • where σ1(v, w) is the structural similarity between node v and node w, and Γ(v) and Γ(w) are the sets of neighbor nodes of nodes v and w respectively;
  • the person intimacy is σ2(v, w) = α · p(v, w), where σ2(v, w) is the person intimacy of node v and node w, p(v, w) is the number of co-occurrences of the two nodes in social images, and α is an adjustment coefficient, set to 0.1 in this embodiment;
  • the fusion feature similarity σ3(v, w) is computed from the fusion feature vectors zv and zw of nodes v and w;
  • the node similarity of node v and node w is σ(v, w) = σ1(v, w) + σ2(v, w) + σ3(v, w).
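  • A minimal sketch of this node similarity; σ1 and σ2 follow the formulas above, while σ3 is computed here as the cosine similarity between zv and zw, which is an assumption since the exact formula is not reproduced in this text:

```python
import numpy as np

def node_similarity(graph, v, w, cooccur, z, alpha=0.1):
    """graph: networkx graph; cooccur: dict keyed by sorted (v, w) pairs;
    z: dict of fusion feature vectors per node."""
    nv, nw = set(graph[v]), set(graph[w])            # neighbor sets Γ(v), Γ(w)
    s1 = len(nv & nw) / np.sqrt(len(nv) * len(nw)) if nv and nw else 0.0
    s2 = alpha * cooccur.get(tuple(sorted((v, w))), 0)
    zv, zw = z[v], z[w]
    s3 = float(zv @ zw / (np.linalg.norm(zv) * np.linalg.norm(zw) + 1e-8))
    return s1 + s2 + s3
```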
  • In the Si-SCAN algorithm, the neighborhood of node v is defined as N(v) = {w ∈ Γ(v) | σ(v, w) ≥ ε}, and a core node is, as in SCAN, a node whose neighborhood N(v) contains at least μ nodes.
  • Si-SCAN is the same as the SCAN algorithm. It clusters all nodes in the social network graph by calculating neighbor nodes and core nodes in the graph.
  • Si-SCAN algorithm execution steps are as follows:
  • 1) At initialization, all nodes in the graph are marked as unassigned; 2) for each unassigned node v, determine whether v is a core node according to the similarity definition: if it is not a core node, mark it as a non-member; if it is a core node, expand a new cluster and assign to it the unassigned and non-member nodes among its directly reachable nodes; repeat until every node has been traversed; 3) a node marked as a non-member is judged to be a bridge node if it connects to two different clusters, and an outlier otherwise. A sketch of this loop follows.
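  • A hedged sketch of this clustering loop in the SCAN style; the thresholds eps and mu are assumptions, as their concrete values are not given here:

```python
def si_scan(graph, similarity, eps=0.5, mu=2):
    """graph: networkx graph; similarity: callable (v, w) -> node similarity."""
    UNASSIGNED, NON_MEMBER = -1, -2
    label = {v: UNASSIGNED for v in graph}           # step 1: all unassigned

    def eps_neighbors(v):
        return [w for w in graph[v] if similarity(v, w) >= eps]

    cluster_id = 0
    for v in graph:                                  # step 2
        if label[v] != UNASSIGNED:
            continue
        if len(eps_neighbors(v)) < mu:               # not a core node
            label[v] = NON_MEMBER
            continue
        cluster_id += 1
        queue = [v]
        while queue:                                 # expand directly reachable nodes
            u = queue.pop()
            if label[u] in (UNASSIGNED, NON_MEMBER):
                label[u] = cluster_id
                if len(eps_neighbors(u)) >= mu:      # only core nodes keep expanding
                    queue.extend(eps_neighbors(u))
    # step 3 (bridge vs. outlier classification of remaining non-members)
    # would be applied to nodes still labeled NON_MEMBER here.
    return label
```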
  • Figure 6 is a schematic diagram of the structure 600 of a social relationship analysis system based on multi-modal data in another embodiment of the present application, including:
  • the information extraction and feature conversion module 601 is configured to extract a person's social text and social image information, convert them into text features and image features respectively, and count the person's closeness, and build a social network graph of the person based on the person's closeness;
  • the multimodal data fusion module 602 is configured to input text features and image features into a transformer-based multimodal fusion model to obtain fusion features;
  • the social relationship clustering module 603 is configured to use the Si-SCAN graph clustering algorithm to analyze the personnel social network graph and obtain social relationship clustering results.
  • The Si-SCAN graph clustering algorithm introduces person intimacy and fusion feature information on the basis of the SCAN algorithm.
  • FIG. 7 shows a schematic structural diagram of a computer system 700 suitable for implementing an electronic device according to an embodiment of the present application.
  • the electronic device is only an example and should not impose any restrictions on the functions and usage scope of the embodiments of the present application.
  • The computer system 700 includes a central processing unit (CPU) 701, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 702 or a program loaded from a storage portion 708 into a random access memory (RAM) 703.
  • The RAM 703 also stores the various programs and data required for the operation of the system 700.
  • the CPU 701, ROM 702, and RAM 703 are connected to each other through a bus 704.
  • An input/output (I/O) interface 705 is also connected to bus 704.
  • The following components are connected to the I/O interface 705: an input section 706 including a keyboard, a mouse and the like; an output section 707 including a liquid crystal display (LCD), speakers and the like; a storage section 708 including a hard disk and the like; and a communication section 709 including a network interface card such as a LAN card or a modem.
  • the communication section 709 performs communication processing via a network such as the Internet.
  • Driver 710 is also connected to I/O interface 705 as needed.
  • Removable media 711 such as magnetic disks, optical disks, magneto-optical disks, semiconductor memories, etc., are installed on the drive 710 as needed, so that a computer program read therefrom is installed into the storage portion 708 as needed.
  • embodiments of the present disclosure include a computer program product including a computer program carried on a computer-readable storage medium, the computer program containing program code for performing the method illustrated in the flowchart.
  • the computer program may be downloaded and installed from the network via communication portion 709 and/or installed from removable media 711 .
  • When the computer program is executed by the central processing unit (CPU) 701, the above functions defined in the method of the present application are performed.
  • the computer-readable storage medium of the present application may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the above two.
  • The computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or any combination thereof. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection having one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. As used herein, a computer-readable storage medium may be any tangible medium that contains or stores a program for use by or in connection with an instruction execution system, apparatus, or device.
  • the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such propagated data signals may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the above.
  • A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.
  • Program code embodied on a computer-readable storage medium may be transmitted using any suitable medium, including but not limited to: wireless, wire, optical cable, RF, etc., or any suitable combination of the above.
  • Computer program code for performing the operations of the present application may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk and C++, as well as conventional procedural programming languages such as "C" or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server.
  • The remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or can be connected to an external computer (for example, through the Internet using an Internet service provider).
  • Each block in the flowcharts or block diagrams may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions.
  • the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown one after another may actually execute substantially in parallel, or they may sometimes execute in the reverse order, depending on the functionality involved.
  • Each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by special-purpose hardware-based systems that perform the specified functions or operations, or by a combination of special-purpose hardware and computer instructions.
  • this application also provides a computer-readable storage medium.
  • The computer-readable storage medium may be included in the electronic device described in the above embodiments, or it may exist independently without being assembled into the electronic device.
  • The computer-readable storage medium carries one or more programs. When the one or more programs are executed by the electronic device, the electronic device: extracts a person's social text and social image information and converts them into text features and image features respectively; computes person intimacy statistics and builds a person social network graph based on the person intimacy; inputs the text features and image features into a transformer-based multi-modal fusion model to obtain fusion features; and uses the Si-SCAN graph clustering algorithm to analyze the person social network graph to obtain social relationship clustering results.
  • the Si-SCAN graph clustering algorithm introduces personal intimacy and fusion feature information based on the SCAN algorithm.
  • The above embodiments design a deep social relationship analysis method and system based on multi-modal data fusion, which make full use of the text and image information in social networks to depict the potential social relationships between different people more comprehensively and accurately, evaluate relationship strength, and effectively mine the potential social connections in the network.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Multimedia (AREA)
  • Business, Economics & Management (AREA)
  • General Business, Economics & Management (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Tourism & Hospitality (AREA)
  • Strategic Management (AREA)
  • Primary Health Care (AREA)
  • Marketing (AREA)
  • Human Resources & Organizations (AREA)
  • Economics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present invention proposes a social relationship analysis method based on multi-modal data, including: S1, extracting a person's social text and social image information, converting them into text features and image features respectively, computing person intimacy statistics, and building a person social network graph based on person intimacy; S2, inputting the text features and image features into a transformer-based multi-modal fusion model to obtain fusion features; S3, analyzing the person social network graph with the Si-SCAN graph clustering algorithm to obtain social relationship clustering results, where the Si-SCAN graph clustering algorithm is constructed by introducing person intimacy and fusion feature information on the basis of the SCAN algorithm. Based on information from the two modalities of text and image, the invention analyzes social relationships in depth; through the design of a multi-modal information fusion model, it learns the cross-modal interactions and generates multi-modal fused graph-node embedding representations. Through graph clustering analysis, deep relationship analysis of the social network is realized, and potential social correlations can be effectively discovered.

Description

Social relationship analysis method, system and storage medium based on multi-modal data
This PCT application claims priority to the earlier Chinese application No. CN202210971424.7 filed on August 12, 2022, the entire contents of which are hereby incorporated herein by reference.
Technical Field
This application belongs to the technical field of social relationship analysis, and in particular relates to a social relationship analysis method, system and storage medium based on multi-modal data.
Background Art
With the rapid development of Internet technology, online social applications and media have spread rapidly, and people's daily lives have become increasingly inseparable from all kinds of social information. A social network is also a network of relationships between people; by analyzing the various kinds of information propagated in a social network, the social relationships between users can be discovered. Methods for predicting social relationship links fall mainly into two categories: those based on similarity measures and those based on probabilistic graphical models.
Similarity-based social relationship mining predicts relationship links by computing the similarity between nodes: the higher the similarity between two nodes, the more likely a link is to form. Liben-Nowell et al. used an unsupervised learning method [Liben-Nowell D, Kleinberg J. The link-prediction problem for social networks[J]. Journal of the American Society for Information Science and Technology, 2007, 58(7): 1019-1031], exploiting the similarity of the content and structure published by users and computing node-pair similarity based on the homophily principle of network nodes to perform link prediction of social relationships. Lichtenwalter et al. [Lichtenwalter R N, Chawla N V. Vertex collocation profiles: subgraph counting for link analysis and prediction[C]//Proceedings of the 21st International Conference on World Wide Web. 2012: 1019-1028] performed topological link analysis and prediction based on the concept of vertex collocation profiles (VCP), taking the local structural information around node pairs into account. De et al. [De A, Ganguly N, Chakrabarti S. Discriminative link prediction using local links, node features and community structure[C]//2013 IEEE 13th International Conference on Data Mining. IEEE, 2013: 1009-1018] combined global attributes, local attributes and the connection density of the community middle layer to perform link prediction of social relationships from multiple dimensions.
Social relationship mining based on probabilistic graphical models uses Bayesian graphical models to model the joint probability between nodes; however, directly applying Bayesian graphical models makes it difficult to mine the complex relationships hidden in social networks. Wang et al. proposed a local probabilistic graphical model [Wang C, Satuluri V, Parthasarathy S. Local probabilistic models for link prediction[C]//Seventh IEEE International Conference on Data Mining (ICDM 2007). IEEE, 2007: 322-331], which discovers hidden social relationships by estimating the joint co-occurrence probability of network node pairs. Zhu et al. [Zhu Y, Yan X, Getoor L, et al. Scalable text and link analysis with mixed-topic link models[C]//Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2013: 473-481], based on the idea of topic models and combined with the mixed-membership block model, proposed a mixed-topic link model, achieving unsupervised topic classification as well as relational prediction of social relationships. Zhang et al. [Zhang J, Wang C, Yu P S, et al. Learning latent friendship propagation networks with interest awareness for link prediction[C]//Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval. 2013: 63-72] proposed a latent friendship propagation network (LFPN) to describe the ego-centered evolution of networks, used it to model individuals' social behavior, and treated link information as the combined result of friendship and interest.
The existing social relationship analysis methods above usually analyze the text information in a social network together with the network structure, then measure relationships over the text and structural features to discover social relationships in the network. Social information such as image information is underused, and the construction of relationships between social figures lacks multi-dimensional characterization.
Summary of the Invention
In view of the above problems, this application proposes a social relationship analysis method based on multi-modal data, including the following steps:
S1: extracting a person's social text and social image information, converting them into text features and image features respectively, computing person intimacy statistics, and building a person social network graph based on the person intimacy;
S2: inputting the text features and image features into a transformer-based multi-modal fusion model to obtain fusion features;
S3: analyzing the person social network graph with the Si-SCAN graph clustering algorithm to obtain social relationship clustering results, where the Si-SCAN graph clustering algorithm is constructed by introducing person intimacy and fusion feature information on the basis of the SCAN algorithm.
The above solution analyzes social relationships in depth based on information from the two modalities of text and image. Through the design of a multi-modal information fusion model, the interactions across modalities are learned and multi-modal fused graph-node embedding representations are generated. Through graph clustering analysis, deep relationship analysis of the social network is realized and potential social correlations can be effectively discovered.
Preferably, S2 includes: concatenating the text features and image features and inputting them into the encoder of the multi-modal fusion model. Implementing the information interaction between modalities as a single-stream model reduces the amount of computation.
Preferably, S2 further includes:
constructing a text-image feature pair Z0 = [CLS, E, SEP, Q], where E is the text feature of person i, Q is the image feature of person i, [CLS] is an identifier token and [SEP] is a separator token;
constructing a text position encoding and an image position encoding;
adding the text features to the text position encoding and the image features to the image position encoding respectively, and inputting the result into the encoder;
taking the output vector at the identifier position of the last encoder layer as the fusion feature vector z.
The above solution combines the image and text modalities to design a transformer-based multi-modal fusion model, performs deep interaction and fusion of the two modalities' data, and generates the embedding representations of nodes for the subsequent graph clustering analysis. With the complementary learning of multi-modal information, the accuracy of the subsequent graph clustering results can be improved.
Further, building the person social network graph in S1 includes using persons as the nodes of the person social network graph; the nodes corresponding to two persons appearing in the same social image are connected by an undirected edge.
Preferably, the node similarity in the Si-SCAN graph clustering algorithm in S3 includes structural similarity, person intimacy and fusion feature similarity.
Further, the structural similarity is the ratio of the number of common neighbors of the two nodes to the geometric mean, expressed as:
σ1(v, w) = |Γ(v) ∩ Γ(w)| / √(|Γ(v)| · |Γ(w)|)
where σ1(v, w) is the structural similarity of node v and node w, and Γ(v) and Γ(w) are the sets of neighbor nodes of nodes v and w respectively;
the person intimacy is expressed as:
σ2(v, w) = α·p(v, w)
where σ2(v, w) is the person intimacy of node v and node w, p(v, w) is the number of co-occurrences of the two nodes in social images, and α is an adjustment coefficient;
the fusion feature similarity σ3(v, w) is computed from the fusion feature vectors zv and zw of nodes v and w;
the node similarity σ(v, w) of node v and node w is expressed as:
σ(v, w) = σ1(v, w) + σ2(v, w) + σ3(v, w).
The above solution designs a Si-SCAN clustering method, which introduces person intimacy and multi-modal feature measurements on the basis of SCAN [Xu X, Yuruk N, Feng Z, et al. Scan: a structural clustering algorithm for networks[C]//Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2007: 824-833]. It considers not only the structural data of network nodes but also multi-modal fusion information and image association information, and designs a node similarity measure combining multiple dimensions, which better captures the relationship between node characteristics and local structure in the graph data and thus optimizes the clustering results.
Preferably, S1 uses a BiLSTM-CRF model to extract text label information from the social texts, and uses a word-vector model to convert the text label information into text features.
Preferably, S1 uses the FACENet model to extract image features.
In a second aspect, this application proposes a social relationship analysis system based on multi-modal data, including:
an information extraction and feature conversion module configured to extract a person's social text and social image information, convert them into text features and image features respectively, compute person intimacy statistics, and build a person social network graph based on the person intimacy;
a multi-modal data fusion module configured to input the text features and image features into a transformer-based multi-modal fusion model to obtain fusion features; and
a social relationship clustering module configured to analyze the person social network graph with the Si-SCAN graph clustering algorithm to obtain social relationship clustering results, the Si-SCAN graph clustering algorithm being constructed by introducing person intimacy and fusion feature information on the basis of the SCAN algorithm.
In a third aspect, this application proposes a computer-readable storage medium for social relationship analysis based on multi-modal data, on which one or more computer programs are stored; when executed by a computer processor, the one or more computer programs implement any one of the above methods.
The present invention combines named entity recognition, face recognition, multi-modal fusion, graph clustering and other techniques to build a social relationship mining framework in which the method used by each module is extensible and replaceable and can be flexibly applied to other relationship mining scenarios. Specifically, by designing a transformer-based multi-modal fusion model, information from different modalities is used to mine the social relationships between persons, and the features of the different modalities complement and fuse with each other, making up for the missing information of a single modality, reducing information redundancy and enabling effective learning of multi-modal fused feature representations. By designing the Si-SCAN graph clustering method, which combines structural similarity, multi-modal feature similarity and person intimacy on the basis of the SCAN algorithm, multi-modal and multi-dimensional information is fully used to optimize the similarity measure, improving the clustering results and helping to mine the potential social associations in the network.
Brief Description of the Drawings
The accompanying drawings help to further understand this application. The elements in the drawings are not necessarily to scale with respect to each other; for ease of description, only the parts related to the invention are shown.
Figure 1 is a schematic flowchart of a social relationship analysis method based on multi-modal data in one embodiment;
Figure 2 is a framework diagram of multi-modality-based social relationship analysis in another embodiment;
Figure 3 is the model structure of the named entity recognition model, BiLSTM plus conditional random field (CRF), in another embodiment;
Figure 4 is a schematic structural diagram of a facial feature extraction model in another embodiment;
Figure 5 is a schematic structural diagram of a transformer multi-modal fusion model in another embodiment;
Figure 6 is a schematic structural diagram of a social relationship analysis system based on multi-modal data in another embodiment;
Figure 7 is a schematic structural diagram of a computer system suitable for implementing an electronic device of an embodiment of this application, in another embodiment.
Detailed Description
This application is further described in detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the relevant invention and do not limit it.
Figure 1 is a schematic flowchart of a social relationship analysis method based on multi-modal data in one embodiment, which specifically includes:
S1: extracting a person's social text and social image information, converting them into text features and image features respectively, computing person intimacy statistics, and building a person social network graph based on the person intimacy;
S2: inputting the text features and image features into a transformer-based multi-modal fusion model to obtain fusion features;
S3: analyzing the person social network graph with the Si-SCAN graph clustering algorithm to obtain social relationship clustering results; the Si-SCAN graph clustering algorithm introduces person intimacy and fusion feature information on the basis of the SCAN algorithm.
Figure 2 is a framework diagram of multi-modality-based social relationship analysis in one embodiment. The scheme is mainly divided into three parts:
1. Social information collection. Named entity recognition and face recognition are used to automatically extract the text and image information in the social network, and a word-vector model and a face recognition algorithm convert the information of the two modalities into feature vectors.
2. Graph node embedding representation. A transformer-based multi-modal fusion model is designed to interactively learn the information across modalities and output multi-modal fusion features as the embedding representations of the graph nodes.
3. Graph clustering analysis. Combining the network structure, image information and multi-modal graph-node features, the similarity measure Si-SCAN is designed to partition the social network and mine the social relationships between persons.
In a specific embodiment, the analysis process of the social relationship analysis method based on multi-modal data includes:
1. Text data labels. All kinds of social information related to persons are collected from the social network. For ease of representation, all persons in the collected data are numbered 1, ..., N, one person per number. First, all text content related to a user is extracted, and key text information is extracted from four aspects (education background, living area, career direction and hobbies), specifically eight kinds of information that characterize the user: academic qualification, school, workplace, birthplace, industry, job position, skill tags and interest tags.
2. Entity label extraction. Named entity recognition (NER) is used to automatically identify entities such as person names, place names, organization names, times and dates in massive text. Figure 3 is a schematic diagram of the model structure of the classic named entity recognition model, BiLSTM plus conditional random field (CRF), chosen in this embodiment. Each character in the text is converted into a character feature c, using random initialization or a pre-trained model, and the NER model outputs the predicted label of each character. This embodiment uses BIO sequence labeling. The extracted entities are matched to the preset text data labels; each label type holds no more than 8 entities and is padded with the [UNK] token when there are fewer, giving 64 entity label entries.
3. Text feature representation. The word2vec [Mikolov T, Chen K, Corrado G, et al. Efficient estimation of word representations in vector space[J]. arXiv preprint arXiv:1301.3781, 2013] word-vector model converts the entity label information into text features, denoted E1, ..., EN, where Ei = [e1, ..., ek], e is the word vector corresponding to an entity label, k is the maximum number of entity labels and is taken as 64, and the feature dimension de is 128.
4. Person intimacy representation. Group photos of persons are collected from the social network platform. To represent how closely the persons in the group photos are related, the number of times each pair of persons appears together across all group photos is counted and used as the person intimacy.
5. Face feature extraction. The group-photo information is converted into image features. After deduplicating all images, face recognition is used to extract the facial features of every person appearing in the group photos. The face recognition model yields the face features in different group photos, Q1, ..., QN, where Qi = [q1, ..., qm], m ≤ 8, q is the face feature vector of person i in one group photo, and m is the maximum number of group photos per person. If a person appears in fewer than 8 group photos, random vectors are used to pad the features; if in more than 8, the photos are sorted by the number of persons in them from high to low and the top 8 are used for face feature extraction. This embodiment uses the FACENet [Schroff F, Kalenichenko D, Philbin J. Facenet: A unified embedding for face recognition and clustering[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015: 815-823] model to extract face features, with a feature dimension dq of 128. Figure 4 is a schematic structural diagram of the face feature extraction model chosen in this embodiment.
6. Social network graph construction. Each person represents a node; whenever two persons appear in the same group photo, the corresponding nodes are connected by an undirected edge, building an undirected social network graph.
7. Graph node embedding representation. This embodiment combines the image and text modalities to design a transformer-based multi-modal fusion model, performs deep interaction and fusion of the two modalities' data, and generates the embedding representation of each node. To reduce computation, the text and image features are concatenated and input into the transformer encoder, implementing the information interaction between modalities as a single-stream model.
Figure 5 is a schematic structural diagram of the transformer multi-modal fusion model in this embodiment. For any person i, a text-image feature pair is constructed:
Z0 = [CLS, E, SEP, Q]
where E is the text feature of person i, Q is the image feature of person i, [CLS] is an identifier token and [SEP] is a separator token.
Then the position encodings of text and image are constructed. The text position encoding is the embedding of the positions of the 8 input entity-label sequences, and the image position encoding is the embedding of each image's index in the image data set. Randomly initialized vectors at the different positions serve as the position encodings, whose dimensions match those of the text and image features.
The text features are added to the text position encoding and the image features to the image position encoding, and the result is input into the transformer encoder for interactive learning across the text and image modalities. Through iterative updates over multiple transformer layers, the correlations between the face features and the text features in different scenes are learned; the output vector of the d-th encoder layer is
Zd = Transformer(Zd-1) + Zd-1, d = 1, ..., 6
The output vector at the identifier [CLS] position of the last encoder layer is taken as the multi-modal fusion feature representation z. In a preferred embodiment, the hidden layer size is set to 768, the number of attention heads to 12 and the number of transformer layers to 6. The multi-modal feature fusion model is trained with the image-text matching (ITM) task: the input image-text pairs are randomly replaced, and the model predicts whether the input image and text correspond, continuously optimizing the model.
8. Graph clustering analysis. A graph clustering method is used to partition the social network graph and mine the social relationships between persons. This embodiment designs a Si-SCAN clustering method in which the node similarity includes structural similarity, person intimacy and fusion feature similarity.
The structural similarity is the ratio of the number of common neighbors of the two nodes to the geometric mean:
σ1(v, w) = |Γ(v) ∩ Γ(w)| / √(|Γ(v)| · |Γ(w)|)
where σ1(v, w) is the structural similarity of node v and node w, and Γ(v) and Γ(w) are the sets of neighbor nodes of nodes v and w respectively;
the person intimacy is expressed as:
σ2(v, w) = α·p(v, w)
where σ2(v, w) is the person intimacy of node v and node w, p(v, w) is the number of co-occurrences of the two nodes in social images, and α is an adjustment coefficient, set to 0.1 in this embodiment;
the fusion feature similarity σ3(v, w) is computed from the fusion feature vectors zv and zw of nodes v and w;
finally, the node similarity σ(v, w) of node v and node w is
σ(v, w) = σ1(v, w) + σ2(v, w) + σ3(v, w)
In the Si-SCAN algorithm, the neighbor set of node v is defined as
N(v) = {w ∈ Γ(v) | σ(v, w) ≥ ε}
and the core nodes are defined as in SCAN, i.e. nodes whose neighbor set N(v) contains at least μ nodes.
Like SCAN, Si-SCAN clusters all nodes of the social network graph by computing the neighbor nodes and core nodes in the graph.
In a specific embodiment, the Si-SCAN algorithm is executed as follows:
1) At initialization, all nodes in the graph are marked as unassigned;
2) For each unassigned node v, whether v is a core node is determined according to the similarity definition. If it is not a core node, it is marked as a non-member; if it is a core node, a new cluster is expanded, and the nodes marked as unassigned or non-member among its directly reachable nodes are assigned to that cluster; this is repeated until every node has been traversed;
3) A node marked as a non-member is judged to be a bridge node if it is connected to two different clusters, and an outlier otherwise.
9. Output of the social relationship analysis results. Si-SCAN produces the clustering analysis result of the social network graph: persons with social associations are grouped into the same cluster, so the potential social relationships in the social network can be mined; the output of bridge nodes and outliers also supports subsequent in-depth analysis of social network relationships.
Figure 6 is a schematic diagram of a structure 600 of a social relationship analysis system based on multi-modal data in another embodiment of this application, including:
an information extraction and feature conversion module 601 configured to extract a person's social text and social image information, convert them into text features and image features respectively, compute person intimacy statistics, and build a person social network graph based on the person intimacy;
a multi-modal data fusion module 602 configured to input the text features and image features into a transformer-based multi-modal fusion model to obtain fusion features;
a social relationship clustering module 603 configured to analyze the person social network graph with the Si-SCAN graph clustering algorithm to obtain social relationship clustering results, the Si-SCAN graph clustering algorithm introducing person intimacy and fusion feature information on the basis of the SCAN algorithm.
Figure 7 shows a schematic structural diagram of a computer system 700 suitable for implementing an electronic device according to an embodiment of this application. The electronic device is only an example and should not impose any limitation on the functions or scope of use of the embodiments of this application.
As shown in Figure 7, the computer system 700 includes a central processing unit (CPU) 701, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 702 or a program loaded from a storage section 708 into a random access memory (RAM) 703. The RAM 703 also stores the various programs and data required for the operation of the system 700. The CPU 701, the ROM 702 and the RAM 703 are connected to one another through a bus 704. An input/output (I/O) interface 705 is also connected to the bus 704.
The following components are connected to the I/O interface 705: an input section 706 including a keyboard, a mouse and the like; an output section 707 including a liquid crystal display (LCD), speakers and the like; a storage section 708 including a hard disk and the like; and a communication section 709 including a network interface card such as a LAN card or a modem. The communication section 709 performs communication processing via a network such as the Internet. A drive 710 is also connected to the I/O interface 705 as needed. A removable medium 711, such as a magnetic disk, an optical disk, a magneto-optical disk or a semiconductor memory, is mounted on the drive 710 as needed, so that a computer program read from it can be installed into the storage section 708 as needed.
In particular, according to embodiments of the present disclosure, the process described above with reference to the flowchart can be implemented as a computer software program. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer-readable storage medium, the computer program containing program code for performing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 709 and/or installed from the removable medium 711. When the computer program is executed by the central processing unit (CPU) 701, the above functions defined in the method of this application are performed. It should be noted that the computer-readable storage medium of this application may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. The computer-readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In this application, a computer-readable storage medium may be any tangible medium that contains or stores a program for use by or in connection with an instruction execution system, apparatus or device. In this application, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code; such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate or transmit a program for use by or in connection with an instruction execution system, apparatus or device. The program code contained on a computer-readable storage medium may be transmitted by any suitable medium, including but not limited to wireless, wire, optical cable, RF and the like, or any suitable combination of the above.
Computer program code for carrying out the operations of this application may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the drawings illustrate the architectures, functions and operations of possible implementations of systems, methods and computer program products according to various embodiments of this application. In this regard, each block in the flowcharts or block diagrams may represent a module, program segment or portion of code, which contains one or more executable instructions for implementing the specified logical functions. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures; for example, two blocks shown in succession may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
In another aspect, this application also provides a computer-readable storage medium, which may be included in the electronic device described in the above embodiments or may exist independently without being assembled into the electronic device. The computer-readable storage medium carries one or more programs; when the one or more programs are executed by the electronic device, the electronic device: extracts a person's social text and social image information and converts them into text features and image features respectively; computes person intimacy statistics and builds a person social network graph based on the person intimacy; inputs the text features and image features into a transformer-based multi-modal fusion model to obtain fusion features; and analyzes the person social network graph with the Si-SCAN graph clustering algorithm to obtain social relationship clustering results, the Si-SCAN graph clustering algorithm introducing person intimacy and fusion feature information on the basis of the SCAN algorithm.
The above embodiments design a deep social relationship analysis method and system based on multi-modal data fusion, which make full use of the text and image information in social networks to depict the potential social relationships between different persons more comprehensively and accurately and to evaluate relationship strength, effectively mining the potential social associations in the network.
Although the content of this application has been shown and described in detail with reference to preferred embodiments, those skilled in the art should understand that various changes made to this application in form and detail, without departing from the spirit and scope of this application as defined by the appended claims and without inventive effort, all fall within the scope of protection of this application.

Claims (10)

  1. A social relationship analysis method based on multi-modal data, characterized by comprising the following steps:
    S1: extracting a person's social text and social image information, converting them into text features and image features respectively, computing person intimacy statistics, and building a person social network graph based on the person intimacy;
    S2: inputting the text features and image features into a transformer-based multi-modal fusion model to obtain fusion features; and
    S3: analyzing the person social network graph with a Si-SCAN graph clustering algorithm to obtain social relationship clustering results, the Si-SCAN graph clustering algorithm being constructed by introducing the person intimacy and the fusion feature information on the basis of the SCAN algorithm.
  2. The social relationship analysis method based on multi-modal data according to claim 1, characterized in that S2 comprises: concatenating the text features and the image features and inputting them into an encoder of the multi-modal fusion model.
  3. The social relationship analysis method based on multi-modal data according to claim 2, characterized in that S2 further comprises:
    constructing a text-image feature pair Z0 = [CLS, E, SEP, Q], where E is the text feature of person i, Q is the image feature of person i, [CLS] is an identifier and [SEP] is a separator;
    constructing a text position encoding and an image position encoding;
    adding the text features to the text position encoding and the image features to the image position encoding respectively, and inputting them into the encoder; and
    taking the output vector at the identifier position of the last encoder layer as the fusion feature vector z.
  4. The social relationship analysis method based on multi-modal data according to claim 1, characterized in that building the person social network graph in S1 specifically comprises: using persons as nodes of the person social network graph, the nodes corresponding to two persons appearing in the same social image being connected by an undirected edge.
  5. The social relationship analysis method based on multi-modal data according to claim 1, characterized in that the node similarity in the Si-SCAN graph clustering algorithm in S3 comprises structural similarity, person intimacy and fusion feature similarity.
  6. The social relationship analysis method based on multi-modal data according to claim 5, characterized in that the structural similarity is the ratio of the number of common neighbors of two nodes to the geometric mean, expressed as:
    σ1(v, w) = |Γ(v) ∩ Γ(w)| / √(|Γ(v)| · |Γ(w)|)
    where σ1(v, w) is the structural similarity of node v and node w, and Γ(v) and Γ(w) are the sets of neighbor nodes of nodes v and w respectively;
    the person intimacy is expressed as:
    σ2(v, w) = α·p(v, w)
    where σ2(v, w) is the person intimacy of node v and node w, p(v, w) is the number of co-occurrences of the two nodes in social images, and α is an adjustment coefficient;
    the fusion feature similarity σ3(v, w) is computed from the fusion feature vectors zv and zw of nodes v and w; and
    the node similarity σ(v, w) of node v and node w is expressed as:
    σ(v, w) = σ1(v, w) + σ2(v, w) + σ3(v, w).
  7. The social relationship analysis method based on multi-modal data according to claim 1, characterized in that S1 specifically comprises:
    using a BiLSTM-CRF model to extract text label information from the social texts; and
    using a word-vector model to convert the text label information into text features.
  8. The social relationship analysis method based on multi-modal data according to claim 1, characterized in that S1 specifically comprises:
    using the FACENet model to extract image features.
  9. A social relationship analysis system based on multi-modal data, characterized by comprising:
    an information extraction and feature conversion module configured to extract a person's social text and social image information, convert them into text features and image features respectively, compute person intimacy statistics, and build a person social network graph based on the person intimacy;
    a multi-modal data fusion module configured to input the text features and image features into a transformer-based multi-modal fusion model to obtain fusion features; and
    a social relationship clustering module configured to analyze the person social network graph with a Si-SCAN graph clustering algorithm to obtain social relationship clustering results, the Si-SCAN graph clustering algorithm introducing the person intimacy and the fusion feature information on the basis of the SCAN algorithm.
  10. A computer-readable storage medium for social relationship analysis based on multi-modal data, on which one or more computer programs are stored, characterized in that the one or more computer programs, when executed by a computer processor, implement the method of any one of claims 1 to 8.
PCT/CN2023/072957 2022-08-12 2023-01-18 Social relationship analysis method, system and storage medium based on multi-modal data WO2024031933A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210971424.7 2022-08-12
CN202210971424.7A CN115293920A (zh) 2022-08-12 2022-08-12 Social relationship analysis method, system and storage medium based on multi-modal data

Publications (1)

Publication Number Publication Date
WO2024031933A1 true WO2024031933A1 (zh) 2024-02-15

Family

ID=83829716

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/072957 WO2024031933A1 (zh) 2022-08-12 2023-01-18 Social relationship analysis method, system and storage medium based on multi-modal data

Country Status (3)

Country Link
CN (1) CN115293920A (zh)
WO (1) WO2024031933A1 (zh)
ZA (1) ZA202305628B (zh)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115293920A (zh) * 2022-08-12 2022-11-04 厦门市美亚柏科信息股份有限公司 一种基于多模态数据的社交关系分析方法、***和存储介质
CN115809432B (zh) * 2022-11-21 2024-02-13 中南大学 人群社会关系提取方法、设备及存储介质
CN116758402B (zh) * 2023-08-16 2023-11-28 中国科学技术大学 图像人物关系识别方法、***、设备及存储介质
CN117521017B (zh) * 2024-01-03 2024-04-05 支付宝(杭州)信息技术有限公司 一种获取多模态特征方法和装置

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120087548A1 (en) * 2010-10-12 2012-04-12 Peng Wu Quantifying social affinity from a plurality of images
US20160019411A1 (en) * 2014-07-15 2016-01-21 Palo Alto Research Center Incorporated Computer-Implemented System And Method For Personality Analysis Based On Social Network Images
CN112508077A (zh) * 2020-12-02 2021-03-16 齐鲁工业大学 Social media sentiment analysis method and system based on multi-modal feature fusion
CN114820011A (zh) * 2021-01-21 2022-07-29 腾讯科技(深圳)有限公司 User group clustering method, apparatus, computer device and storage medium
CN115293920A (zh) * 2022-08-12 2022-11-04 厦门市美亚柏科信息股份有限公司 Social relationship analysis method, system and storage medium based on multi-modal data

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120087548A1 (en) * 2010-10-12 2012-04-12 Peng Wu Quantifying social affinity from a plurality of images
US20160019411A1 (en) * 2014-07-15 2016-01-21 Palo Alto Research Center Incorporated Computer-Implemented System And Method For Personality Analysis Based On Social Network Images
CN112508077A (zh) * 2020-12-02 2021-03-16 齐鲁工业大学 Social media sentiment analysis method and system based on multi-modal feature fusion
CN114820011A (zh) * 2021-01-21 2022-07-29 腾讯科技(深圳)有限公司 User group clustering method, apparatus, computer device and storage medium
CN115293920A (zh) * 2022-08-12 2022-11-04 厦门市美亚柏科信息股份有限公司 Social relationship analysis method, system and storage medium based on multi-modal data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
JIA-ZHOU CHEN, CHEN ZHANG-ZHANG, QIN XU-JIA: "Photo-based Social Relationship Visualization Method", JOURNAL OF CHINESE COMPUTER SYSTEMS, vol. 41, no. 10, 15 October 2020 (2020-10-15), pages 2194 - 2199, XP093137148 *

Also Published As

Publication number Publication date
CN115293920A (zh) 2022-11-04
ZA202305628B (en) 2023-12-20

Similar Documents

Publication Publication Date Title
Zheng et al. Knowledge base graph embedding module design for Visual question answering model
Lian et al. xdeepfm: Combining explicit and implicit feature interactions for recommender systems
WO2024031933A1 (zh) 一种基于多模态数据的社交关系分析方法、***和存储介质
Zhang et al. Network representation learning: A survey
CN112966127B (zh) 一种基于多层语义对齐的跨模态检索方法
Yin et al. DHNE: Network representation learning method for dynamic heterogeneous networks
Vadicamo et al. Cross-media learning for image sentiment analysis in the wild
Khoshraftar et al. A survey on graph representation learning methods
Wang et al. Multi-task learning based network embedding
CN113761250A (zh) 模型训练方法、商户分类方法及装置
Moyano Learning network representations
Gao et al. Multi-scale features based interpersonal relation recognition using higher-order graph neural network
Wu et al. Semi-supervised cross-modal hashing via modality-specific and cross-modal graph convolutional networks
Maurya et al. Deceptive opinion spam detection approaches: a literature survey
CN116737979A (zh) 基于上下文引导多模态关联的图像文本检索方法及***
Ding et al. User identification across multiple social networks based on naive Bayes model
Wei et al. Sentiment classification of tourism reviews based on visual and textual multifeature fusion
Huang et al. A network representation learning method fusing multi-dimensional classification information of nodes
Tu et al. PhraseMap: Attention-based keyphrases recommendation for information seeking
Huang et al. Design knowledge graph-aided conceptual product design approach based on joint entity and relation extraction
Fan et al. Predicting image emotion distribution by learning labels’ correlation
Agarwal et al. From methods to datasets: A survey on Image-Caption Generators
Yuan et al. Sign prediction on unlabeled social networks using branch and bound optimized transfer learning
Wang et al. Doufu: a double fusion joint learning method for driving trajectory representation
CN115169285A (zh) 一种基于图解析的事件抽取方法及***

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23851169

Country of ref document: EP

Kind code of ref document: A1