KR20220054657A

KR20220054657A - Interaction relationship recognition method, apparatus, device and storage medium

Info

Publication number: KR20220054657A
Application number: KR1020227010608A
Authority: KR
Inventors: 웨 랴오; 옌제 천; 페이 왕; 천 첸
Original assignee: 상하이 센스타임 린강 인텔리전트 테크놀로지 컴퍼니 리미티드
Priority date: 2020-02-18
Filing date: 2021-02-09
Publication date: 2022-05-03
Also published as: CN111325141A; CN111325141B; JP2023514322A; WO2021164662A1

Abstract

The present invention relates to an interaction relationship recognition method, apparatus, device and storage medium. The method comprises: detecting an image to be processed to obtain a human body detection result and an object detection result; determining, based on the human body detection result, each body part region corresponding to a human body; constructing a spatial position relationship graph of the human body and an object based on body part nodes corresponding to the body part regions and an object node corresponding to an object region, wherein the object region is the region corresponding to an object detection box in the object detection result, and the spatial position relationship graph includes feature information of each body part node, feature information of the object node, and positional relationship information between each body part node and the object node; and determining an interaction relationship between the human body and the object based on the spatial position relationship graph of the human body and the object.

Description

Interaction relationship recognition method, apparatus, device and storage medium

The present invention belongs to the field of computer vision and relates in particular to an interaction relationship recognition method, apparatus, device and storage medium.

Recognizing interaction relationships between people and objects has broad application prospects in fields such as intelligent urban surveillance and intelligent home monitoring. While interacting with an object, a person takes on varied postures and stands in varied positional relationships to the object, so interaction relationship recognition needs to fully exploit this information in order to fully understand the interaction between the person and the object.

In recent years, deep learning has made great progress in understanding single objects, but research on understanding the relationships between people and objects is still at an early stage.

Embodiments of the present invention provide an interaction relationship recognition solution.

According to one aspect of the present invention, an interaction relationship recognition method is provided, comprising: detecting an image to be processed to obtain a human body detection result and an object detection result; determining, based on the human body detection result, each body part region corresponding to a human body in the image to be processed; determining, based on the object detection result, an object region corresponding to an object in the image to be processed, wherein the object region is the region corresponding to an object detection box in the object detection result; determining a spatial position relationship graph of the human body and the object based on body part nodes corresponding to the body part regions and an object node corresponding to the object region, wherein the spatial position relationship graph includes feature information of each body part node, feature information of the object node, and positional relationship information between each body part node and the object node; and determining an interaction relationship between the human body and the object based on the spatial position relationship graph of the human body and the object.

In combination with any embodiment provided by the present invention, determining, based on the human body detection result, each body part region corresponding to the human body in the image to be processed includes: obtaining feature information contained in a human body detection box in the human body detection result; obtaining human body key points of the human body based on the feature information; connecting the human body key points based on human skeleton information to obtain connection information; and determining each body part region based on the human body key points and the connection information.

In combination with any embodiment provided by the present invention, determining each body part region based on the human body key points and the connection information includes at least one of: determining one body part region based on a plurality of human body key points that are connected to one another; or determining one body part region centered on one of the plurality of human body key points.
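The two ways of deriving a body part region can be illustrated with a minimal sketch. The function names, the rectangular region shape, the `margin` and `half_size` parameters, and the key point coordinates are all assumptions made for illustration; the specification does not prescribe them.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def region_from_connected(points: List[Point], margin: float = 0.0) -> Box:
    """Region enclosing several mutually connected key points, padded by margin."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs) - margin, min(ys) - margin, max(xs) + margin, max(ys) + margin)


def region_around(center: Point, half_size: float) -> Box:
    """Square region centered on a single key point."""
    cx, cy = center
    return (cx - half_size, cy - half_size, cx + half_size, cy + half_size)


# Hypothetical key points of one person, in pixel coordinates.
keypoints: Dict[str, Point] = {
    "right_shoulder": (120.0, 80.0),
    "right_elbow": (140.0, 130.0),
    "right_wrist": (150.0, 180.0),
}

# Upper arm: region built from two connected key points.
upper_arm = region_from_connected(
    [keypoints["right_shoulder"], keypoints["right_elbow"]], margin=10.0
)
# Hand: region centered on one key point.
hand = region_around(keypoints["right_wrist"], half_size=20.0)
```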

In combination with any embodiment provided by the present invention, determining the spatial position relationship graph of the human body and the object based on the body part nodes corresponding to the body part regions and the object node corresponding to the object region includes: performing dimension reduction on the feature information of each body part region to obtain the feature information of the corresponding body part node; performing dimension reduction on the feature information of the object region to obtain the feature information of the object node; for the same human body, connecting the body part nodes based on human skeleton information; and connecting the object node to the body part nodes to obtain the spatial position relationship graph of the human body and the object, wherein the feature information of an edge formed by connecting one object node and one body part node includes positional relationship information between the object node and the body part node connected by that edge.

In combination with any embodiment provided by the present invention, connecting the object node to the body part nodes includes: for each object node, connecting to the object node each of a set number of body part nodes that are closest to the object node.
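A minimal sketch of linking an object node to its nearest body part nodes follows. It assumes that distance is the Euclidean distance between region centers and that the positional relationship carried by an edge is the coordinate offset between the two centers; both choices, along with the function name and sample coordinates, are illustrative assumptions rather than details stated in the specification.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def link_object_to_parts(
    obj_center: Point, part_centers: Dict[str, Point], k: int
) -> List[Tuple[str, Point]]:
    """Return (part_name, offset) edges for the k part nodes closest to the object."""
    ranked = sorted(
        part_centers.items(), key=lambda item: math.dist(obj_center, item[1])
    )
    edges = []
    for name, (px, py) in ranked[:k]:
        # Edge feature (assumed): offset from the body part center to the object center.
        edges.append((name, (obj_center[0] - px, obj_center[1] - py)))
    return edges


# Hypothetical region centers for one person and one detected phone.
parts = {
    "head": (100.0, 40.0),
    "right_hand": (150.0, 180.0),
    "left_foot": (90.0, 400.0),
}
phone = (110.0, 50.0)
edges = link_object_to_parts(phone, parts, k=2)
```

With `k=2`, the phone is linked to the head and the right hand, while the distant foot is left unconnected.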

In combination with any embodiment provided by the present invention, after the spatial position relationship graph of the human body and the object is obtained, the method further includes: for each body part node, updating the feature information of the body part node using the feature information of one or more adjacent body part nodes and the feature information of the edges connecting the body part node to those adjacent body part nodes.
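The update of a body part node from its neighbors and connecting edges can be sketched as one round of message passing. The specific rule used here (message = neighbor feature plus edge feature, mean aggregation, then averaging with the node's own feature) is an assumption chosen for simplicity; the specification states only which inputs the update uses, not the formula.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def update_part_nodes(
    features: Dict[str, Vector], edges: Dict[Tuple[str, str], Vector]
) -> Dict[str, Vector]:
    """One synchronous update of every node from its neighbors and edge features."""
    updated = {}
    for node, own in features.items():
        messages = []
        for (a, b), edge_feat in edges.items():
            if node == a:
                neighbor = b
            elif node == b:
                neighbor = a
            else:
                continue
            # Message (assumed form): neighbor feature shifted by the edge feature.
            messages.append([n + e for n, e in zip(features[neighbor], edge_feat)])
        if not messages:
            updated[node] = list(own)  # isolated node keeps its feature
            continue
        mean_msg = [sum(col) / len(messages) for col in zip(*messages)]
        updated[node] = [0.5 * (o + m) for o, m in zip(own, mean_msg)]
    return updated


# Hypothetical 2-dimensional node and edge features for one arm.
features = {"shoulder": [1.0, 0.0], "elbow": [0.0, 1.0], "wrist": [2.0, 2.0]}
edges = {("shoulder", "elbow"): [0.0, 0.0], ("elbow", "wrist"): [1.0, 1.0]}
new_features = update_part_nodes(features, edges)
```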

In combination with any embodiment provided by the present invention, determining the interaction relationship between the human body and the object based on the spatial position relationship graph of the human body and the object includes: obtaining feature information corresponding to the human body based on the feature information of the body part nodes; obtaining feature information corresponding to the object based on the feature information of the object node; and determining the interaction relationship between the human body and the object based on the feature information corresponding to the human body and the feature information corresponding to the object.

In combination with any embodiment provided by the present invention, obtaining the feature information corresponding to the human body based on the feature information of the body part nodes includes: for the same human body, performing a global pooling operation on the feature information of the body part nodes to obtain the feature information corresponding to the human body.
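As a minimal sketch, global pooling collapses the feature vectors of all body part nodes of one person into a single vector of the same dimension. Whether mean or max pooling is used is not stated in the specification; both are shown here as assumptions, and the sample features are hypothetical.

```python
from typing import List


def global_pool(node_features: List[List[float]], mode: str = "mean") -> List[float]:
    """Pool per-node feature vectors into one vector, dimension by dimension."""
    columns = list(zip(*node_features))
    if mode == "mean":
        return [sum(col) / len(col) for col in columns]
    if mode == "max":
        return [max(col) for col in columns]
    raise ValueError(f"unsupported pooling mode: {mode}")


# Hypothetical features of three body part nodes belonging to the same person.
part_features = [[1.0, 4.0], [3.0, 0.0], [2.0, 2.0]]
human_mean = global_pool(part_features)
human_max = global_pool(part_features, mode="max")
```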

In combination with any embodiment provided by the present invention, determining the interaction relationship between the human body and the object based on the spatial position relationship graph of the human body and the object includes: determining, based on the spatial position relationship graph, the interaction category to which the interaction relationship between the human body and the object belongs; and the method further includes: in response to the safety factor of the interaction category to which the interaction relationship between the human body and the object belongs being lower than a first set threshold, determining that the human body is in a target scene.

In combination with any embodiment provided by the present invention, determining the interaction relationship between the human body and the object based on the spatial position relationship graph of the human body and the object includes: determining, based on the spatial position relationship graph, the interaction categories to which the interaction relationships between the human body and objects of different categories belong; and the method further includes: determining the safety factor of the combination of the interaction categories to which the interaction relationships between the human body and the objects of different categories belong; and, in response to the safety factor of the combination being lower than a second set threshold, determining that the human body is in a target scene.
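The two threshold checks can be sketched as follows. The interaction categories, their safety factors, the thresholds, and the rule that a combination's safety factor is the product of the individual factors are all hypothetical; the specification does not say how safety factors are assigned or combined.

```python
import math
from typing import Dict, List

# Hypothetical safety factors per interaction category (lower means less safe).
SAFETY_FACTOR: Dict[str, float] = {
    "smoke_cigarette": 0.5,
    "hold_fuel_nozzle": 0.5,
    "drink_from_cup": 1.0,
}


def in_target_scene_single(category: str, first_threshold: float) -> bool:
    """Single interaction: flag when its safety factor is below the first threshold."""
    return SAFETY_FACTOR[category] < first_threshold


def combined_safety_factor(categories: List[str]) -> float:
    """Assumed combination rule: product of the individual safety factors."""
    return math.prod(SAFETY_FACTOR[c] for c in categories)


def in_target_scene_combined(categories: List[str], second_threshold: float) -> bool:
    """Combination of interactions: flag when the combined factor is below the second threshold."""
    return combined_safety_factor(categories) < second_threshold


single_flag = in_target_scene_single("smoke_cigarette", first_threshold=0.4)
combo_flag = in_target_scene_combined(
    ["smoke_cigarette", "hold_fuel_nozzle"], second_threshold=0.4
)
```

In this example neither interaction is flagged on its own, but the combination of the two falls below the threshold, which is the case the second check is meant to catch.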

According to one aspect of the present invention, an interaction relationship recognition apparatus is provided, comprising: an acquisition unit configured to detect an image to be processed to obtain a human body detection result and an object detection result; a first determining unit configured to determine, based on the human body detection result, each body part region corresponding to a human body in the image to be processed, and to determine, based on the object detection result, an object region corresponding to an object in the image to be processed, wherein the object region is the region corresponding to an object detection box in the object detection result; a second determining unit configured to determine a spatial position relationship graph of the human body and the object based on body part nodes corresponding to the body part regions and an object node corresponding to the object region, wherein the spatial position relationship graph includes feature information of each body part node, feature information of the object node, and positional relationship information between each body part node and the object node; and a recognition unit configured to determine an interaction relationship between the human body and the object based on the spatial position relationship graph of the human body and the object.

In combination with any embodiment provided by the present invention, the first determining unit is specifically configured to: obtain feature information contained in a human body detection box in the human body detection result; obtain human body key points of the human body based on the feature information; connect the human body key points based on human skeleton information to obtain connection information; and determine each body part region based on the human body key points and the connection information, which includes at least one of: determining one body part region based on a plurality of human body key points that are connected to one another, or determining one body part region centered on one of the plurality of human body key points.

In combination with any embodiment provided by the present invention, the second determining unit is specifically configured to: perform dimension reduction on the feature information of each body part region to obtain the feature information of the corresponding body part node; perform dimension reduction on the feature information of the object region to obtain the feature information of the object node; for the same human body, connect the body part nodes based on human skeleton information; and connect the object node to the body part nodes to obtain the spatial position relationship graph of the human body and the object, which includes, for each object node, connecting to the object node each of a set number of body part nodes that are closest to the object node, wherein the feature information of an edge formed by connecting one object node and one body part node includes positional relationship information between the object node and the body part node connected by that edge.

In combination with any embodiment provided by the present invention, the apparatus further comprises an update unit configured to, for each body part node, update the feature information of the body part node using the feature information of one or more adjacent body part nodes and the feature information of the edges connecting the body part node to those adjacent body part nodes.

In combination with any embodiment provided by the present invention, the recognition unit is specifically configured to: obtain feature information corresponding to the human body based on the feature information of the body part nodes, which includes, for the same human body, performing a global pooling operation on the feature information of the body part nodes to obtain the feature information corresponding to the human body; obtain feature information corresponding to the object based on the feature information of the object node; and determine the interaction relationship between the human body and the object based on the feature information corresponding to the human body and the feature information corresponding to the object.

In combination with any embodiment provided by the present invention, the recognition unit is specifically configured to determine, based on the spatial position relationship graph of the human body and the object, the interaction category to which the interaction relationship between the human body and the object belongs; and the apparatus further comprises a third determining unit configured to determine that the human body is in a target scene in response to the safety factor of the interaction category to which the interaction relationship between the human body and the object belongs being lower than a first set threshold.

In combination with any embodiment provided by the present invention, the recognition unit is specifically configured to determine, based on the spatial position relationship graph of the human body and the object, the interaction categories to which the interaction relationships between the human body and objects of different categories belong; and the apparatus further comprises a fourth determining unit configured to determine the safety factor of the combination of the interaction categories to which the interaction relationships between the human body and the objects of different categories belong, and to determine that the human body is in a target scene in response to the safety factor of the combination being lower than a second set threshold.

According to one aspect of the present invention, an electronic device is provided, comprising a memory and a processor, wherein the memory stores computer instructions executable on the processor, and the processor, when executing the computer instructions, implements the interaction relationship recognition method according to any embodiment of the present invention.

According to one aspect of the present invention, a computer-readable storage medium storing a computer program is provided, wherein the program, when executed by a processor, implements the interaction relationship recognition method according to any embodiment of the present invention.

According to one aspect of the present invention, a computer program is provided which, when executed by a processor, implements the interaction relationship recognition method according to any embodiment of the present invention.

The interaction relationship recognition method, apparatus, device and storage medium according to one or more embodiments of the present invention determine, based on the human body detection result and the object detection result of an image to be processed, each body part region corresponding to a human body and the object region corresponding to an object in the image, convert these regions into corresponding nodes, and construct a spatial position relationship graph of the person and the object from the nodes. The spatial position relationship graph contains not only features corresponding to different human postures but also the positional relationship between each body part and the object. Using this spatial position relationship graph to obtain the feature information corresponding to the human body and the feature information corresponding to the object, the interaction relationship between the human body and the object is determined, which improves the accuracy and reliability of interaction relationship recognition.

It should be understood that the foregoing general description and the following detailed description are merely exemplary and explanatory and do not limit the present invention.

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present invention and, together with the specification, serve to explain the principles of the invention.
FIG. 1 is a flowchart of an interaction relationship recognition method according to at least one embodiment of the present invention;
FIG. 2 is a flowchart of a target detection method according to at least one embodiment of the present invention;
FIG. 3A shows a human body detection result obtained by a target detection method according to at least one embodiment of the present invention;
FIG. 3B shows human body key points determined based on the human body detection result in FIG. 3A;
FIG. 4 is a schematic structural diagram of an interaction relationship recognition apparatus according to at least one embodiment of the present invention;
FIG. 5 is a structural diagram of an electronic device according to at least one embodiment of the present invention.

Exemplary embodiments, examples of which are shown in the drawings, are described in detail below. Where the following description refers to the drawings, the same numbers in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatuses and methods consistent with some aspects of the present application as detailed in the appended claims.

In the present invention, the term "and/or" merely describes an association between associated objects and indicates that three relationships are possible; for example, "A and/or B" can mean that A exists alone, that A and B both exist, or that B exists alone. In addition, the term "at least one" as used in the present invention denotes any one of several items or any combination of at least two of them; for example, "including at least one of A, B and C" can mean including any one or more elements selected from the set consisting of A, B and C.

At least one embodiment of the present invention provides an interaction relationship recognition method, which may be performed by an electronic device such as a terminal device or a server. The terminal device may be user equipment (UE), a mobile device, a user terminal, a terminal, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, an in-vehicle device, a wearable device, or the like.

FIG. 1 is a flowchart of an interaction relationship recognition method according to at least one embodiment of the present invention. As shown in FIG. 1, the method includes steps 101 to 104.

In step 101, an image to be processed is detected to obtain a human body detection result and an object detection result.

In embodiments of the present invention, the image to be processed is an image acquired by an image acquisition device (for example, a camera); it may be a frame of a video stream or an image acquired in real time. The image to be processed may be a color (RGB) image or an infrared/near-infrared image, and the present invention places no limitation on this.

A deep learning network may be used to detect the image to be processed and obtain the human body detection result and the object detection result. When a human body or an object is detected, the detection result may include a detection box, the position of the detection box, the category of the detection box, and so on. The specific method of detecting the image to be processed with a deep learning network is described in detail later.
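For illustration, a detection result holding a box position and a category can be represented and split into human body and object results as in the sketch below. The `Detection` class, its field names, the `"person"` label, and the sample boxes are assumptions and not part of the specification; the detector itself is not shown.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    box: Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels
    category: str                           # category of the detection box


def split_detections(
    detections: List[Detection], human_label: str = "person"
) -> Tuple[List[Detection], List[Detection]]:
    """Separate detector output into human body results and object results."""
    humans = [d for d in detections if d.category == human_label]
    objects = [d for d in detections if d.category != human_label]
    return humans, objects


# Hypothetical detector output for one image.
detections = [
    Detection(box=(50.0, 20.0, 200.0, 420.0), category="person"),
    Detection(box=(100.0, 40.0, 120.0, 70.0), category="cell_phone"),
]
humans, objects = split_detections(detections)
```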

In step 102, each body part region corresponding to the human body in the image to be processed is determined based on the human body detection result.

Which body part carries out each kind of interaction between a person and an object is usually known in advance; for example, making a phone call usually involves interaction between the phone and the person's hand and head. Therefore, once the human body detection result has been obtained, each body part region corresponding to the human body can be further determined based on the region corresponding to the human body detection box in the human body detection result, and the interaction relationships that may arise between body parts and objects can then be judged. The body parts of the human body that may interact with objects can be defined according to actual needs, and the present invention places no limitation on this.

In step 103, a spatial position relationship graph of the human body and the object is determined based on the body part nodes corresponding to the body part regions and the object node corresponding to the object region.

Here, the object region is the region corresponding to an object detection box in the object detection result, and the spatial position relationship graph includes feature information of each body part node, feature information of the object node, and positional relationship information between each body part node and the object node.

In some embodiments, the body part node corresponding to a body part region and the object node corresponding to an object region may be obtained by a pre-trained neural network. For example, the feature information of the body part region is input to the neural network and reduced in dimension to obtain the feature information of the corresponding body part node, thereby converting the body part region into a body part node. Likewise, the feature information of the object region may be input to the neural network and reduced in dimension to obtain the feature information of the corresponding object node, thereby converting the object region into an object node. To distinguish it from the neural network used later, the neural network that produces the body part nodes and object nodes is referred to here as the first neural network. How the feature information of the body part region and the object region is obtained is described in detail later.

For the body part nodes of the one or more human bodies and the one or more object nodes contained in the image to be processed, the body part nodes of each human body are first connected to obtain a human body node graph. Each object node is then connected to the body part nodes of each human body to obtain the spatial positional relationship graph of the human body and the object. The embodiments of the present invention do not limit the specific manner of connecting object nodes and body part nodes.

The spatial positional relationship graph includes not only the connection information between the body part nodes and object nodes, but also the feature information of each body part node and object node and the feature information of each "edge" formed by connecting an object node to a body part node. The feature information of an edge may be obtained based on the relative position of the connected object node and body part node.

In the embodiments of the present invention, some or all of the body part nodes of a human body are connected to object nodes through "edges". When the feature information of the human body is determined from the spatial positional relationship graph, the influence of the edges, that is, the influence of the object node on the body part nodes, is therefore also taken into account, so that the feature information of the human body contains spatial position information related to the object. This benefits the subsequent recognition of the interaction relationship between the human body and the object.

In step 104, the interaction relationship between the human body and the object is determined based on the spatial positional relationship graph of the human body and the object.

In some embodiments, a pre-trained neural network may be used to determine the interaction relationship between the human body and the object. To distinguish it from the neural network that produces the body part nodes and object nodes, this neural network is referred to as the second neural network. The second neural network classifies the interaction relationship between the human body and the object. The feature information of the human body and the feature information of the object are treated as a feature information pair and input to the second neural network, which predicts an interaction relationship classification result, thereby determining the interaction relationship between the object and the human body of that pair.

In the embodiments of the present invention, each body part region of the human body and the object region of the object are determined in the image to be processed based on its human body detection result and object detection result, and are converted into corresponding nodes. A spatial positional relationship graph of the person and the object is constructed from these nodes. The graph contains not only features corresponding to different human poses but also the positional relationship between each body part and the object. The spatial positions of the nodes are used to obtain the feature information of the human body and of the object, from which the interaction relationship between the human body and the object is determined, improving the accuracy and reliability of interaction relationship recognition.

FIG. 2 shows a flowchart of a target detection method provided by at least one embodiment of the present invention. The image to be processed can be detected with this target detection method to obtain the human body detection result and the object detection result.

As shown in FIG. 2, the image to be processed may be detected using a pre-trained target detection network 200. The target detection network 200 includes a feature extraction module 210, a candidate box extraction module 220, a pooling module 230, a classification module 240, and a coordinate fitting module 250.

The image to be processed is first input to the feature extraction module 210. The feature extraction module 210 may be a convolutional neural network module, which may include a plurality of convolutional layers, and extracts the visual features of the image to be processed, that is, its feature maps.

The candidate box extraction module 220 predicts, based on the feature maps output by the feature extraction module 210, a series of regions in which a target object may appear, and uses them as candidate boxes. Each candidate box is represented by the coordinates of its vertices, namely the abscissa and the ordinate of a vertex of the i-th candidate box (the symbols themselves are not reproduced in this text).

For the candidate boxes predicted by the candidate box extraction module 220, the pooling module 230 maps each candidate box region of the original image onto the feature maps through a pooling layer and generates fixed-size features through a pooling operation. These features enter the classification module 240 and the coordinate fitting module 250 at the same time. The coordinate fitting module 250 regresses the coordinates of the candidate boxes extracted by the candidate box extraction module 220 to obtain more accurate target candidate boxes. The classification module 240 reclassifies the candidate boxes into the person category or a specific object category, thereby obtaining the human body detection boxes and object detection boxes for the image to be processed. As shown in FIG. 2, the image output by the target detection network 200 contains a human body detection box 261 and object detection boxes 262 and 263.

Performing detection on the image to be processed yields the spatial position information and visual features of the person and the object in the image, which can be used in subsequent steps to predict the interaction relationship between the person and the object.

In some embodiments, the body part regions are determined by the following method.

First, the feature information contained in the human body detection box of the human body detection result is obtained. For example, using the human body detection box and ROI Align (Region of Interest Align), the feature information contained in the human body detection box, that is, the feature information of the human body, can be obtained from the feature maps of the image to be processed.

Next, the human body key points are obtained based on this feature information. For example, the feature information contained in the human body detection box may be input to a pose estimation network. The pose estimation network consists of a series of convolutional layers and non-linear layers and outputs channel features with one channel per pose category. Each channel corresponds to one confidence heat map, and the highest-scoring point in each heat map is the position of the human body key point of that category.
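
The last step, reading one key point out of each confidence heat map, can be illustrated with a minimal sketch. It assumes each heat map is a plain list of rows of scores; the function name `keypoints_from_heatmaps` and the toy values are hypothetical and are not taken from the patent.

```python
from typing import List, Tuple

Heatmap = List[List[float]]  # rows of confidence scores (assumed layout)


def keypoints_from_heatmaps(heatmaps: List[Heatmap]) -> List[Tuple[int, int, float]]:
    """Return (x, y, score) of the highest-scoring cell of each heat map."""
    keypoints = []
    for heatmap in heatmaps:
        best = (0, 0, float("-inf"))
        for y, row in enumerate(heatmap):
            for x, score in enumerate(row):
                if score > best[2]:
                    best = (x, y, score)
        keypoints.append(best)
    return keypoints


# Two toy 3x3 heat maps, e.g. one channel for "knee" and one for "ankle".
toy_heatmaps = [
    [[0.1, 0.2, 0.1], [0.0, 0.9, 0.3], [0.1, 0.1, 0.1]],
    [[0.0, 0.0, 0.0], [0.0, 0.1, 0.0], [0.7, 0.2, 0.1]],
]
toy_keypoints = keypoints_from_heatmaps(toy_heatmaps)
print(toy_keypoints)
```

The coordinates returned here are in heat map cells; mapping them back to image pixels depends on the resolution of the pose estimation network, which the text does not specify.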

After the human body key points are obtained, they can be connected according to human skeleton information to obtain connection information. For preset or previously obtained human skeleton information, the manner of connection between the key points is fixed, that is, for any human body key point, the key points connected to it can be determined. The connection information of a human body key point includes the key points connected to it and the position information of those connected key points.

FIG. 3A shows the human body detection result obtained by detecting the image to be processed, which includes the human body detection box 300 and its position. The subsequent body part region determination step may be performed on the image portion contained in the human body detection box, or the image contained in the box may be cropped and the step performed on the cropped image.

Based on the feature information contained in the human body detection box shown in FIG. 3A, the human body key points of the detected human body can be determined, as shown in FIG. 3B.

After the human body key points and their connection information are obtained, the body part regions can be determined based on the key points and the connection information.

In one example, a body part region may be determined based on a plurality of (for example, two) human body key points that are connected to each other.

Taking the connected human body key points 311 and 312 in FIG. 3B as an example, based on the categories of key point 311 and key point 312 (for example, a knee key point and an ankle key point, respectively) and their positions, the rectangular region formed by these two key points can be determined to be the lower-leg region, as shown by box 321. The other body part regions are determined in a similar way.

In one example, a body part region may be determined with a human body key point as its center. For example, with the knee key point as the center, the specific position of the knee region can be determined according to a preset size of the knee region. The other body part regions are determined in a similar way.

In one example, some of the body part regions may be determined from a plurality of connected human body key points, and the others may be determined with one of these key points as the center. The specific way of determining each body part region can be chosen according to the actual situation, and the embodiments of the present invention do not limit this.
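
The two ways of deriving a region described above can be sketched as follows. This is only an illustration under assumptions: boxes are axis-aligned `(x_min, y_min, x_max, y_max)` tuples, and the `margin` padding, the function names, and the coordinates are hypothetical.

```python
from typing import Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def region_from_pair(a: Point, b: Point, margin: float = 0.0) -> Box:
    """Rectangle spanned by two connected key points, padded by `margin`.

    The padding is an assumption of this sketch, added so that a limb is
    not reduced to a degenerate, very thin box.
    """
    x_min, x_max = min(a[0], b[0]), max(a[0], b[0])
    y_min, y_max = min(a[1], b[1]), max(a[1], b[1])
    return (x_min - margin, y_min - margin, x_max + margin, y_max + margin)


def region_around(center: Point, width: float, height: float) -> Box:
    """Rectangle of a preset size centered on a single key point."""
    cx, cy = center
    return (cx - width / 2, cy - height / 2, cx + width / 2, cy + height / 2)


knee, ankle = (120.0, 300.0), (130.0, 380.0)
lower_leg_region = region_from_pair(knee, ankle, margin=10.0)
knee_region = region_around(knee, width=40.0, height=40.0)
print(lower_leg_region, knee_region)
```

Both helpers can be mixed for one person, which matches the last example: limbs from key point pairs, joints from a single centered key point.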

Each body part region of a human body detected in the image to be processed can be converted into a corresponding body part node, and the region corresponding to an object detection box in the object detection result, that is, the object region, can likewise be converted into a corresponding object node. The interaction relationship between the human body and the object in the image to be processed is determined based on the body part nodes and the object nodes.

In some embodiments, the body part regions and the object regions may be converted into body part nodes and object nodes, respectively, in the following manner.

First, the feature information of the body part regions and the object regions is obtained. For example, based on the body part regions and the object regions, ROI Align can be used to obtain the feature information of the body part regions and of the object regions from the feature maps of the image to be processed.

Next, dimension reduction is performed on the feature information of the body part regions and of the object regions to obtain the feature information of the body part nodes corresponding to the body part regions and of the object nodes corresponding to the object regions. For example, the dimension reduction may be performed by a pre-trained neural network. To distinguish it from the neural network used later, the neural network that performs this dimension reduction may be referred to as the first neural network.

After the feature information of the body part nodes and the object nodes is obtained, a graph can be constructed based on the categories of the nodes and their spatial positional relationships, that is, the spatial positional relationship graph of the human body and the object can be constructed.

In some embodiments, the spatial positional relationship graph of the human body and the object may be constructed by the following method.

First, for the same human body, the body part nodes are connected according to human skeleton information.

Similarly to the connection of human body key points, the body part nodes are connected according to preset or previously obtained human skeleton information. For any body part node, the body part nodes connected to it can be determined.

Next, edge connection is performed between the object nodes and the body part nodes to obtain the spatial positional relationship graph of the human body and the object. The edge connection is based on the spatial distance between an object node and the body part nodes. For example, for each object node, a set number of the nearest body part nodes may be selected for edge connection, such as the five nearest body part nodes, to form the edges of the spatial positional relationship graph.

To fully exploit the spatial structure information, each edge in the spatial positional relationship graph may also be given a feature. For example, the feature information of an edge may be determined based on the relative position of the connected object node and body part node, that is, an encoding of the relative position coordinates of the two connected nodes may be used as the feature information of the edge. The resulting spatial positional relationship graph includes the feature information of each body part node and object node as well as the feature information of each edge.
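
The edge connection and edge features described above can be sketched as follows. The sketch assumes that every node is positioned at the center of its region and that the raw offset `(dx, dy)` stands in for the "encoding of the relative position coordinates"; the patent does not specify either choice, and the name `build_edges` is hypothetical.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Edge = Tuple[str, str, Tuple[float, float]]  # (object_id, part_id, (dx, dy))


def build_edges(object_centers: Dict[str, Point],
                part_centers: Dict[str, Point],
                k: int = 5) -> List[Edge]:
    """Connect each object node to its k nearest body part nodes.

    The edge feature is the part center relative to the object center.
    """
    edges: List[Edge] = []
    for obj_id, (ox, oy) in object_centers.items():
        nearest = sorted(
            part_centers.items(),
            key=lambda item: math.hypot(item[1][0] - ox, item[1][1] - oy),
        )[:k]
        for part_id, (px, py) in nearest:
            edges.append((obj_id, part_id, (px - ox, py - oy)))
    return edges


toy_parts = {"head": (50.0, 20.0), "hand": (80.0, 60.0), "foot": (55.0, 180.0)}
toy_objects = {"phone": (78.0, 50.0)}
toy_edges = build_edges(toy_objects, toy_parts, k=2)
print(toy_edges)
```

With `k=2` the phone is linked to the hand and the head but not to the foot, which is the kind of locality the nearest-node rule is meant to capture.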

In the embodiments of the present invention, performing edge connection between the body part nodes and the object nodes and assigning features to the edges explicitly models the spatial position information between the human body and the object, which improves the ability to represent spatial information and human body structure information.

From the constructed spatial positional relationship graph of the human body and the object, the feature information of the human body and the feature information of the object can be obtained in the following manner.

For each body part node, its feature information may be updated using the feature information of the adjacent body part nodes and the feature information of the edges connecting them.

In one example, an edge-aware graph convolutional neural network may be used to update the feature information of each body part node, that is, to update the state of the spatial positional relationship graph. The graph convolutional neural network includes a plurality of graph convolution layers and non-linear operations. For the l-th graph convolution layer, formula (1) expresses the update of the feature information of each body part node.

[Formula (1) and the symbols of its terms are not reproduced in this text.]

The terms of formula (1) are, in the order in which the source defines them: the output of layer l + 1 for the body part node being updated; the output of layer l for that node; a body part node adjacent to that node; N(i), the set of labels of the body part nodes adjacent to node i; the feature of the edge connecting the node and its neighbor; W, a function that performs a fully connected operation on a term whose symbol is not reproduced; a matrix that adjusts the feature dimension; and an activation function, for example sigmoid or ReLU.

After passing through the plurality of graph convolution layers, each body part node has a certain global view and an improved ability to represent spatial structure.
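
Since the formula itself is not available in this text, the following is only one plausible form of an edge-aware graph convolution layer, built from the terms the description names (neighbor features, edge features, a fully connected function of the edge, a dimension-adjusting matrix, and an activation). The scalar edge gate, the self term, and all names (`edge_aware_layer`, `edge_w`, `theta`) are assumptions of this sketch and should not be read as the patent's formula (1).

```python
from typing import Dict, List, Tuple

Vector = List[float]
Matrix = List[Vector]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply matrix m by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def relu(v: Vector) -> Vector:
    return [max(0.0, x) for x in v]


def edge_aware_layer(h: Dict[int, Vector],
                     neighbors: Dict[int, List[int]],
                     edge_feat: Dict[Tuple[int, int], Vector],
                     edge_w: Vector,
                     theta: Matrix) -> Dict[int, Vector]:
    """One layer: h_i' = relu(theta h_i + sum_j gate(e_ij) * theta h_j)."""
    out: Dict[int, Vector] = {}
    for i, h_i in h.items():
        acc = matvec(theta, h_i)  # self term (assumed, not from the source)
        for j in neighbors.get(i, []):
            # Fully connected map of the edge feature to a scalar gate (assumed).
            gate = sum(w * e for w, e in zip(edge_w, edge_feat[(i, j)]))
            msg = matvec(theta, h[j])
            acc = [a + gate * m for a, m in zip(acc, msg)]
        out[i] = relu(acc)
    return out


toy_h = {0: [1.0, 0.0], 1: [0.0, 2.0]}
toy_neighbors = {0: [1], 1: [0]}
toy_edge_feat = {(0, 1): [1.0, 0.0], (1, 0): [-1.0, 0.0]}
identity = [[1.0, 0.0], [0.0, 1.0]]
toy_out = edge_aware_layer(toy_h, toy_neighbors, toy_edge_feat, [0.5, 0.5], identity)
print(toy_out)
```

Stacking several such layers lets information from nodes that are not direct neighbors reach each body part node, which is one way to read the "global view" mentioned above.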

After the feature information of each body part node is obtained, the feature information of the human body can be obtained from it.

In one example, for the same human body, a global pooling operation is performed on the feature information of its body part nodes to obtain the feature information of that human body. Through the global pooling operation, corresponding feature information can be obtained for every human body detected in the image to be processed.
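
A minimal sketch of the global pooling step follows. The text does not say whether mean or max pooling is used, so both are offered as assumptions; `global_pool` and the toy features are hypothetical.

```python
from typing import List


def global_pool(node_features: List[List[float]], mode: str = "mean") -> List[float]:
    """Pool the feature vectors of one person's body part nodes into one vector."""
    if not node_features:
        raise ValueError("at least one body part node is required")
    columns = list(zip(*node_features))  # one tuple per feature dimension
    if mode == "max":
        return [max(col) for col in columns]
    if mode == "mean":
        return [sum(col) / len(col) for col in columns]
    raise ValueError(f"unknown pooling mode: {mode}")


toy_part_features = [[1.0, 4.0], [3.0, 0.0], [2.0, 2.0]]
human_mean = global_pool(toy_part_features, mode="mean")
human_max = global_pool(toy_part_features, mode="max")
print(human_mean, human_max)
```

Either variant yields a vector whose length does not depend on how many body part nodes the person has, so every detected person ends up with a feature of the same size.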

For an object detected in the image to be processed, the feature information of the object can be obtained from the feature information of its object node. Usually one object corresponds to one object node, so the feature information of each object can be obtained from the feature information of the corresponding object node.

Once the feature information of each object and each human body detected in the image to be processed has been obtained, the interaction relationship between the human body and the object can be determined.

In some embodiments, a pre-trained neural network may be used to determine the interaction relationship between the human body and the object, where the neural network is used to classify that interaction relationship. To distinguish it from the neural network described above, the neural network that determines the interaction relationship may be referred to as the second neural network.

The second neural network is trained with image samples labeled with the interactions between human bodies and objects, so that it can classify the interaction relationship between a human body and an object in the image to be processed and thereby determine that relationship. For example, ten human-object interaction relationships that are common in real-life scenes and of practical significance, such as smoking, drinking water, drinking alcohol, riding a bicycle, and making a phone call, can be compiled, and a database covering these ten everyday scenes can be collected. The second neural network is then trained on this database so that it can classify these ten interaction relationships quickly and accurately.
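
How a feature information pair might be classified can be sketched with a single linear layer followed by softmax. The patent does not describe the architecture of the second neural network, so the layer, the weights, the labels, and the name `classify_pair` are all hypothetical.

```python
import math
from typing import List, Tuple


def classify_pair(human: List[float], obj: List[float],
                  weights: List[List[float]],
                  labels: List[str]) -> Tuple[str, float]:
    """Concatenate the pair, apply one linear layer and softmax, return the top label."""
    pair = human + obj
    logits = [sum(w * x for w, x in zip(row, pair)) for row in weights]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(labels)), key=probs.__getitem__)
    return labels[best], probs[best]


toy_labels = ["make a phone call", "ride a bicycle"]
toy_weights = [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]  # illustrative, not trained
toy_label, toy_prob = classify_pair([2.0, 0.5], [1.0], toy_weights, toy_labels)
print(toy_label, round(toy_prob, 3))
```

In practice the weights would come from training on the labeled database described above, and the network would be run once per human-object pair in the image.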

In some embodiments, a safety factor may be set for each type of interaction relationship. For example, for the ten interaction relationships above, a corresponding safety factor may be set according to the degree of safety. To distinguish it from the safety factor defined later, this safety factor may be referred to as the first safety factor. For example, the safety factor of "contact with fire" may be set to 0.2, the safety factor of "drinking water" to 0.6, and so on.

In response to the first safety factor of at least one human-object pair in the image to be processed being lower than a first set threshold, the interaction relationship of that pair is determined to be a target interaction relationship. That is, if an interaction relationship whose safety factor is lower than the first set threshold is detected in the image to be processed, it can be determined that the human body is in a target scene. For example, with a first set threshold of 0.3 and the factors above, if the interaction relationship between the human body and the object is determined to be "contact with fire", it can be determined that the human body is in a dangerous scene.

The above method can be used to detect whether a high-risk interaction relationship exists in the image to be processed. For example, for a monitoring image, when an interaction relationship whose safety factor is lower than the first set threshold is detected, the person in the image is determined to be in a dangerous scene and an alarm is triggered.
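
The threshold check on the first safety factor can be sketched as a table lookup. The factor values and the 0.3 threshold come from the examples in the text; treating an interaction without a configured factor as safe (factor 1.0) is an assumption of this sketch, as are the identifiers.

```python
from typing import Dict, List, Tuple

Detection = Tuple[str, str, str]  # (human_id, object_id, interaction)

FIRST_SAFETY_FACTORS: Dict[str, float] = {
    "contact with fire": 0.2,
    "drinking water": 0.6,
}


def target_interactions(detected: List[Detection],
                        factors: Dict[str, float],
                        threshold: float = 0.3) -> List[Detection]:
    """Return the human-object pairs whose first safety factor is below the threshold."""
    # Interactions with no configured factor default to 1.0 (assumed safe).
    return [d for d in detected if factors.get(d[2], 1.0) < threshold]


toy_detected = [
    ("person_1", "stove", "contact with fire"),
    ("person_2", "cup", "drinking water"),
]
toy_alerts = target_interactions(toy_detected, FIRST_SAFETY_FACTORS)
print(toy_alerts)
```

A monitoring system could trigger its alarm whenever the returned list is not empty.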

상이한 타입의 인터랙션 관계 사이의 조합에 대해서도 안전 계수를 설정할 수 있는바, 여기서 이를 제2 안전 계수라 칭할 수 있다. 예를 들면, "물을 마심"과 "자전거를 탐"의 조합에 대하여, 제2 안전 계수를 0.2로 설치할 수 있고, "전화를 함"과 "흡연"의 조합에 대하여, 제2 안전 계수를 0.6으로 설치하는 것 등이다. 본 기술분야의 통상의 기술자들은, 여기의 조합은 두 개의 인터랙션 관계 사이의 조합일 수도 있고 세 개, 심지어 더 많은 인터랙션 관계 사이의 조합일 수도 있으며 본 발명은 이에 대해 한정하지 않는다.A safety factor may also be set for a combination between different types of interaction relationships, which may be referred to as a second safety factor herein. For example, for the combination of "drinking water" and "riding a bicycle", the second safety factor may be set to 0.2, and for the combination of "making a phone call" and "smoking", the second safety factor may be set Installing with 0.6, etc. Those skilled in the art will recognize that the combination herein may be a combination between two interaction relationships or a combination between three or even more interaction relationships, but the present invention is not limited thereto.

동일한 인체에 대하여, 상기 인체와 각 물체 사이의 인터랙션 관계 및 상응한 제2 안전 계수를 획득한다. 즉, 인체와 모든 물체가 발생하는 인터랙션 관계를 결정하고, 이에 대응하는 제2 안전 계수를 결정한다.For the same human body, an interaction relationship between the human body and each object and a corresponding second safety factor are obtained. That is, an interaction relationship between the human body and all objects is determined, and a corresponding second safety factor is determined.

상기 제2 안전 계수가 제2 설정 임계값보다 낮은 것에 응답하여, 상기 인체가 타깃 장면에 처해 있는 것을 결정한다. 즉, 만약 처리 대기 이미지에서 하나의 인체와 복수의 물체의 인터랙션 관계 조합에 대응하는 제2 안전 계수가 제2 설정 임계값보다 낮은 것을 검출하면, 상기 인체가 타깃 장면에 처해 있는 것을 결정할 수 있다. 예를 들면, 제2 설정 임계값이 0.5인 경우, 상기 예에 대하여 동시에 물을 마시고 자전거를 타는 인체를 타깃 장면에 처해 있는 것으로 결정할 수 있다.In response to the second safety factor being lower than a second set threshold, it is determined that the human body is in the target scene. That is, if it is detected that the second safety factor corresponding to the interaction relation combination of one human body and a plurality of objects in the processing standby image is lower than the second set threshold, it may be determined that the human body is in the target scene. For example, when the second threshold value is 0.5, it may be determined that a human body drinking water and riding a bicycle at the same time is in the target scene with respect to the above example.

The above method may be used to detect whether a potentially dangerous interaction relationship exists in the image to be processed. For example, if a subject in the image to be processed is detected to be driving and making a phone call at the same time, that is, if the detected second safety factor is lower than the second set threshold, the subject is determined to be in a dangerous scene and an alert may be triggered.

Some actions that have a high safety factor when performed alone are in fact very dangerous when performed simultaneously. Embodiments of the present invention can recognize such dangerous scenes and issue an alert in time, thereby improving safety.

FIG. 4 shows an interaction relationship recognition apparatus provided by at least one embodiment of the present invention. As shown in FIG. 4, the apparatus may include: an acquisition unit 401 configured to detect an image to be processed to obtain a human body detection result and an object detection result; a first determining unit 402 configured to determine, based on the human body detection result, each body part region corresponding to a human body in the image to be processed, and to determine, based on the object detection result, an object region corresponding to an object in the image to be processed, wherein the object region is the region corresponding to an object detection box in the object detection result; a second determining unit 403 configured to determine a spatial positional relationship diagram of the human body and the object based on human body part nodes corresponding to the body part regions and an object node corresponding to the object region, wherein the spatial positional relationship diagram includes feature information of each of the human body part nodes, feature information of the object node, and positional relationship information between each of the human body part nodes and the object node; and a recognition unit 404 configured to determine the interaction relationship between the human body and the object based on the spatial positional relationship diagram of the human body and the object.

In some embodiments, the first determining unit 402 is specifically configured to: obtain feature information contained in a human body detection box in the human body detection result; obtain human body key points of the human body based on the feature information; connect the human body key points based on human skeleton information to obtain connection information; and determine each body part region based on the human body key points and the connection information, wherein determining each body part region includes at least one of: determining one body part region based on a plurality of human body key points that are connected to each other; or determining one body part region centered on one of the plurality of human body key points.
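The two ways of deriving a body part region from key points can be sketched as follows. The text does not say how the region's extent is computed, so the axis-aligned bounding box, the `margin` parameter, and the square `half_size` window are assumptions chosen for illustration, as are the function names.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def region_from_connected_keypoints(points: List[Point], margin: float = 0.0) -> Box:
    """Region spanning several connected key points (e.g. elbow and wrist).

    Assumption: the region is the bounding box of the points, grown by `margin`.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs) - margin, min(ys) - margin, max(xs) + margin, max(ys) + margin)


def region_centered_on_keypoint(point: Point, half_size: float) -> Box:
    """Region centered on a single key point.

    Assumption: the region is a square window of side 2 * half_size.
    """
    x, y = point
    return (x - half_size, y - half_size, x + half_size, y + half_size)
```

For instance, an elbow at (10, 20) connected to a wrist at (30, 50) with a margin of 5 gives the region (5, 15, 35, 55), whereas a window of half-size 4 around a key point at (10, 10) gives (6, 6, 14, 14).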

In some embodiments, the second determining unit 403 is specifically configured to: perform dimension reduction on the feature information of the body part region to obtain the feature information of the human body part node; perform dimension reduction on the feature information of the object region to obtain the feature information of the object node; for the same human body, connect the human body part nodes based on human skeleton information; and connect the object node with the human body part nodes to obtain the spatial positional relationship diagram of the human body and the object, which includes, for each object node, connecting to that object node a set number of the human body part nodes closest to it, wherein the feature information of an edge formed by connecting one object node and one human body part node includes the positional relationship information between the object node and the human body part node connected by that edge.
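The step of connecting each object node to its nearest human body part nodes can be sketched as below. This is only an illustration: measuring distance between region centers, and encoding the positional relationship of an edge as the offset and Euclidean distance between those centers, are assumptions, since the text specifies neither. The name `connect_object_to_parts` and the parameter `k` (the "set number") are likewise hypothetical.

```python
import math
from typing import Dict, Tuple

Point = Tuple[float, float]
EdgeFeature = Tuple[float, float, float]  # (dx, dy, distance)


def connect_object_to_parts(
    object_center: Point, part_centers: Dict[str, Point], k: int
) -> Dict[str, EdgeFeature]:
    """Connect one object node to its k closest body part nodes.

    Returns a mapping from body part name to the edge feature, here the
    offset from the object center to the part center plus their distance.
    """
    # Rank body parts by Euclidean distance to the object center.
    ranked = sorted(
        part_centers,
        key=lambda name: math.dist(object_center, part_centers[name]),
    )
    edges: Dict[str, EdgeFeature] = {}
    for name in ranked[:k]:
        px, py = part_centers[name]
        dx, dy = px - object_center[0], py - object_center[1]
        edges[name] = (dx, dy, math.hypot(dx, dy))
    return edges
```

Limiting each object node to a fixed number of nearby body part nodes keeps the diagram sparse while retaining the body parts most likely to be involved in the interaction.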

In some embodiments, the apparatus further includes an update unit configured to, for each human body part node, update the feature information of that human body part node using the feature information of one or more adjacent human body part nodes and the feature information of the edges connecting the human body part node to those adjacent human body part nodes.
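One way to picture this update is as a single round of message passing over the body part nodes. The text does not state the aggregation function, so the sketch below uses plain averaging as a stand-in for what would normally be a learned graph layer; the function name, the assumption that edge features have the same length as node features, and the additive combination are all illustrative choices.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def update_part_features(
    features: Dict[str, Vector], edges: Dict[Tuple[str, str], Vector]
) -> Dict[str, Vector]:
    """Return updated node features; the input dictionaries are not modified.

    `edges` maps an undirected pair (a, b) to an edge feature vector that is
    assumed to have the same length as the node feature vectors.
    """
    updated: Dict[str, Vector] = {}
    for node, own in features.items():
        messages: List[Vector] = []
        for (a, b), edge_feat in edges.items():
            if node in (a, b) and a != b:
                neighbor = b if node == a else a
                # Message = neighbor feature combined with the edge feature.
                messages.append(
                    [n + e for n, e in zip(features[neighbor], edge_feat)]
                )
        if not messages:
            updated[node] = list(own)  # isolated node keeps its feature
            continue
        mean_msg = [sum(col) / len(messages) for col in zip(*messages)]
        # Blend the node's own feature with the averaged neighbor message.
        updated[node] = [(o + m) / 2 for o, m in zip(own, mean_msg)]
    return updated
```

All nodes are updated from the original features, so the result does not depend on the order in which nodes are visited.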

In some embodiments, the recognition unit 404 is specifically configured to: obtain feature information corresponding to the human body based on the feature information of the human body part nodes, which includes, for the same human body, performing a global pooling operation on the feature information of each human body part node to obtain the feature information corresponding to the human body; obtain feature information corresponding to the object based on the feature information of the object node; and determine the interaction relationship between the human body and the object based on the feature information corresponding to the human body and the feature information corresponding to the object.
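The global pooling step reduces the per-part feature vectors of one human body to a single vector. A minimal sketch follows; the text does not say which pooling is used, so both mean and max pooling are shown, and the function name `global_pool` and its `mode` parameter are assumptions.

```python
from typing import List


def global_pool(node_features: List[List[float]], mode: str = "mean") -> List[float]:
    """Pool a list of equally sized node feature vectors into one vector."""
    if not node_features:
        raise ValueError("at least one node feature vector is required")
    # zip(*...) groups the i-th component of every node feature together.
    columns = list(zip(*node_features))
    if mode == "mean":
        return [sum(col) / len(col) for col in columns]
    if mode == "max":
        return [max(col) for col in columns]
    raise ValueError(f"unsupported pooling mode: {mode}")
```

Because pooling is applied across nodes, the resulting vector has the same length regardless of how many body part nodes the human body has.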

In some embodiments, the recognition unit 404 is specifically configured to determine, based on the spatial positional relationship diagram of the human body and the object, the interaction category to which the interaction relationship between the human body and the object belongs; and the apparatus further includes a third determining unit configured to determine that the human body is in the target scene in response to the safety factor of the interaction category to which the interaction relationship between the human body and the object belongs being lower than a first set threshold.

In some embodiments, the recognition unit 404 is specifically configured to determine, based on the spatial positional relationship diagram of the human body and the object, the interaction categories to which the interaction relationships between the human body and objects of different categories belong; and the apparatus further includes a fourth determining unit configured to determine the safety factor of the combination of the interaction categories to which the interaction relationships between the human body and the objects of different categories belong, and to determine that the human body is in the target scene in response to the safety factor of the combination being lower than a second set threshold.

FIG. 5 shows an electronic device provided by at least one embodiment of the present invention. The device includes a memory 501 and a processor 502; the memory 501 stores computer instructions executable on the processor 502, and the processor 502, when executing the computer instructions, implements the interaction relationship recognition method according to any embodiment of this specification.

At least one embodiment of this specification further provides a computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the target detection method according to any embodiment of this specification and/or the interaction relationship recognition method according to any embodiment of this specification.

At least one embodiment of this specification further provides a computer program, wherein the program, when executed by a processor, implements the target detection method according to any embodiment of this specification and/or the interaction relationship recognition method according to any embodiment of this specification.

Those skilled in the art should understand that one or more examples of this specification may be provided as a method, a system, or a computer program product. Accordingly, one or more examples of this specification may take the form of an entirely hardware example, an entirely software example, or an example combining software and hardware. Furthermore, one or more examples of this specification may take the form of a computer program product implemented on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, and optical storage) containing computer-usable program code.

The examples in this specification are described in a progressive manner; identical or similar parts of the examples may be understood by reference to one another, and each example focuses on its differences from the others. In particular, because the data processing device example is substantially similar to the method example, its description is relatively brief, and the relevant parts of the description of the method example may be consulted.

The foregoing has described specific examples of this specification. Other examples are within the scope of the appended claims. In some cases, the actions or steps recited in the claims may be performed in an order different from that in the examples and still achieve the desired results. In addition, the processes depicted in the figures do not necessarily require the particular order shown, or sequential order, to achieve the desired results. In some examples, multitasking and parallel processing are also possible or may be advantageous.

Examples of the subject matter and the functional operations described in this specification may be implemented in digital electronic circuitry, in tangibly embodied computer software or firmware, in computer hardware including the structures disclosed in this application and their structural equivalents, or in a combination of one or more of them. Examples of the subject matter described in this specification may be implemented as one or more computer programs, that is, one or more units of computer program instructions encoded on a tangible non-transitory program carrier for execution by, or to control the operation of, a data processing device. Alternatively or additionally, the program instructions may be encoded on an artificially generated propagated signal, such as a machine-generated electrical, optical, or electromagnetic signal, that is generated to encode information for transmission to a suitable receiver device for execution by a data processing device. The computer storage medium may be a machine-readable storage device, a machine-readable storage substrate, a random or serial access memory device, or a combination of one or more of them.

The processes and logic flows described in this application may be performed by one or more programmable computers executing one or more computer programs to perform the corresponding functions by operating on input data and generating output. The processes and logic flows may also be performed by special purpose logic circuitry, such as an FPGA (Field Programmable Gate Array) or an ASIC (Application Specific Integrated Circuit), and the device may also be implemented as special purpose logic circuitry.

Computers suitable for executing a computer program may include, for example, general purpose and/or special purpose microprocessors, or any other type of central processing unit. Generally, a central processing unit receives instructions and data from a read-only memory and/or a random access memory. The basic components of a computer may include a central processing unit for implementing or executing instructions and one or more memory devices for storing instructions and data. Generally, a computer also includes one or more mass storage devices for storing data, such as magnetic disks, magneto-optical disks, or optical disks, or is operatively coupled to such mass storage devices to receive data from them, transfer data to them, or both. However, a computer need not have such devices. Moreover, a computer may be embedded in another device, such as a mobile phone, a personal digital assistant (PDA), a mobile audio or video player, a game console, a global positioning system (GPS) receiver, or a portable storage device such as a universal serial bus (USB) flash drive, to name just a few.

Computer-readable media suitable for storing computer program instructions and data may include all forms of non-volatile memory, media, and memory devices, including, for example, semiconductor memory devices (e.g., EPROM, EEPROM, and flash memory devices), magnetic disks (e.g., internal hard disks or removable disks), magneto-optical disks, and CD-ROM and DVD-ROM disks. The processor and the memory may be supplemented by, or incorporated in, special purpose logic circuitry.

Although this specification contains many specific implementation details, these should not be construed as limiting the scope of any disclosure or of what is claimed; they are used mainly to describe features of the particular examples disclosed. Certain features described in multiple examples in this specification may also be implemented in combination in a single example. Conversely, various features described in a single example may also be implemented in multiple examples separately or in any suitable sub-combination. Moreover, although features may be described above as acting in certain combinations, and even initially claimed as such, one or more features from a claimed combination may in some cases be removed from the combination, and the claimed combination may be directed to a sub-combination or a variation of a sub-combination.

Similarly, although operations are depicted in the figures in a particular order, this should not be understood as requiring that the operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve the desired results. In some cases, multitasking and parallel processing may be advantageous. Moreover, the separation of the various system units and components in the examples should not be understood as requiring such separation in all examples, and it should be understood that the described program components and systems can generally be integrated together in a single software product or packaged into multiple software products.

Thus, specific examples of the subject matter have been described. Other examples are within the scope of the appended claims. In some cases, the actions recited in the claims may be performed in a different order and still achieve the desired results. In addition, the processes depicted in the figures need not follow the particular order shown, or sequential order, to achieve the desired results. In some implementations, multitasking and parallel processing may be advantageous.

Claims

A method for recognizing an interaction relationship, comprising:
detecting an image to be processed to obtain a human body detection result and an object detection result;
determining, based on the human body detection result, each body part region corresponding to a human body in the image to be processed;
determining, based on the object detection result, an object region corresponding to an object in the image to be processed, wherein the object region is a region corresponding to an object detection box in the object detection result;
determining a spatial positional relationship diagram of the human body and the object based on human body part nodes corresponding to the body part regions and an object node corresponding to the object region, wherein the spatial positional relationship diagram includes feature information of each of the human body part nodes, feature information of the object node, and positional relationship information between each of the human body part nodes and the object node; and
determining an interaction relationship between the human body and the object based on the spatial positional relationship diagram of the human body and the object.

The method of claim 1, wherein
determining, based on the human body detection result, each body part region corresponding to the human body in the image to be processed comprises:
obtaining feature information contained in a human body detection box in the human body detection result;
obtaining human body key points of the human body based on the feature information;
connecting the human body key points based on human skeleton information to obtain connection information; and
determining each body part region based on the human body key points and the connection information.

The method of claim 2, wherein
determining each body part region based on the human body key points and the connection information comprises at least one of:
determining one body part region based on a plurality of human body key points that are connected to each other; or
determining one body part region centered on one of the plurality of human body key points.

The method of claim 1, wherein
determining the spatial positional relationship diagram of the human body and the object based on the human body part nodes corresponding to the body part regions and the object node corresponding to the object region comprises:
performing dimension reduction on the feature information of the body part region to obtain the feature information of the human body part node;
performing dimension reduction on the feature information of the object region to obtain the feature information of the object node;
for the same human body, connecting the human body part nodes based on human skeleton information; and
connecting the object node with the human body part nodes to obtain the spatial positional relationship diagram of the human body and the object, wherein the feature information of an edge formed by connecting one object node and one human body part node includes the positional relationship information between the object node and the human body part node connected by the edge.

The method of claim 4, wherein
connecting the object node with the human body part nodes comprises:
for each object node, connecting to the object node a set number of the human body part nodes closest to the object node.

The method of claim 4 or 5, wherein,
after the spatial positional relationship diagram of the human body and the object is obtained, the method further comprises:
for each human body part node, updating the feature information of the human body part node using the feature information of one or more adjacent human body part nodes of the human body part node and the feature information of the edges connecting the human body part node to the adjacent human body part nodes.

The method of any one of claims 1 to 6, wherein
determining the interaction relationship between the human body and the object based on the spatial positional relationship diagram of the human body and the object comprises:
obtaining feature information corresponding to the human body based on the feature information of the human body part nodes;
obtaining feature information corresponding to the object based on the feature information of the object node; and
determining the interaction relationship between the human body and the object based on the feature information corresponding to the human body and the feature information corresponding to the object.

The method of claim 7, wherein
obtaining the feature information corresponding to the human body based on the feature information of the human body part nodes comprises:
for the same human body, performing a global pooling operation on the feature information of each human body part node to obtain the feature information corresponding to the human body.

The method of any one of claims 1 to 8, wherein
determining the interaction relationship between the human body and the object based on the spatial positional relationship diagram of the human body and the object comprises:
determining, based on the spatial positional relationship diagram of the human body and the object, an interaction category to which the interaction relationship between the human body and the object belongs; and
the method further comprises:
determining that the human body is in a target scene in response to a safety factor of the interaction category to which the interaction relationship between the human body and the object belongs being lower than a first set threshold.

The method of any one of claims 1 to 8, wherein
determining the interaction relationship between the human body and the object based on the spatial positional relationship diagram of the human body and the object comprises:
determining, based on the spatial positional relationship diagram of the human body and the object, interaction categories to which interaction relationships between the human body and objects of different categories belong; and
the method further comprises:
determining a safety factor of a combination of the interaction categories to which the interaction relationships between the human body and the objects of different categories belong; and
determining that the human body is in a target scene in response to the safety factor of the combination being lower than a second set threshold.

An apparatus for recognizing an interaction relationship, comprising:
an acquisition unit configured to detect an image to be processed to obtain a human body detection result and an object detection result;
a first determining unit configured to determine, based on the human body detection result, each body part region corresponding to a human body in the image to be processed, and to determine, based on the object detection result, an object region corresponding to an object in the image to be processed, wherein the object region is a region corresponding to an object detection box in the object detection result;
a second determining unit configured to determine a spatial positional relationship diagram of the human body and the object based on human body part nodes corresponding to the body part regions and an object node corresponding to the object region, wherein the spatial positional relationship diagram includes feature information of each of the human body part nodes, feature information of the object node, and positional relationship information between each of the human body part nodes and the object node; and
a recognition unit configured to determine an interaction relationship between the human body and the object based on the spatial positional relationship diagram of the human body and the object.

12. The apparatus according to claim 11, wherein the first determining unit is configured to:
acquire feature information contained in a human body detection box in the human body detection result;
acquire human body key points of the human body based on the feature information;
connect the human body key points based on human skeleton information to obtain connection information; and
determine each human body part region based on the human body key points and the connection information, which includes at least one of: determining one human body part region based on a plurality of mutually connected human body key points, or determining one human body part region centered on one of the plurality of human body key points.
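
The two ways of determining a human body part region named in this claim might be sketched as follows. This is a minimal illustration only: the axis-aligned box representation `(x1, y1, x2, y2)`, the function names, and the `margin` and `half_size` parameters are assumptions and do not come from the claims.

```python
def region_from_connected(points, margin=5.0):
    """Box (x1, y1, x2, y2) enclosing several mutually connected key points.

    `margin` (hypothetical) pads the box so the part is not clipped.
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs) - margin, min(ys) - margin,
            max(xs) + margin, max(ys) + margin)


def region_around(point, half_size=10.0):
    """Square box (x1, y1, x2, y2) centered on a single key point."""
    x, y = point
    return (x - half_size, y - half_size, x + half_size, y + half_size)


# Hypothetical key points and skeleton connections for one arm.
keypoints = {"shoulder": (100.0, 50.0), "elbow": (120.0, 90.0),
             "wrist": (130.0, 130.0)}
skeleton = [("shoulder", "elbow"), ("elbow", "wrist")]

# One region per skeleton connection, plus one region centered on the wrist.
limb_regions = [region_from_connected([keypoints[a], keypoints[b]])
                for a, b in skeleton]
hand_region = region_around(keypoints["wrist"])
```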

13. The apparatus according to claim 12, wherein the second determining unit is configured to:
perform dimension reduction on feature information of the human body part regions to obtain the feature information of the human body part nodes;
perform dimension reduction on feature information of the object region to obtain the feature information of the object node;
for the same human body, connect the human body part nodes based on the human skeleton information; and
connect the object node with the human body part nodes to obtain the spatial positional relationship diagram between the human body and the object, which includes: for each object node, connecting to the object node a set number of human body part nodes that are closest to the object node,
wherein feature information of an edge formed by connecting one object node and one human body part node includes the positional relationship information between the object node and the human body part node connected by the edge.
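
The graph construction in this claim might be sketched as follows, leaving out the dimension reduction step. The sketch assumes that each node is represented by the center of its region, that "closest" means Euclidean distance between centers, and that the positional relationship stored on an edge is the offset from the object node to the human body part node. None of these choices is fixed by the claims, and the function and parameter names are hypothetical.

```python
import math


def build_graph(part_centers, skeleton, object_centers, k=2):
    """Build skeleton edges and object-to-part edges.

    part_centers / object_centers: {name: (x, y)} region centers.
    skeleton: list of (part, part) pairs from human skeleton information.
    k: the set number of nearest human body part nodes per object node.
    """
    # Connect human body part nodes of the same body along the skeleton.
    skeleton_edges = [(a, b) for a, b in skeleton
                      if a in part_centers and b in part_centers]

    # Connect each object node to its k nearest human body part nodes.
    object_edges = {}
    for obj, (ox, oy) in object_centers.items():
        nearest = sorted(
            part_centers,
            key=lambda name: math.dist(part_centers[name], (ox, oy)),
        )[:k]
        for name in nearest:
            px, py = part_centers[name]
            # Edge feature: offset of the part relative to the object (assumed).
            object_edges[(obj, name)] = (px - ox, py - oy)

    return {"skeleton_edges": skeleton_edges, "object_edges": object_edges}
```

Limiting each object node to its `k` nearest human body part nodes keeps the graph sparse, so a cup near the hand is not linked to the feet.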

14. The apparatus according to claim 13, further comprising an update unit configured to, for each human body part node, update the feature information of the human body part node using the feature information of one or more adjacent human body part nodes of the human body part node and the feature information of the edges connecting the human body part node to the adjacent human body part nodes.
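
One possible form of the update in this claim is a single round of neighbor aggregation, sketched below. The claim only states which inputs are used (adjacent node features and connecting edge features); the specific rule here, adding the mean of "neighbor feature plus edge feature" to the node's own feature, is an assumption chosen for illustration, as are the function name and the requirement that node and edge features have equal length.

```python
def update_nodes(features, edges):
    """One synchronous update of every node from its neighbors.

    features: {node: [float, ...]} current node features.
    edges: {(a, b): [float, ...]} undirected edges with edge features of
    the same length as the node features (an assumption of this sketch).
    """
    neighbors = {node: [] for node in features}
    for (a, b), edge_feat in edges.items():
        neighbors[a].append((b, edge_feat))
        neighbors[b].append((a, edge_feat))

    updated = {}
    for node, feat in features.items():
        if not neighbors[node]:
            updated[node] = list(feat)  # isolated node keeps its feature
            continue
        # Message from each neighbor: its feature combined with the edge feature.
        messages = [[x + e for x, e in zip(features[m], edge_feat)]
                    for m, edge_feat in neighbors[node]]
        mean = [sum(col) / len(messages) for col in zip(*messages)]
        updated[node] = [x + m for x, m in zip(feat, mean)]
    return updated
```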

15. The apparatus according to any one of claims 11 to 14, wherein the recognition unit is configured to:
acquire feature information corresponding to the human body based on the feature information of the human body part nodes, which includes: for the same human body, performing a global pooling operation on the feature information of the human body part nodes to obtain the feature information corresponding to the human body;
acquire feature information corresponding to the object based on the feature information of the object node; and
determine the interaction relationship between the human body and the object based on the feature information corresponding to the human body and the feature information corresponding to the object.
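
The pooling and recognition steps in this claim might be sketched as follows. The claim names a global pooling operation without fixing its type, so average pooling is assumed here; the linear scoring over the concatenated human and object features, the `weights` table, and the category names are likewise hypothetical stand-ins for whatever classifier is actually used.

```python
def global_avg_pool(node_features):
    """Average the feature vectors of all part nodes of one human body."""
    count = len(node_features)
    return [sum(col) / count for col in zip(*node_features)]


def classify(human_feat, object_feat, weights):
    """Pick the interaction category with the highest linear score.

    weights: {category: [float, ...]} hypothetical per-category weights
    over the concatenated (human, object) feature vector.
    """
    pair = human_feat + object_feat
    scores = {category: sum(w * x for w, x in zip(ws, pair))
              for category, ws in weights.items()}
    return max(scores, key=scores.get)


# Two part nodes pooled into one human feature, then paired with an object.
human = global_avg_pool([[1.0, 2.0], [3.0, 4.0]])
category = classify(human, [1.0], {"hold": [1.0, 0.0, 0.0],
                                   "kick": [0.0, 0.0, 1.0]})
```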

16. The apparatus according to any one of claims 11 to 15, wherein
the recognition unit is configured to determine, based on the spatial positional relationship diagram between the human body and the object, an interaction category to which the interaction relationship between the human body and the object belongs; and
the apparatus further comprises a third determining unit configured to determine that the human body is in a target scene in response to a safety factor of the interaction category to which the interaction relationship between the human body and the object belongs being lower than a first set threshold.
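
The single-category check performed by the third determining unit might look like the sketch below. The table `CATEGORY_SAFETY`, its entries, the default factor for unlisted categories, and the threshold value are hypothetical; the claims only require comparing a safety factor against a first set threshold.

```python
# Hypothetical per-category safety factors; values are not from the claims.
CATEGORY_SAFETY = {"smoke cigarette": 0.1, "hold cup": 0.9}


def is_target_scene(category, first_threshold=0.5, default=1.0):
    """Return True when the category's safety factor is below the threshold."""
    return CATEGORY_SAFETY.get(category, default) < first_threshold
```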

17. The apparatus according to any one of claims 11 to 15, wherein
the recognition unit is configured to determine, based on the spatial positional relationship diagram between the human body and the object, the interaction categories to which the interaction relationships between the human body and objects of different categories belong; and
the apparatus further comprises a fourth determining unit configured to determine a safety factor of a combination of the interaction categories to which the interaction relationships between the human body and the objects of different categories belong, and to determine that the human body is in a target scene in response to the safety factor of the combination being lower than a second set threshold.

18. An electronic device comprising a memory and a processor, wherein
the memory stores computer instructions executable by the processor; and
the processor, when executing the computer instructions, implements the method according to any one of claims 1 to 10.

19. A computer-readable storage medium storing a computer program, wherein the program, when executed by a processor, implements the method according to any one of claims 1 to 10.

20. A computer program which, when executed by a processor, implements the method according to any one of claims 1 to 10.