WO2024067468A1 - Interaction control method and apparatus based on image recognition, and device - Google Patents

Interaction control method and apparatus based on image recognition, and device

Info

Publication number
WO2024067468A1
Authority
WO
WIPO (PCT)
Prior art keywords
skeleton point
display device
electronic device
indication
image
Prior art date
Application number
PCT/CN2023/121042
Other languages
French (fr)
Chinese (zh)
Inventor
许康太
Original Assignee
广州视琨电子科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 广州视琨电子科技有限公司 filed Critical 广州视琨电子科技有限公司
Publication of WO2024067468A1 publication Critical patent/WO2024067468A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/017Gesture based interaction, e.g. based on a set of recognized hand gestures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20Movements or behaviour, e.g. gesture recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20Movements or behaviour, e.g. gesture recognition
    • G06V40/28Recognition of hand or arm movements, e.g. recognition of deaf sign language

Definitions

  • the present application relates to Internet of Things technology, and in particular to an interactive control method, device and equipment based on image recognition.
  • high-end display devices are equipped with cameras for human-computer interaction.
  • an artificial intelligence body recognition algorithm is deployed in the display device.
  • the display device converts the defined body movement into a control instruction through the body recognition algorithm and executes the control instruction to complete human-computer interaction.
  • the display device needs to be equipped with a camera hardware module and also needs to deploy and run a body recognition algorithm, so the requirements on the display device's processor, memory and other performance are high. This increases the cost of the display device, with the result that existing human-computer interaction can only be implemented on a small range of high-end display devices and is not widely adopted.
  • the present application provides an interactive control method, device and equipment based on image recognition, which are used to solve the technical problem of high cost of human-computer interaction of display devices.
  • the present application provides an interactive control method based on image recognition, which is applied to an electronic device, wherein the electronic device is communicatively connected to a display device; the method comprises:
  • the skeleton point diagram is sent to the display device, so that the display device recognizes and processes the skeleton point diagram based on the preset correspondence between the skeleton point diagram and the control instruction, determines the target control instruction corresponding to the skeleton point diagram, and executes the target control instruction.
  • performing recognition processing on multiple frames of the indication image to obtain a skeleton point map corresponding to the indication image includes:
  • the initial skeleton point graph is determined to be the skeleton point graph corresponding to the indication image
  • the target control instruction corresponding to the skeleton point diagram is obtained by identifying and processing the skeleton point diagram through a skeleton point diagram recognition mode enabled in the display device based on a preset corresponding relationship between the skeleton point diagram and the control instruction.
  • sending the skeleton point diagram to the display device includes:
  • the skeleton point graphs are sent to the display device in sequence according to the generation time of the skeleton point graphs.
  • the indicating action includes a gesture action and/or a body movement.
  • connection method between the electronic device and the display device includes a wired connection and a wireless connection, wherein the wired connection is a line connection between a charging interface of the electronic device and a universal serial bus interface of the display device; the wireless connection includes Bluetooth communication technology, a local area network protocol, near field communication, or a wide area network server.
  • the method further comprises:
  • the communication connection between the electronic device and the display device is disconnected.
  • the present application provides an interactive control method based on image recognition, which is applied to a display device, wherein the display device is communicatively connected with an electronic device; the method comprises:
  • receiving a skeleton point diagram sent by the electronic device, wherein the skeleton point diagram is obtained by performing recognition processing on an indication image, and the indication image is obtained by photographing an indication action of a user through a camera of the electronic device when the electronic device is in a gesture image conversion mode;
  • the skeleton point diagram is identified and processed, the target control instruction corresponding to the skeleton point diagram is determined, and the target control instruction is executed.
  • skeleton point graph is identified and processed, including:
  • the skeleton point diagram is identified and processed by the skeleton point diagram identification mode turned on in the display device to determine the target control instruction corresponding to the skeleton point diagram.
  • the skeleton point map corresponding to the indication image is determined based on the initial skeleton point map of the first frame indication image and the initial skeleton point map of the last frame indication image, or the skeleton point map corresponding to the indication image is determined based on the initial skeleton point map, wherein the initial skeleton point map is obtained by performing recognition processing on each frame of the indication image.
  • the sending order of the skeleton point graph is determined according to the generation time of the skeleton point graph.
  • the indicating action includes a gesture action and/or a body movement.
  • connection method between the electronic device and the display device includes a wired connection and a wireless connection, wherein the wired connection is a connection between a charging interface of the electronic device and a universal serial bus interface of the display device; the wireless connection includes Bluetooth wireless technology, a local area network protocol, near field communication, or a wide area network server.
  • the communication connection between the electronic device and the display device is disconnected according to the selection operation for the gesture image conversion mode.
  • the present application provides an interactive control device based on image recognition, which is applied to an electronic device, wherein the electronic device is communicatively connected to a display device; the device comprises:
  • a photographing unit configured to photograph the user's indicating action through a camera of the electronic device to obtain multiple frames of indicating images if it is determined that the electronic device is in a gesture image conversion mode
  • a recognition unit used for performing recognition processing on multiple frames of the indication image to obtain a skeleton point map corresponding to the indication image; wherein the skeleton point map includes indication skeleton information corresponding to the indication action in the indication image;
  • a sending unit is used to send the skeleton point map to the display device, so that the display device, based on the preset correspondence between the skeleton point map and the control instruction, performs recognition processing on the skeleton point map, determines the target control instruction corresponding to the skeleton point map, and executes the target control instruction.
  • the identification unit includes:
  • the recognition module is used to perform recognition processing on multiple frames of indication images to obtain an initial skeleton point map corresponding to each frame of indication image;
  • a first determination module configured to determine that the initial skeleton point diagram is the skeleton point diagram corresponding to the indication image if it is determined that the initial skeleton point diagrams corresponding to each frame of indication image are the same;
  • the second determination module is used to determine that the initial skeleton point map of the first frame of the indication image and the initial skeleton point map of the last frame of the indication image are both skeleton point maps corresponding to the indication image if it is determined that the initial skeleton point map corresponding to each frame of the indication image is different.
  • the target control instruction corresponding to the skeleton point diagram is obtained by identifying and processing the skeleton point diagram through a skeleton point diagram recognition mode enabled in the display device based on a preset corresponding relationship between the skeleton point diagram and the control instruction.
  • the sending unit is specifically used for:
  • the skeleton point graphs are sent to the display device in sequence according to the generation time of the skeleton point graphs.
  • the indicating action includes a gesture action and/or a body movement.
  • connection method between the electronic device and the display device includes a wired connection and a wireless connection, wherein the wired connection is a line connection between a charging interface of the electronic device and a universal serial bus interface of the display device; the wireless connection includes Bluetooth communication technology, a local area network protocol, near field communication, or a wide area network server.
  • the device also includes:
  • a disconnection unit is used to disconnect the communication connection between the electronic device and the display device in response to a selection operation for the gesture image conversion mode.
  • the present application provides an interactive control device based on image recognition, which is applied to a display device, wherein the display device is communicatively connected to an electronic device; the device comprises:
  • a receiving unit configured to receive a skeleton point diagram sent by the electronic device; wherein the skeleton point diagram is obtained by performing recognition processing on an indication image, and the indication image is obtained by photographing an indication action of a user through a camera of the electronic device when the electronic device is in a gesture image conversion mode;
  • a determination unit configured to perform recognition processing on the skeleton point diagram based on a preset correspondence relationship between the skeleton point diagram and the control instruction, and determine a target control instruction corresponding to the skeleton point diagram;
  • An execution unit is used to execute the target control instruction.
  • the determining unit is specifically configured to:
  • the skeleton point diagram is identified and processed by the skeleton point diagram identification mode turned on in the display device to determine the target control instruction corresponding to the skeleton point diagram.
  • the skeleton point diagram corresponding to the indication image is determined based on the initial skeleton point diagram of the first frame indication image and the initial skeleton point diagram of the last frame indication image, or the skeleton point diagram corresponding to the indication image is determined based on the initial skeleton point diagram, wherein the initial skeleton point diagram is obtained by performing recognition processing on each frame of the indication image.
  • the sending order of the skeleton point graph is determined according to the generation time of the skeleton point graph.
  • the indicating action includes a gesture action and/or a body movement.
  • connection method between the electronic device and the display device includes a wired connection and a wireless connection, wherein the wired connection is a connection between a charging interface of the electronic device and a universal serial bus interface of the display device; the wireless connection includes Bluetooth wireless technology, a local area network protocol, near field communication, or a wide area network server.
  • the communication connection between the electronic device and the display device is disconnected according to the selection operation for the gesture image conversion mode.
  • the present application provides an electronic device, comprising a memory and a processor, wherein the memory stores a computer program that can be executed on the processor, and when the processor executes the computer program, the method described in the first aspect is implemented.
  • the present application provides a display device, comprising a memory and a processor, wherein the memory stores a computer program that can be executed on the processor, and when the processor executes the computer program, the method described in the second aspect is implemented.
  • the present application provides a computer-readable storage medium, wherein the computer-readable storage medium stores computer-executable instructions, and when the computer-executable instructions are executed by a processor, they are used to implement the method described in the first aspect, or to implement the method described in the second aspect.
  • the present application provides a computer program product, including a computer program, which, when executed by a processor, implements the method described in the first aspect, or implements the method described in the second aspect.
  • the present application provides an interactive control method, device and equipment based on image recognition. If it is determined that the electronic device is in a gesture image conversion mode, the user's indication action is photographed through the camera of the electronic device to obtain multiple frames of indication images. The multiple frames of indication images are identified and processed to obtain a skeleton point diagram corresponding to the indication image; wherein the skeleton point diagram includes the indication skeleton information corresponding to the indication action in the indication image. The skeleton point diagram is sent to the display device so that the display device identifies and processes the skeleton point diagram based on the correspondence between the preset skeleton point diagram and the control instruction, determines the target control instruction corresponding to the skeleton point diagram, and executes the target control instruction.
  • when the electronic device is in the gesture image conversion mode, the camera can be turned on to work. If the user makes an indication action within the range of the camera, the camera will automatically shoot the indication action to obtain multiple frames of indication images.
  • the electronic device identifies and processes the multiple frames of indication images to obtain a skeleton point diagram corresponding to the indication image, wherein the skeleton point diagram includes the indication skeleton information corresponding to the indication action in the indication image. Finally, the skeleton point diagram is sent to the display device.
  • when the display device receives the skeleton point diagram sent by the electronic device, the display device identifies and processes the skeleton point diagram based on the correspondence between the preset skeleton point diagram and the control instruction, determines the target control instruction corresponding to the skeleton point diagram, and executes the target control instruction. At this time, the display device can make an operation corresponding to the target control instruction to complete the interaction between the person, the electronic device and the display device.
  • the interactive process of obtaining the skeleton point map through the camera of the electronic device and sending it to the display device makes full use of the shooting capability of the electronic device's camera and the computing power of its processor to convert the indicated action into a skeleton point map that the display device can recognize. This saves the display device the cost of a camera as well as processor and memory performance costs, greatly saves resources, greatly reduces the cost of the display device, and gives display devices without cameras the function and ability of air gesture control, thereby solving the technical problem of the high cost of human-computer interaction of display devices.
  • FIG1 is a schematic diagram of a flow chart of an interactive control method based on image recognition provided by an embodiment of the present application
  • FIG2 is a schematic diagram of a scenario of indicating an action provided in an embodiment of the present application.
  • FIG3 is a schematic diagram of a scene of a gesture action provided in an embodiment of the present application.
  • FIG4 is a schematic diagram of a scene of a skeleton point diagram provided in an embodiment of the present application.
  • FIG5 is a flow chart of another interactive control method based on image recognition provided in an embodiment of the present application.
  • FIG6 is a flow chart of another interactive control method based on image recognition provided in an embodiment of the present application.
  • FIG7 is a flow chart of another interactive control method based on image recognition provided in an embodiment of the present application.
  • FIG8 is a flow chart of another interactive control method based on image recognition provided in an embodiment of the present application.
  • FIG9 is a flow chart of another interactive control method based on image recognition provided in an embodiment of the present application.
  • FIG10 is a schematic diagram of the structure of an interactive control device based on image recognition provided in an embodiment of the present application.
  • FIG11 is a schematic diagram of the structure of another interactive control device based on image recognition provided in an embodiment of the present application.
  • FIG12 is a schematic diagram of the structure of another interactive control device based on image recognition provided in an embodiment of the present application.
  • FIG13 is a schematic diagram of the structure of an electronic device provided in an embodiment of the present application.
  • FIG14 is a schematic diagram of the structure of a display device provided in an embodiment of the present application.
  • FIG15 is a block diagram of an electronic device provided in an embodiment of the present application.
  • an artificial intelligence body recognition algorithm is deployed in the display device.
  • the display device converts the defined body movement into a control instruction through the body recognition algorithm, and executes the control instruction to complete the human-computer interaction.
  • the display device needs to be equipped with a camera hardware module and also needs to deploy and run a body recognition algorithm, so the requirements on the processor, memory and other performance of the display device are relatively high. This increases the cost of the display device, with the result that existing human-computer interaction can only be implemented on a small range of high-end display devices and is not widely adopted.
  • there are also high requirements for the image quality, resolution, frame rate, etc. of the images captured by the camera, so the cost of the camera hardware module will also be high.
  • when a display device has its own camera module, especially a display device located in a home environment, users may have concerns about privacy and security.
  • the present application provides an interactive control method, device and equipment based on image recognition, aiming to solve the above technical problems in the prior art.
  • FIG1 is a flow chart of an interactive control method based on image recognition provided by an embodiment of the present application, which is applied to an electronic device, where the electronic device is connected to a display device; as shown in FIG1 , the method includes:
  • the execution subject of this embodiment may be an electronic device, a terminal device, an image-recognition-based interactive control apparatus or device, or other devices or equipment that can execute this embodiment, and there is no limitation on this.
  • the execution subject is introduced as an electronic device.
  • the gesture image conversion mode is used to convert the indicated image into a gesture image
  • the electronic device is connected to the display device in communication, wherein the communication connection can be a wired connection or a wireless connection.
  • the wired connection can be established by connecting the charging interface of the electronic device to the Universal Serial Bus (USB) interface of the display device;
  • the wireless connection includes but is not limited to establishing a connection through a local area network protocol, Bluetooth communication technology, near field communication (NFC) or a wide area network server.
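
As an aside for illustration, the following minimal sketch shows one way such a wireless (local-area-network) connection between the electronic device and the display device could be opened in practice. The host address, port and JSON handshake are assumptions made for this example only; the application itself does not specify them.

```python
# Minimal sketch: electronic device opens a LAN connection to the display device.
# DISPLAY_HOST / DISPLAY_PORT and the handshake message are illustrative assumptions.
import json
import socket

DISPLAY_HOST = "192.168.1.50"   # assumed address of the display device on the LAN
DISPLAY_PORT = 9000             # assumed port the display device listens on

def connect_to_display(host: str = DISPLAY_HOST, port: int = DISPLAY_PORT) -> socket.socket:
    """Open a TCP connection and announce that the gesture image conversion mode is active."""
    sock = socket.create_connection((host, port), timeout=5)
    hello = {"type": "hello", "mode": "gesture_image_conversion"}
    sock.sendall((json.dumps(hello) + "\n").encode("utf-8"))
    return sock
```
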
  • an indication action is a meaningful action made by a user according to a predefined action instruction library.
  • an indication action includes a gesture action and a body movement.
  • FIG2 is a schematic diagram of a scene of an indication action provided in an embodiment of the present application.
  • gesture actions include a wake-up gesture, a confirmation-key gesture, a direction-key gesture, etc.
  • body movements include shaking the head, raising both hands high, raising both hands horizontally, etc.
  • Indication actions include static actions and dynamic actions.
  • Static actions include various stable and continuous gesture actions or body movements in FIG2 .
  • Such action feature information is generally acquired using a single frame of data or determined based on any frame of data when multiple frames of data are the same.
  • Dynamic actions include the movement of the hand from point A to point B, etc.
  • Such action feature information is usually acquired through multiple frames of data.
  • the camera of the electronic device shoots the user's indicating action to obtain multiple frames of indicating images.
  • FIG3 is a schematic diagram of a scene of a gesture action provided in an embodiment of the present application; as shown in FIG3, the figure displays an indication image of a palm with five fingers extended.
  • FIG4 is a scene schematic diagram of a skeleton point diagram provided by an embodiment of the present application. As shown in FIG4 , FIG4 shows a skeleton point diagram corresponding to the palm with five fingers extended in FIG3 .
  • the electronic device performs recognition processing on the hand making the gesture action in each indication image in the multiple indication images in the gesture image conversion mode, and converts it into a skeleton point diagram containing the spatial distribution of the phalanges of the hand, and the skeleton point diagram includes the indication skeleton information corresponding to the indication action in the indication image.
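
The application does not tie this conversion to any particular recognition algorithm. As one hedged illustration, the sketch below uses the open-source MediaPipe Hands model to turn a single captured indication image into a list of hand skeleton points; the 21-landmark output format is specific to this example, not to the disclosure.

```python
# Illustrative sketch only: convert one captured indication image (BGR frame) into
# a hand skeleton point list using MediaPipe Hands. The patent does not prescribe
# this library; any hand/body keypoint model could play the same role.
import cv2
import mediapipe as mp

_hands = mp.solutions.hands.Hands(static_image_mode=True, max_num_hands=1)

def image_to_skeleton_points(frame_bgr):
    """Return 21 normalized (x, y, z) hand landmarks, or None if no hand is detected."""
    results = _hands.process(cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB))
    if not results.multi_hand_landmarks:
        return None
    return [(lm.x, lm.y, lm.z) for lm in results.multi_hand_landmarks[0].landmark]
```
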
  • the indication action is a limb action
  • the electronic device performs recognition processing on the limb making the limb action in each indication image in the multiple indication images, and converts it into a skeleton point diagram containing the spatial distribution of the bones of the limb, and the skeleton point diagram includes the indication skeleton information corresponding to the indication action in the indication image.
  • the electronic device sends the real-time converted skeleton point diagram to the display device in real time.
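
A minimal sketch of this real-time forwarding step follows: each converted skeleton point map is serialized and written to the previously opened connection in the order of its generation time. The one-JSON-object-per-line message schema is an assumption for illustration.

```python
# Sketch: stream skeleton point maps to the display device in generation-time order.
# `stamped_maps` is assumed to be an iterable of (generation_time, points) pairs.
import json

def send_skeleton_maps(sock, stamped_maps) -> None:
    for generated_at, points in sorted(stamped_maps, key=lambda item: item[0]):
        msg = {"type": "skeleton_map", "generated_at": generated_at, "points": points}
        sock.sendall((json.dumps(msg) + "\n").encode("utf-8"))
```
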
  • when the display device receives the skeleton point diagram, it identifies and processes the skeleton point diagram through the enabled skeleton point diagram recognition mode, determines the target control instruction corresponding to the skeleton point diagram according to the preset correspondence between the skeleton point diagram and the control instruction, sends the target control instruction to the control system of the display device in real time, and executes the target control instruction through the control system.
  • the user's indicating action is photographed through the camera of the electronic device to obtain multiple frames of indicating images.
  • the multiple frames of indicating images are identified and processed to obtain a skeleton point diagram corresponding to the indicating image; wherein the skeleton point diagram includes indicating skeleton information corresponding to the indicating action in the indicating image.
  • the skeleton point diagram is sent to a display device so that the display device identifies and processes the skeleton point diagram based on the correspondence between the preset skeleton point diagram and the control instruction, determines the target control instruction corresponding to the skeleton point diagram, and executes the target control instruction.
  • when the electronic device is in a gesture image conversion mode, the camera can be turned on to work. If the user makes an indicating action within the range of the camera, the camera will automatically photograph the indicating action to obtain multiple frames of indicating images.
  • the electronic device identifies and processes the multiple frames of indicating images to obtain a skeleton point diagram corresponding to the indicating image, wherein the skeleton point diagram includes indicating skeleton information corresponding to the indicating action in the indicating image. Finally, the skeleton point diagram is sent to the display device.
  • when the display device receives the skeleton point diagram sent by the electronic device, the display device identifies and processes the skeleton point diagram based on the correspondence between the preset skeleton point diagram and the control instruction, determines the target control instruction corresponding to the skeleton point diagram, and executes the target control instruction. At this time, the display device can perform operations corresponding to the target control instruction to complete the interaction between the person, the electronic device and the display device.
  • the interactive process of obtaining the skeleton point diagram through the camera of the electronic device and sending the skeleton point diagram to the display device fully utilizes the advantages of the camera shooting ability and processor computing ability of the electronic device to complete the image conversion of the indicated action and convert it into a skeleton point diagram that can be recognized by the display device, saving the camera cost of the display device and the processor and memory performance cost, greatly saving resources, and greatly reducing the cost of the display device, so that the display device without a camera itself has the function and ability of air gesture control operation, solving the technical problem of the high cost of human-computer interaction of the display device.
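
Putting the electronic-device side together, the sketch below ties the earlier connection, conversion and sending sketches into one capture loop. The camera index, frame count and helper names are illustrative assumptions rather than details taken from the application.

```python
# End-to-end sketch on the electronic device side, reusing the earlier sketches:
# capture frames while in gesture image conversion mode, convert each frame to a
# skeleton point map, and stream the maps to the display device.
import time
import cv2

def run_gesture_image_conversion(num_frames: int = 10) -> None:
    sock = connect_to_display()            # from the connection sketch above
    cap = cv2.VideoCapture(0)              # assumed built-in camera of the device
    stamped_maps = []
    try:
        while len(stamped_maps) < num_frames:
            ok, frame = cap.read()
            if not ok:
                break
            points = image_to_skeleton_points(frame)   # from the conversion sketch
            if points is not None:
                stamped_maps.append((time.time(), points))
        send_skeleton_maps(sock, stamped_maps)          # from the sending sketch
    finally:
        cap.release()
        sock.close()
```
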
  • FIG5 is a flow chart of another interactive control method based on image recognition provided by an embodiment of the present application, which is applied to an electronic device and a display device for communication connection; as shown in FIG5 , the method includes:
  • if it is determined that the electronic device is in the gesture image conversion mode, the user's indicating action is photographed by the camera of the electronic device to obtain multiple frames of indicating images.
  • the indicating action includes a gesture action and/or a body movement.
  • connection method between the electronic device and the display device includes wired connection and wireless connection, wherein the wired connection is a line connection between the charging interface of the electronic device and the universal serial bus interface of the display device; the wireless connection includes Bluetooth communication technology, local area network protocol, near field communication, or wide area network server.
  • this step may refer to step 101 in FIG. 1 , and will not be described in detail.
  • the electronic device performs recognition processing on multiple frames of indication images to obtain an initial skeleton point map corresponding to each frame of indication image.
  • the initial skeleton point graph is determined to be the skeleton point graph corresponding to the indication image.
  • the electronic device compares the initial skeleton point diagram corresponding to each frame of indication image. If it is determined that the initial skeleton point diagram corresponding to each frame of indication image is the same, it means that the indication action is a static action, then the initial skeleton point diagram is determined to be the skeleton point diagram corresponding to the indication image.
  • the electronic device compares the initial skeleton point map corresponding to each frame of indication image. If it is determined that the initial skeleton point map corresponding to each frame of indication image is different, it means that the indication action is a dynamic action. Then, it is determined that the initial skeleton point map of the first frame of indication image and the initial skeleton point map of the last frame of indication image are both skeleton point maps corresponding to the indication image.
  • the initial skeleton point map of the first frame indication image containing the starting position A of the hand and the initial skeleton point map of the last frame indication image containing the ending position B of the hand are both skeleton point maps corresponding to the indication image.
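
The frame-selection rule described in the preceding paragraphs can be sketched as follows. The application does not quantify when two per-frame skeleton point maps count as "the same", so the tolerance used here is purely an assumption for illustration.

```python
# Sketch of the rule above: identical per-frame maps -> static action, keep one map;
# differing per-frame maps -> dynamic action, keep the first and last maps.
# The similarity tolerance `tol` is an illustrative assumption.
def select_skeleton_maps(initial_maps, tol: float = 0.02):
    if not initial_maps:
        return []

    def same(a, b) -> bool:
        return all(abs(ca - cb) <= tol
                   for pa, pb in zip(a, b)
                   for ca, cb in zip(pa, pb))

    if all(same(m, initial_maps[0]) for m in initial_maps[1:]):
        return [initial_maps[0]]                   # static action
    return [initial_maps[0], initial_maps[-1]]     # dynamic action: start and end
```
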
  • the skeleton point diagrams are sent to the display device in sequence according to the generation time of the skeleton point diagrams, so that the display device recognizes and processes the skeleton point diagrams based on the correspondence between the preset skeleton point diagrams and the control instructions, determines the target control instructions corresponding to the skeleton point diagrams, and executes the target control instructions.
  • the target control instruction corresponding to the skeleton point diagram is obtained by identifying and processing the skeleton point diagram through a skeleton point diagram recognition mode enabled in a display device based on a preset correspondence between the skeleton point diagram and the control instruction.
  • when the display device receives the skeleton point diagram, the display device identifies and processes the skeleton point diagram through the skeleton point diagram recognition mode running in the background, determines the target control instruction corresponding to the skeleton point diagram according to the preset correspondence between the skeleton point diagram and the control instruction, and executes the target control instruction.
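
On the display-device side, the correspondence lookup and execution described above might look like the following sketch. The template library, the distance-based matching, the threshold and the instruction handlers are all assumptions introduced for this example; the application only requires that some preset correspondence between skeleton point diagrams and control instructions exists.

```python
# Sketch of the display device's recognition step: match the received skeleton point
# map against preset templates and execute the associated control instruction.
# Templates, metric, threshold and handlers are illustrative assumptions.
import math

def mean_distance(a, b) -> float:
    """Mean Euclidean distance between two equally sized skeleton point maps."""
    return sum(math.dist(p, q) for p, q in zip(a, b)) / max(len(a), 1)

def determine_target_instruction(points, correspondence, threshold: float = 0.15):
    """`correspondence` maps an instruction name to a template skeleton point map."""
    best_name, best_dist = None, float("inf")
    for name, template in correspondence.items():
        d = mean_distance(points, template)
        if d < best_dist:
            best_name, best_dist = name, d
    return best_name if best_dist <= threshold else None

def execute_instruction(name) -> None:
    handlers = {
        "confirm":   lambda: print("display: confirm current selection"),
        "volume_up": lambda: print("display: increase volume"),
    }
    handlers.get(name, lambda: print(f"display: no handler for {name!r}"))()
```
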
  • the communication connection between the electronic device and the display device is disconnected, wherein the selection operation may be a single click, a double click, etc., which is not limited.
  • the user's indication action is photographed by the camera of the electronic device to obtain multiple frames of indication images.
  • the multiple frames of indication images are identified and processed to obtain an initial skeleton point diagram corresponding to each frame of indication image. If it is determined that the initial skeleton point diagrams corresponding to each frame of indication image are the same, the initial skeleton point diagram is determined to be a skeleton point diagram corresponding to the indication image.
  • the skeleton point diagrams are sent to the display device in sequence according to the generation time of the skeleton point diagrams, so that the display device identifies and processes the skeleton point diagram based on the corresponding relationship between the preset skeleton point diagram and the control instruction, determines the target control instruction corresponding to the skeleton point diagram, and executes the target control instruction.
  • the communication connection between the electronic device and the display device is disconnected. Therefore, the interactive process of obtaining the skeleton point map through the camera of the electronic device and sending the skeleton point map to the display device fully utilizes the advantages of the shooting ability and processor computing power of the camera of the electronic device to complete the image conversion of the indicated action and convert it into a skeleton point map that can be recognized by the display device, saving the camera cost of the display device and the processor and memory performance cost, greatly saving resources, greatly reducing the cost of the display device, and allowing the display device without a camera to have the function and ability of air gesture control operation, solving the technical problem of the high cost of human-computer interaction of the display device.
  • the placement position of the electronic device can be moved and adjusted at any time according to the user's position, and the electronic device can also be placed closer to the user, so that body movements and gestures can be captured more clearly and the recognition accuracy is higher. Since the electronic device sends the recognized skeleton point map to the display device rather than the user's image frames, and the connection between the electronic device and the display device can be disconnected after use, privacy and security concerns are effectively avoided.
  • Figure 6 is a flow chart of another interactive control method based on image recognition provided in an embodiment of the present application
  • Figure 7 is a flow chart of another interactive control method based on image recognition provided in an embodiment of the present application, wherein the control instruction database includes the correspondence between a preset skeleton point diagram and the control instruction, and multiple control instructions.
  • FIG8 is a flow chart of another interactive control method based on image recognition provided in an embodiment of the present application, which is applied to a display device, and the display device is communicatively connected with the electronic device; as shown in FIG8 , the method includes:
  • 401 Receive a skeleton point diagram sent by an electronic device; wherein the skeleton point diagram is obtained by identifying and processing an indication image, and the indication image is obtained by photographing the user's indication action through a camera of the electronic device when the electronic device is in a gesture image conversion mode.
  • the skeleton point diagram is identified and processed, the target control instruction corresponding to the skeleton point diagram is determined, and the target control instruction is executed.
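
For completeness, a receive loop on the display device could reuse the lookup and execution sketch given earlier: accept the connection from the electronic device, read one JSON message per line, and act on each received skeleton point map. The port, message schema and correspondence table remain illustrative assumptions.

```python
# Sketch of the display device's receive loop, reusing determine_target_instruction
# and execute_instruction from the earlier display-side sketch.
import json
import socket

def run_display_receiver(port=9000, correspondence=None) -> None:
    correspondence = correspondence or {}
    with socket.create_server(("", port)) as server:
        conn, _addr = server.accept()
        with conn, conn.makefile("r", encoding="utf-8") as stream:
            for line in stream:
                msg = json.loads(line)
                if msg.get("type") != "skeleton_map":
                    continue
                name = determine_target_instruction(msg["points"], correspondence)
                if name is not None:
                    execute_instruction(name)
```
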
  • FIG9 is a flow chart of another interactive control method based on image recognition provided in an embodiment of the present application, which is applied to a display device, and the display device is communicatively connected with an electronic device; as shown in FIG9 , the method includes:
  • 501 Receive a skeleton point diagram sent by an electronic device; wherein the skeleton point diagram is obtained by identifying and processing an indication image, and the indication image is obtained by photographing the user's indication action through a camera of the electronic device when the electronic device is in a gesture image conversion mode.
  • the skeleton point map corresponding to the indication image is determined based on the initial skeleton point map of the first frame indication image and the initial skeleton point map of the last frame indication image, or the skeleton point map corresponding to the indication image is determined based on the initial skeleton point map, wherein the initial skeleton point map is obtained by performing recognition processing on each frame indication image.
  • the order in which the skeleton point graphs are sent is determined according to the generation time of the skeleton point graphs.
  • the indicating action includes a gesture action and/or a body movement.
  • connection method between the electronic device and the display device includes wired connection and wireless connection, wherein the wired connection is a connection between the charging interface of the electronic device and the universal serial bus interface of the display device; the wireless connection includes Bluetooth wireless technology, local area network protocol, near field communication, or wide area network server.
  • the communication connection between the electronic device and the display device is disconnected according to a selection operation for the gesture image conversion mode.
  • the skeleton point diagram is identified and processed by using the skeleton point diagram identification mode enabled in the display device to determine the target control instruction corresponding to the skeleton point diagram.
  • FIG10 is a schematic diagram of the structure of an interactive control device based on image recognition provided in an embodiment of the present application, which is applied to an electronic device, and the electronic device is communicatively connected with a display device; as shown in FIG10 , the device includes:
  • the shooting unit 61 is used to shoot the user's indicating action through the camera of the electronic device to obtain multiple frames of indicating images if it is determined that the electronic device is in the gesture image conversion mode.
  • the recognition unit 62 is used to perform recognition processing on multiple frames of indication images to obtain a skeleton point map corresponding to the indication image; wherein the skeleton point map includes indication skeleton information corresponding to the indication action in the indication image.
  • the sending unit 63 is used to send the skeleton point diagram to the display device, so that the display device recognizes the skeleton point diagram based on the preset correspondence between the skeleton point diagram and the control instruction, determines the target control instruction corresponding to the skeleton point diagram, and executes the target control instruction.
  • the device of this embodiment can execute the technical solution in the above method. Its specific implementation process and technical principles are the same and will not be repeated here.
  • FIG11 is a schematic diagram of the structure of another interactive control device based on image recognition provided in an embodiment of the present application.
  • the recognition unit 62 includes:
  • the recognition module 621 is used to perform recognition processing on multiple frames of indication images to obtain an initial skeleton point map corresponding to each frame of indication image.
  • the first determination module 622 is used to determine that the initial skeleton point map is the skeleton point map corresponding to the indication image if it is determined that the initial skeleton point map corresponding to each frame of the indication image is the same.
  • the second determination module 623 is used to determine that the initial skeleton point map of the first frame of indication image and the initial skeleton point map of the last frame of indication image are both skeleton point maps corresponding to the indication image if it is determined that the initial skeleton point map corresponding to each frame of indication image is different.
  • the target control instruction corresponding to the skeleton point diagram is obtained by identifying and processing the skeleton point diagram through a skeleton point diagram recognition mode enabled in a display device based on a preset correspondence between the skeleton point diagram and the control instruction.
  • the sending unit 63 is specifically configured to:
  • the skeleton point graphs are sent to the display device in sequence according to the generation time of the skeleton point graphs.
  • the indicating action includes a gesture action and/or a body movement.
  • connection method between the electronic device and the display device includes wired connection and wireless connection, wherein the wired connection is a line connection between the charging interface of the electronic device and the universal serial bus interface of the display device; the wireless connection includes Bluetooth communication technology, local area network protocol, near field communication, or wide area network server.
  • the device further includes:
  • the disconnection unit 71 is used to disconnect the communication connection between the electronic device and the display device in response to the selection operation for the gesture image conversion mode.
  • the device of this embodiment can execute the technical solution in the above method. Its specific implementation process and technical principles are the same and will not be repeated here.
  • FIG12 is a schematic diagram of the structure of another interactive control device based on image recognition provided in an embodiment of the present application, which is applied to a display device, and the display device is communicatively connected with an electronic device; as shown in FIG12 , the device includes:
  • the receiving unit 81 is used to receive a skeleton point diagram sent by an electronic device; wherein the skeleton point diagram is obtained by identifying and processing an indication image, and the indication image is obtained by photographing the user's indication action through the camera of the electronic device when the electronic device is in a gesture image conversion mode.
  • the determination unit 82 is used to perform recognition processing on the skeleton point diagram based on the preset correspondence relationship between the skeleton point diagram and the control instruction, and determine the target control instruction corresponding to the skeleton point diagram.
  • the execution unit 83 is used to execute the target control instruction.
  • the device of this embodiment can execute the technical solution in the above method. Its specific implementation process and technical principles are the same and will not be repeated here.
  • the determining unit 82 is specifically configured to:
  • the skeleton point diagram is identified and processed through the skeleton point diagram recognition mode turned on in the display device to determine the target control instruction corresponding to the skeleton point diagram.
  • the skeleton point map corresponding to the indication image is determined based on the initial skeleton point map of the first frame indication image and the initial skeleton point map of the last frame indication image, or the skeleton point map corresponding to the indication image is determined based on the initial skeleton point map, wherein the initial skeleton point map is obtained by performing recognition processing on each frame indication image.
  • the order in which the skeleton point graphs are sent is determined according to the generation time of the skeleton point graphs.
  • the indicating action includes a gesture action and/or a body movement.
  • connection method between the electronic device and the display device includes wired connection and wireless connection, wherein the wired connection is a connection between the charging interface of the electronic device and the universal serial bus interface of the display device; the wireless connection includes Bluetooth wireless technology, local area network protocol, near field communication, or wide area network server.
  • the communication connection between the electronic device and the display device is disconnected according to a selection operation for the gesture image conversion mode.
  • the device of this embodiment can execute the technical solution in the above method. Its specific implementation process and technical principles are the same and will not be repeated here.
  • FIG13 is a schematic diagram of the structure of an electronic device provided in an embodiment of the present application.
  • the electronic device may be a mobile phone, an external camera module, etc.
  • the electronic device includes: a memory 91 and a processor 92 .
  • the memory 91 stores a computer program that can be executed on the processor 92 .
  • the processor 92 is configured to execute the method provided in the above embodiment.
  • the electronic device further includes a receiver 93 and a transmitter 94.
  • the receiver 93 is used to receive instructions and data sent by an external device
  • the transmitter 94 is used to send instructions and data to the external device.
  • FIG. 14 is a schematic diagram of the structure of a display device provided in an embodiment of the present application. As shown in FIG. 14, the display device includes: a memory 101 and a processor 102.
  • the memory 101 stores a computer program that can be executed on the processor 102 .
  • the processor 102 is configured to execute the method provided in the above embodiment.
  • the display device further includes a receiver 103 and a transmitter 104.
  • the receiver 103 is used to receive instructions and data sent by an external device
  • the transmitter 104 is used to send instructions and data to the external device.
  • FIG. 15 is a block diagram of an electronic device provided in an embodiment of the present application.
  • the electronic device may be a mobile phone, a computer, a digital broadcast terminal, a message sending and receiving device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, etc.
  • Device 1100 may include one or more of the following components: a processing component 1102 , a memory 1104 , a power component 1106 , a multimedia component 1108 , an audio component 1110 , an input/output (I/O) interface 1112 , a sensor component 1114 , and a communication component 1116 .
  • the processing component 1102 generally controls the overall operation of the device 1100, such as operations associated with display, phone calls, data communications, camera operations, and recording operations.
  • the processing component 1102 may include one or more processors 1120 to execute instructions to perform all or part of the steps of the above-described method.
  • the processing component 1102 may include one or more modules to facilitate the interaction between the processing component 1102 and other components.
  • the processing component 1102 may include a multimedia module to facilitate the interaction between the multimedia component 1108 and the processing component 1102.
  • the memory 1104 is configured to store various types of data to support operations on the device 1100. Examples of such data include instructions for any application or method operating on the device 1100, contact data, phone book data, messages, pictures, videos, etc.
  • the memory 1104 can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk or optical disk.
  • the power supply component 1106 provides power to the various components of the device 1100.
  • the power supply component 1106 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power to the device 1100.
  • the multimedia component 1108 includes a screen that provides an output interface between the device 1100 and the user.
  • the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user.
  • the touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundaries of the touch or slide action, but also detect the duration and pressure associated with the touch or slide operation.
  • the multimedia component 1108 includes a front camera and/or a rear camera. When the device 1100 is in an operating mode, such as a shooting mode or a video mode, the front camera and/or the rear camera may receive external multimedia data. Each of the front camera and the rear camera may be a fixed optical lens system or have focusing and optical zoom capability.
  • the audio component 1110 is configured to output and/or input audio signals.
  • the audio component 1110 includes a microphone (MIC), and when the device 1100 is in an operating mode, such as a call mode, a recording mode, and a speech recognition mode, the microphone is configured to receive an external audio signal.
  • the received audio signal can be further stored in the memory 1104 or sent via the communication component 1116.
  • the audio component 1110 also includes a speaker for outputting audio signals.
  • the I/O interface 1112 provides an interface between the processing component 1102 and peripheral interface modules, which may be a keyboard, a click wheel, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
  • the sensor assembly 1114 includes one or more sensors for providing various aspects of the status assessment of the device 1100.
  • the sensor assembly 1114 can detect the open/closed state of the device 1100 and the relative positioning of components, such as the display and keypad of the device 1100; the sensor assembly 1114 can also detect a position change of the device 1100 or a component of the device 1100, the presence or absence of user contact with the device 1100, the orientation or acceleration/deceleration of the device 1100, and a temperature change of the device 1100.
  • the sensor assembly 1114 can include a proximity sensor configured to detect the presence of a nearby object without any physical contact.
  • the sensor assembly 1114 can also include an optical sensor, such as a CMOS or CCD image sensor, for use in imaging applications.
  • the sensor assembly 1114 can also include an accelerometer, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
  • the communication component 1116 is configured to facilitate wired or wireless communication between the device 1100 and other devices.
  • the device 1100 can access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof.
  • the communication component 1116 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel.
  • the communication component 1116 also includes a near field communication (NFC) module to facilitate short-range communication.
  • the NFC module can be implemented based on radio frequency identification (RFID) technology, infrared data association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology and other technologies.
  • the device 1100 can be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors or other electronic components to perform the above-mentioned methods.
  • a non-transitory computer-readable storage medium including instructions is also provided, such as a memory 1104 including instructions, and the instructions can be executed by the processor 1120 of the device 1100 to perform the above method.
  • the non-transitory computer-readable storage medium can be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, etc.
  • An embodiment of the present application also provides a non-transitory computer-readable storage medium.
  • when the instructions in the storage medium are executed by a processor of an electronic device, the electronic device can execute the method provided in the above embodiments.
  • An embodiment of the present application also provides a computer program product, which includes: a computer program, which is stored in a readable storage medium, and at least one processor of an electronic device can read the computer program from the readable storage medium, and at least one processor executes the computer program so that the electronic device executes the solution provided by any of the above embodiments.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Multimedia (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

An interaction control method and apparatus based on image recognition, and a device, which relate to Internet of Things technology. The interaction control method comprises: if it is determined that an electronic device is in a gesture image conversion mode, photographing an indication action of a user by means of a camera of the electronic device to obtain a plurality of frames of indication images (101); performing recognition processing on the plurality of frames of indication images to obtain a skeleton point image corresponding to the indication images, wherein the skeleton point image comprises indication skeleton information corresponding to the indication action in the indication images (102); and sending the skeleton point image to a display device, such that on the basis of a preset correspondence between skeleton point images and control instructions, the display device performs recognition processing on the skeleton point image, determines a target control instruction corresponding to the skeleton point image, and executes the target control instruction (103). The interaction control method gives a display device without a camera the function and capability of air gesture control operations, thereby solving the problem of the relatively high cost of human-computer interaction for a display device.

Description

Interaction control method, apparatus and device based on image recognition
Technical Field
The present application relates to Internet of Things technology, and in particular to an interaction control method, apparatus and device based on image recognition.
Background Art
Currently, in order to make it more convenient for users to use display devices, high-end display devices are equipped with cameras for human-computer interaction.
In the prior art, an artificial intelligence body recognition algorithm is deployed in the display device. When a user makes a defined body movement within the range of a camera set on the display device, the display device converts the defined body movement into a control instruction through the body recognition algorithm and executes the control instruction to complete human-computer interaction.
However, in the prior art, since the display device needs to be equipped with a camera hardware module and also needs to deploy and run a body recognition algorithm, the requirements on the display device's processor, memory and other performance are high, which increases the cost of the display device. As a result, the existing human-computer interaction can only be implemented in a small range of high-end display devices, and it is not widely popularized.
Summary of the Invention
The present application provides an interaction control method, apparatus and device based on image recognition, so as to solve the technical problem that the cost of human-computer interaction of a display device is relatively high.
In a first aspect, the present application provides an interaction control method based on image recognition, applied to an electronic device, where the electronic device is communicatively connected to a display device; the method includes:
if it is determined that the electronic device is in a gesture image conversion mode, photographing an indication action of a user through a camera of the electronic device to obtain multiple frames of indication images;
performing recognition processing on the multiple frames of indication images to obtain a skeleton point map corresponding to the indication images, where the skeleton point map includes indication skeleton information corresponding to the indication action in the indication images;
sending the skeleton point map to the display device, so that the display device performs recognition processing on the skeleton point map based on a preset correspondence between skeleton point maps and control instructions, determines a target control instruction corresponding to the skeleton point map, and executes the target control instruction.
Further, performing recognition processing on the multiple frames of indication images to obtain the skeleton point map corresponding to the indication images includes:
performing recognition processing on the multiple frames of indication images to obtain an initial skeleton point map corresponding to each frame of indication image;
if it is determined that the initial skeleton point maps corresponding to all frames of indication images are the same, determining the initial skeleton point map as the skeleton point map corresponding to the indication images;
if it is determined that the initial skeleton point maps corresponding to the frames of indication images are not all the same, determining both the initial skeleton point map of the first frame of indication image and the initial skeleton point map of the last frame of indication image as the skeleton point maps corresponding to the indication images.
Further, the target control instruction corresponding to the skeleton point map is obtained by performing recognition processing on the skeleton point map, based on the preset correspondence between skeleton point maps and control instructions, through a skeleton point map recognition mode that has been enabled in the display device.
Further, sending the skeleton point map to the display device includes:
if there are multiple skeleton point maps, sending the skeleton point maps to the display device in sequence according to the generation times of the skeleton point maps.
Further, the indication action includes a gesture action and/or a body movement.
Further, the connection between the electronic device and the display device includes a wired connection and a wireless connection, where the wired connection is a line connection between a charging interface of the electronic device and a universal serial bus interface of the display device, and the wireless connection includes Bluetooth communication technology, a local area network protocol, near field communication, or a wide area network server.
Further, the method also includes:
in response to a selection operation for the gesture image conversion mode, disconnecting the communication connection between the electronic device and the display device.
In a second aspect, the present application provides an interaction control method based on image recognition, applied to a display device, where the display device is communicatively connected to an electronic device; the method includes:
receiving a skeleton point map sent by the electronic device, where the skeleton point map is obtained by performing recognition processing on indication images, and the indication images are obtained by photographing an indication action of a user through a camera of the electronic device while the electronic device is in a gesture image conversion mode;
based on a preset correspondence between skeleton point maps and control instructions, performing recognition processing on the skeleton point map, determining a target control instruction corresponding to the skeleton point map, and executing the target control instruction.
Further, performing recognition processing on the skeleton point map includes:
based on the preset correspondence between skeleton point maps and control instructions, performing recognition processing on the skeleton point map through a skeleton point map recognition mode that has been enabled in the display device, to determine the target control instruction corresponding to the skeleton point map.
Further, the skeleton point map corresponding to the indication images is determined according to the initial skeleton point map of the first frame of indication image and the initial skeleton point map of the last frame of indication image, or the skeleton point map corresponding to the indication images is determined according to the initial skeleton point map, where the initial skeleton point map is obtained by performing recognition processing on each frame of indication image.
Further, the sending order of the skeleton point maps is determined according to the generation times of the skeleton point maps.
Further, the indication action includes a gesture action and/or a body movement.
Further, the connection between the electronic device and the display device includes a wired connection and a wireless connection, where the wired connection is established between a charging interface of the electronic device and a universal serial bus interface of the display device, and the wireless connection includes Bluetooth wireless technology, a local area network protocol, near field communication, or a wide area network server.
Further, the communication connection between the electronic device and the display device is disconnected according to a selection operation for the gesture image conversion mode.
In a third aspect, the present application provides an interaction control apparatus based on image recognition, applied to an electronic device, where the electronic device is communicatively connected to a display device; the apparatus includes:
a photographing unit, configured to photograph an indication action of a user through a camera of the electronic device to obtain multiple frames of indication images if it is determined that the electronic device is in a gesture image conversion mode;
a recognition unit, configured to perform recognition processing on the multiple frames of indication images to obtain a skeleton point map corresponding to the indication images, where the skeleton point map includes indication skeleton information corresponding to the indication action in the indication images;
a sending unit, configured to send the skeleton point map to the display device, so that the display device performs recognition processing on the skeleton point map based on a preset correspondence between skeleton point maps and control instructions, determines a target control instruction corresponding to the skeleton point map, and executes the target control instruction.
Further, the recognition unit includes:
a recognition module, configured to perform recognition processing on the multiple frames of indication images to obtain an initial skeleton point map corresponding to each frame of indication image;
a first determination module, configured to determine the initial skeleton point map as the skeleton point map corresponding to the indication images if it is determined that the initial skeleton point maps corresponding to all frames of indication images are the same;
a second determination module, configured to determine both the initial skeleton point map of the first frame of indication image and the initial skeleton point map of the last frame of indication image as the skeleton point maps corresponding to the indication images if it is determined that the initial skeleton point maps corresponding to the frames of indication images are not all the same.
Further, the target control instruction corresponding to the skeleton point map is obtained by performing recognition processing on the skeleton point map, based on the preset correspondence between skeleton point maps and control instructions, through a skeleton point map recognition mode that has been enabled in the display device.
Further, the sending unit is specifically configured to:
if there are multiple skeleton point maps, send the skeleton point maps to the display device in sequence according to the generation times of the skeleton point maps.
Further, the indication action includes a gesture action and/or a body movement.
Further, the connection between the electronic device and the display device includes a wired connection and a wireless connection, where the wired connection is a line connection between a charging interface of the electronic device and a universal serial bus interface of the display device, and the wireless connection includes Bluetooth communication technology, a local area network protocol, near field communication, or a wide area network server.
Further, the apparatus also includes:
a disconnection unit, configured to disconnect the communication connection between the electronic device and the display device in response to a selection operation for the gesture image conversion mode.
In a fourth aspect, the present application provides an interaction control apparatus based on image recognition, applied to a display device, where the display device is communicatively connected to an electronic device; the apparatus includes:
a receiving unit, configured to receive a skeleton point map sent by the electronic device, where the skeleton point map is obtained by performing recognition processing on indication images, and the indication images are obtained by photographing an indication action of a user through a camera of the electronic device while the electronic device is in a gesture image conversion mode;
a determination unit, configured to perform recognition processing on the skeleton point map based on a preset correspondence between skeleton point maps and control instructions, and determine a target control instruction corresponding to the skeleton point map;
an execution unit, configured to execute the target control instruction.
Further, the determination unit is specifically configured to:
based on the preset correspondence between skeleton point maps and control instructions, perform recognition processing on the skeleton point map through a skeleton point map recognition mode that has been enabled in the display device, to determine the target control instruction corresponding to the skeleton point map.
Further, the skeleton point map corresponding to the indication images is determined according to the initial skeleton point map of the first frame of indication image and the initial skeleton point map of the last frame of indication image, or the skeleton point map corresponding to the indication images is determined according to the initial skeleton point map, where the initial skeleton point map is obtained by performing recognition processing on each frame of indication image.
Further, the sending order of the skeleton point maps is determined according to the generation times of the skeleton point maps.
Further, the indication action includes a gesture action and/or a body movement.
Further, the connection between the electronic device and the display device includes a wired connection and a wireless connection, where the wired connection is established between a charging interface of the electronic device and a universal serial bus interface of the display device, and the wireless connection includes Bluetooth wireless technology, a local area network protocol, near field communication, or a wide area network server.
Further, the communication connection between the electronic device and the display device is disconnected according to a selection operation for the gesture image conversion mode.
In a fifth aspect, the present application provides an electronic device, including a memory and a processor, where the memory stores a computer program that can run on the processor, and the processor implements the method described in the first aspect when executing the computer program.
In a sixth aspect, the present application provides a display device, including a memory and a processor, where the memory stores a computer program that can run on the processor, and the processor implements the method described in the second aspect when executing the computer program.
In a seventh aspect, the present application provides a computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the method described in the first aspect or the method described in the second aspect.
In an eighth aspect, the present application provides a computer program product, including a computer program which, when executed by a processor, implements the method described in the first aspect or the method described in the second aspect.
According to the interaction control method, apparatus and device based on image recognition provided by the present application, if it is determined that the electronic device is in a gesture image conversion mode, an indication action of the user is photographed through the camera of the electronic device to obtain multiple frames of indication images. Recognition processing is performed on the multiple frames of indication images to obtain a skeleton point map corresponding to the indication images, where the skeleton point map includes indication skeleton information corresponding to the indication action in the indication images. The skeleton point map is sent to the display device, so that the display device performs recognition processing on the skeleton point map based on a preset correspondence between skeleton point maps and control instructions, determines a target control instruction corresponding to the skeleton point map, and executes the target control instruction. In this solution, when the electronic device is in the gesture image conversion mode, its camera can be turned on to work; if the user makes an indication action within the range of the camera, the camera automatically photographs the indication action to obtain multiple frames of indication images. The electronic device performs recognition processing on the multiple frames of indication images to obtain the skeleton point map corresponding to the indication images. Finally, the skeleton point map is sent to the display device; when the display device receives the skeleton point map sent by the electronic device, the display device performs recognition processing on it based on the preset correspondence, determines the target control instruction corresponding to the skeleton point map, and executes the target control instruction. At this time, the display device can perform the operation corresponding to the target control instruction, completing the interaction between the user, the electronic device and the display device.
Therefore, the interaction process of obtaining the skeleton point map through the camera of the electronic device and sending the skeleton point map to the display device makes full use of the shooting capability of the camera of the electronic device and the computing capability of its processor to convert the image of the indication action into a skeleton point map that the display device can recognize. This saves the camera cost and the processor and memory performance cost of the display device, greatly saves resources, and greatly reduces the cost of the display device, so that display devices without their own cameras also have the function and capability of air gesture control, solving the technical problem that the cost of human-computer interaction of a display device is relatively high.
BRIEF DESCRIPTION OF THE DRAWINGS
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the present disclosure.
FIG. 1 is a schematic flowchart of an interaction control method based on image recognition provided by an embodiment of the present application;
FIG. 2 is a schematic diagram of a scenario of an indication action provided by an embodiment of the present application;
FIG. 3 is a schematic diagram of a scenario of a gesture action provided by an embodiment of the present application;
FIG. 4 is a schematic diagram of a scenario of a skeleton point map provided by an embodiment of the present application;
FIG. 5 is a schematic flowchart of another interaction control method based on image recognition provided by an embodiment of the present application;
FIG. 6 is a schematic flowchart of yet another interaction control method based on image recognition provided by an embodiment of the present application;
FIG. 7 is a schematic flowchart of yet another interaction control method based on image recognition provided by an embodiment of the present application;
FIG. 8 is a schematic flowchart of still another interaction control method based on image recognition provided by an embodiment of the present application;
FIG. 9 is a schematic flowchart of a further interaction control method based on image recognition provided by an embodiment of the present application;
FIG. 10 is a schematic structural diagram of an interaction control apparatus based on image recognition provided by an embodiment of the present application;
FIG. 11 is a schematic structural diagram of another interaction control apparatus based on image recognition provided by an embodiment of the present application;
FIG. 12 is a schematic structural diagram of yet another interaction control apparatus based on image recognition provided by an embodiment of the present application;
FIG. 13 is a schematic structural diagram of an electronic device provided by an embodiment of the present application;
FIG. 14 is a schematic structural diagram of a display device provided by an embodiment of the present application;
FIG. 15 is a block diagram of an electronic device provided by an embodiment of the present application.
Specific embodiments of the present disclosure have been shown by the above drawings and will be described in more detail below. These drawings and written descriptions are not intended to limit the scope of the concept of the present disclosure in any way, but to illustrate the concept of the present disclosure to those skilled in the art by referring to specific embodiments.
Detailed Description of Embodiments
Exemplary embodiments will be described in detail here, examples of which are shown in the accompanying drawings. When the following description refers to the drawings, unless otherwise indicated, the same numbers in different drawings represent the same or similar elements. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present disclosure.
At present, in order to make it more convenient for users to use display devices, high-end display devices are equipped with cameras for human-computer interaction. However, there are many difficulties in the human-computer interaction process.
In one example, an artificial intelligence body recognition algorithm is deployed in the display device. When the user makes a defined body movement within the camera range set on the display device, the display device converts the defined body movement into a control instruction through the body recognition algorithm and executes the control instruction to complete the human-computer interaction. However, in the prior art, since the display device needs to be equipped with a camera hardware module and also needs to deploy and run a body recognition algorithm, the requirements on the processor, memory and other performance of the display device are high, which increases the cost of the display device. As a result, the existing human-computer interaction can only be implemented in a small range of high-end display devices, and it is not widely popularized.
In one example, to achieve a good recognition rate and control accuracy, there are also high requirements on the image quality, resolution, frame rate and so on of the pictures captured by the camera, so the cost of the camera hardware module will also be high.
In one example, if a display device has its own camera module, especially a display device located in a home environment, users will have concerns about privacy and security.
The interaction control method, apparatus and device based on image recognition provided by the present application are intended to solve the above technical problems in the prior art.
The technical solution of the present application and how it solves the above technical problems are described in detail below with specific embodiments. The following specific embodiments can be combined with each other, and the same or similar concepts or processes may not be repeated in some embodiments. The embodiments of the present application will be described below with reference to the accompanying drawings.
FIG. 1 is a schematic flowchart of an interaction control method based on image recognition provided by an embodiment of the present application, applied to an electronic device, where the electronic device is connected to a display device; as shown in FIG. 1, the method includes:
101. If it is determined that the electronic device is in a gesture image conversion mode, photograph an indication action of the user through the camera of the electronic device to obtain multiple frames of indication images.
Exemplarily, the execution subject of this embodiment may be an electronic device, a terminal device, an interaction control apparatus or device based on image recognition, or another apparatus or device that can execute this embodiment, which is not limited here. In this embodiment, the execution subject is described as an electronic device.
First, the gesture image conversion mode is used to perform gesture image conversion on the indication images, and the electronic device is communicatively connected to the display device, where the communication connection may be a wired connection or a wireless connection. The wired connection may be made between the charging interface of the electronic device and a Universal Serial Bus (USB) interface of the display device; the wireless connection includes, but is not limited to, a connection established through a local area network protocol, Bluetooth communication technology, Near Field Communication (NFC), or a wide area network server. When the user turns on the gesture image conversion mode, the electronic device is in the gesture image conversion mode, the gesture image conversion mode requests image data from the camera of the electronic device, and the display device also needs to be in a skeleton point map recognition mode. The turning on and off of the gesture image conversion mode of the electronic device is controlled by the user's selection operation on the corresponding application (APP), and the skeleton point map recognition mode of the display device may keep running automatically in the background after the display device is started; the gesture image conversion mode and the skeleton point map recognition mode may be pre-deployed at the factory or deployed later, neither of which is limited. An indication action is a meaningful action made by the user according to a predefined action instruction library. For example, indication actions include gesture actions and body movements. FIG. 2 is a schematic diagram of a scenario of an indication action provided by an embodiment of the present application. As shown in FIG. 2, gesture actions include a wake-up gesture, a confirm key, a direction-up key and so on, and body movements include shaking the head, raising both hands high, raising both hands horizontally and so on. Indication actions include static actions and dynamic actions. Static actions are the stable, sustained gesture actions or body movements shown in FIG. 2; the feature information of such actions is generally obtained from a single frame of data, or determined from any one frame when all frames of data are the same. Dynamic actions are movements such as a hand moving from point A to point B; the feature information of such actions is usually obtained from multiple frames of data.
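The mode flags and the predefined action instruction library described above can be pictured with a small data structure. The following Python sketch is illustrative only: the action names, the helper names and the descriptions are assumptions made for this example and are not defined by the application.

```python
from dataclasses import dataclass
from enum import Enum

class ActionKind(Enum):
    STATIC = "static"    # stable gesture or posture, recoverable from a single frame
    DYNAMIC = "dynamic"  # a movement, needs at least the first and last frames

@dataclass(frozen=True)
class DefinedAction:
    name: str            # e.g. "wake_up", "confirm", "direction_up" (hypothetical names)
    kind: ActionKind
    description: str

# Hypothetical predefined action instruction library.
ACTION_LIBRARY = {
    "wake_up":      DefinedAction("wake_up", ActionKind.STATIC, "open palm with five fingers extended"),
    "confirm":      DefinedAction("confirm", ActionKind.STATIC, "confirm-key gesture"),
    "direction_up": DefinedAction("direction_up", ActionKind.STATIC, "direction-up-key gesture"),
    "swipe":        DefinedAction("swipe", ActionKind.DYNAMIC, "hand moves from point A to point B"),
}

def can_start_capture(gesture_mode_on: bool, recognition_mode_on: bool, connected: bool) -> bool:
    """The phone camera only starts shooting when both modes are active and a link exists."""
    return gesture_mode_on and recognition_mode_on and connected
```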
In this step, if it is determined that the electronic device is in the gesture image conversion mode, then when the user makes an indication action within the shooting range of the camera of the electronic device, the indication action of the user is photographed through the camera of the electronic device to obtain multiple frames of indication images.
102. Perform recognition processing on the multiple frames of indication images to obtain a skeleton point map corresponding to the indication images, where the skeleton point map includes indication skeleton information corresponding to the indication action in the indication images.
Exemplarily, FIG. 3 is a schematic diagram of a scenario of a gesture action provided by an embodiment of the present application. As shown in FIG. 3, FIG. 3 shows an indication image containing a palm with five fingers extended. FIG. 4 is a schematic diagram of a scenario of a skeleton point map provided by an embodiment of the present application. As shown in FIG. 4, FIG. 4 shows the skeleton point map corresponding to the palm with five fingers extended in FIG. 3. If the indication action is a gesture action, the electronic device, in the gesture image conversion mode, performs recognition processing on the hand making the gesture action in each of the multiple frames of indication images and converts it into a skeleton point map containing the spatial distribution of the finger bones of the hand; the skeleton point map includes the indication skeleton information corresponding to the indication action in the indication images. Alternatively, if the indication action is a body movement, the electronic device performs recognition processing on the limb making the body movement in each of the multiple frames of indication images and converts it into a skeleton point map containing the spatial distribution of the bones of the limb; the skeleton point map includes the indication skeleton information corresponding to the indication action in the indication images.
示例性地,通过电子设备与显示设备之间建立的稳定的通信连接,电子设备将实时转换好的骨骼点图实时发送至显示设备,当显示设备接收到骨骼点图时,基于预设的骨骼点图与控制指令之间的对应关系,显示设备通过已开启的骨骼点图识别模式对骨骼点图进行识别处理,通过预设的骨骼点图与控制指令之间的对应关系,确定骨骼点图对应的目标控制指令,显示设备将目标控制指令实时发送给显示设备的控制***,显示设备通过控制***执行目标控制指令。Exemplarily, through a stable communication connection established between the electronic device and the display device, the electronic device sends the real-time converted skeleton point diagram to the display device in real time. When the display device receives the skeleton point diagram, based on the correspondence between the preset skeleton point diagram and the control instruction, the display device identifies and processes the skeleton point diagram through the enabled skeleton point diagram recognition mode, determines the target control instruction corresponding to the skeleton point diagram through the correspondence between the preset skeleton point diagram and the control instruction, and the display device sends the target control instruction to the control system of the display device in real time, and the display device executes the target control instruction through the control system.
本申请实施例中,若确定电子设备处于手势图像转换模式下,则通过电子设备的摄像头对用户的指示动作进行拍摄,得到多帧指示图像。对多帧指示图像进行识别处理,得到与指示图像对应的骨骼点图;其中,骨骼点图包括与指示图像中的指示动作对应的指示骨骼信息。将骨骼点图发送至显示设备,以使显示设备基于预设的骨骼点图与控制指令之间的对应关系,对骨骼点图进行识别处理,确定骨骼点图对应的目标控制指令,并执行目标控制指令。本方案中,在电子设备处于手势图像转换模式下,可以开启摄像头进行工作,如果用户在摄像头的范围内作出指示动作,摄像头会自动对指示动作进行拍摄,得到多帧指示图像。电子设备对多帧指示图像进行识别处理,得到与指示图像对应的骨骼点图,其中,骨骼点图包括与指示图像中的指示动作对应的指示骨骼信息。最后,将骨骼点图发送至显示设备,显示设备接收到电子设备发送的骨骼点图时,显示设备基于预设的骨骼点图与控制指令之间的对应关系,对骨骼点图进行识别处理,确定骨骼点图对应的目标控制指令,并执行目标控制指令,此时显示设备可以作出与目标控制指令对应的操作,完成人、电子设备和显示设备之间的交互。所以,通过电子设备的摄像头得到骨骼点图,并将骨骼点图发送给显示设备的交互过程,充分的利用了电子设备的摄像头的拍摄能力和处理器运算能力的优势,完成指示动作的图像转化,转化为显示设备可以识别的骨骼点图,节省了显示设备的摄像头成本和处理器、内存性能成本,极大的节省了资源,极大的降低了显示设备的成本,让自身不带摄像头的显示设备均具有了隔空手势控制操作的功能和能力,解决了显示设备的人机交互的成本较高的技术问题。In an embodiment of the present application, if it is determined that the electronic device is in a gesture image conversion mode, the user's indicating action is photographed through the camera of the electronic device to obtain multiple frames of indicating images. The multiple frames of indicating images are identified and processed to obtain a skeleton point diagram corresponding to the indicating image; wherein the skeleton point diagram includes indicating skeleton information corresponding to the indicating action in the indicating image. The skeleton point diagram is sent to a display device so that the display device identifies and processes the skeleton point diagram based on the correspondence between the preset skeleton point diagram and the control instruction, determines the target control instruction corresponding to the skeleton point diagram, and executes the target control instruction. In this scheme, when the electronic device is in a gesture image conversion mode, the camera can be turned on to work. If the user makes an indicating action within the range of the camera, the camera will automatically photograph the indicating action to obtain multiple frames of indicating images. The electronic device identifies and processes the multiple frames of indicating images to obtain a skeleton point diagram corresponding to the indicating image, wherein the skeleton point diagram includes indicating skeleton information corresponding to the indicating action in the indicating image. Finally, the skeleton point diagram is sent to the display device. When the display device receives the skeleton point diagram sent by the electronic device, the display device identifies and processes the skeleton point diagram based on the correspondence between the preset skeleton point diagram and the control instruction, determines the target control instruction corresponding to the skeleton point diagram, and executes the target control instruction. At this time, the display device can perform operations corresponding to the target control instruction to complete the interaction between the person, the electronic device and the display device. 
Therefore, the interactive process of obtaining the skeleton point diagram through the camera of the electronic device and sending the skeleton point diagram to the display device fully utilizes the advantages of the camera shooting ability and processor computing ability of the electronic device to complete the image conversion of the indicated action and convert it into a skeleton point diagram that can be recognized by the display device, saving the camera cost of the display device and the processor and memory performance cost, greatly saving resources, and greatly reducing the cost of the display device, so that the display device without a camera itself has the function and ability of air gesture control operation, solving the technical problem of the high cost of human-computer interaction of the display device.
FIG. 5 is a schematic flowchart of another interaction control method based on image recognition provided by an embodiment of the present application, applied to an electronic device, where the electronic device is communicatively connected to a display device; as shown in FIG. 5, the method includes:
201. If it is determined that the electronic device is in the gesture image conversion mode, photograph an indication action of the user through the camera of the electronic device to obtain multiple frames of indication images.
In one example, the indication action includes a gesture action and/or a body movement.
In one example, the connection between the electronic device and the display device includes a wired connection and a wireless connection, where the wired connection is a line connection between the charging interface of the electronic device and the universal serial bus interface of the display device, and the wireless connection includes Bluetooth communication technology, a local area network protocol, near field communication, or a wide area network server.
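The wired and wireless options listed above can all be hidden behind one small transport interface on the electronic device side. The sketch below is an assumption made for illustration only; the class and method names, as well as the example address, are not part of the application, and real USB, Bluetooth, LAN, NFC or wide-area-network stacks would sit behind these stubs.

```python
import json
from abc import ABC, abstractmethod

class SkeletonMapTransport(ABC):
    """Common interface over the wired or wireless link to the display device."""
    @abstractmethod
    def send(self, skeleton_map: list) -> None: ...

class UsbTransport(SkeletonMapTransport):
    def send(self, skeleton_map: list) -> None:
        # A real implementation would write to the USB link made through the charging interface.
        print("USB ->", json.dumps(skeleton_map))

class LanTransport(SkeletonMapTransport):
    def __init__(self, host: str, port: int):
        self.host, self.port = host, port
    def send(self, skeleton_map: list) -> None:
        # A real implementation would open a socket to the display device on the local network.
        print(f"LAN {self.host}:{self.port} ->", json.dumps(skeleton_map))

def choose_transport(wired_available: bool) -> SkeletonMapTransport:
    # Prefer the wired link when the charging cable is plugged into the display's USB port.
    return UsbTransport() if wired_available else LanTransport("192.168.1.20", 9000)
```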
Exemplarily, for this step, reference may be made to step 101 in FIG. 1, which will not be repeated here.
202. Perform recognition processing on the multiple frames of indication images to obtain an initial skeleton point map corresponding to each frame of indication image.
Exemplarily, the electronic device performs recognition processing on the multiple frames of indication images to obtain an initial skeleton point map corresponding to each frame of indication image.
203. If it is determined that the initial skeleton point maps corresponding to all frames of indication images are the same, determine the initial skeleton point map as the skeleton point map corresponding to the indication images.
Exemplarily, the electronic device compares the initial skeleton point maps corresponding to the frames of indication images. If it is determined that the initial skeleton point maps corresponding to all frames of indication images are the same, the indication action is a static action, and the initial skeleton point map is determined as the skeleton point map corresponding to the indication images.
204. If it is determined that the initial skeleton point maps corresponding to the frames of indication images are not all the same, determine both the initial skeleton point map of the first frame of indication image and the initial skeleton point map of the last frame of indication image as the skeleton point maps corresponding to the indication images.
Exemplarily, the electronic device compares the initial skeleton point maps corresponding to the frames of indication images. If it is determined that the initial skeleton point maps corresponding to the frames of indication images are not all the same, the indication action is a dynamic action, and both the initial skeleton point map of the first frame of indication image and the initial skeleton point map of the last frame of indication image are determined as the skeleton point maps corresponding to the indication images.
For example, if the indication action is the hand moving from point A to point B, then the initial skeleton point map of the first frame of indication image, which contains the starting position A of the hand, and the initial skeleton point map of the last frame of indication image, which contains the ending position B of the hand, are both determined as the skeleton point maps corresponding to the indication images.
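Steps 203 and 204 amount to a simple selection rule over the per-frame maps. The sketch below uses exact equality as a minimal illustration; in practice one would compare the keypoints with a small tolerance, and the function name is an assumption made for this example.

```python
from typing import List, Tuple

SkeletonPointMap = List[Tuple[float, float]]

def select_skeleton_maps(per_frame_maps: List[SkeletonPointMap]) -> List[SkeletonPointMap]:
    """Keep one map for a static action, the first and last maps for a dynamic one."""
    if not per_frame_maps:
        return []
    if all(m == per_frame_maps[0] for m in per_frame_maps):
        return [per_frame_maps[0]]                     # static action, e.g. a held open palm
    return [per_frame_maps[0], per_frame_maps[-1]]     # dynamic action, e.g. hand moving from A to B
```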
205. If there are multiple skeleton point maps, send the skeleton point maps to the display device in sequence according to their generation times, so that the display device performs recognition processing on the skeleton point maps based on the preset correspondence between skeleton point maps and control instructions, determines the target control instructions corresponding to the skeleton point maps, and executes the target control instructions.
In one example, the target control instruction corresponding to the skeleton point map is obtained by performing recognition processing on the skeleton point map, based on the preset correspondence between skeleton point maps and control instructions, through the skeleton point map recognition mode that has been enabled in the display device.
Exemplarily, if there are multiple skeleton point maps, they are sent to the display device in real time in chronological order according to their generation times, that is, the skeleton point map generated first is sent first. When the display device receives a skeleton point map, based on the preset correspondence between skeleton point maps and control instructions, the display device performs recognition processing on the skeleton point map through the skeleton point map recognition mode running in the background, determines the target control instruction corresponding to the skeleton point map according to the preset correspondence, and executes the target control instruction.
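A minimal way to realize this "first generated, first sent" rule is to stamp each skeleton point map with its generation time and sort on that stamp before sending. In the sketch below the dataclass, the `send` callable and the sample values are all assumptions made for illustration.

```python
import time
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass(order=True)
class TimedSkeletonMap:
    generated_at: float                                  # generation time drives the sending order
    points: List[Tuple[float, float]] = field(compare=False)

def send_in_generation_order(maps: List[TimedSkeletonMap], send) -> None:
    """Step 205 sketch: earlier-generated skeleton point maps are sent to the display device first."""
    for item in sorted(maps):
        send(item.points)

# Usage sketch with a stand-in transport function.
pending = [TimedSkeletonMap(time.time() + 0.2, [(0.5, 0.5)]),
           TimedSkeletonMap(time.time(), [(0.1, 0.2)])]
send_in_generation_order(pending, send=lambda pts: print("sent", pts))
```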
206. In response to a selection operation for the gesture image conversion mode, disconnect the communication connection between the electronic device and the display device.
Exemplarily, when the user performs a selection operation for the gesture image conversion mode, the communication connection between the electronic device and the display device is disconnected, where the selection operation may be a single click, a double click, or the like, which is not limited here.
In the embodiment of the present application, if it is determined that the electronic device is in the gesture image conversion mode, an indication action of the user is photographed through the camera of the electronic device to obtain multiple frames of indication images. Recognition processing is performed on the multiple frames of indication images to obtain an initial skeleton point map corresponding to each frame of indication image. If it is determined that the initial skeleton point maps corresponding to all frames of indication images are the same, the initial skeleton point map is determined as the skeleton point map corresponding to the indication images. If it is determined that the initial skeleton point maps corresponding to the frames of indication images are not all the same, both the initial skeleton point map of the first frame of indication image and the initial skeleton point map of the last frame of indication image are determined as the skeleton point maps corresponding to the indication images. If there are multiple skeleton point maps, they are sent to the display device in sequence according to their generation times, so that the display device performs recognition processing on the skeleton point maps based on the preset correspondence between skeleton point maps and control instructions, determines the target control instructions corresponding to the skeleton point maps, and executes the target control instructions. In response to a selection operation for the gesture image conversion mode, the communication connection between the electronic device and the display device is disconnected. Therefore, the interaction process of obtaining the skeleton point map through the camera of the electronic device and sending it to the display device makes full use of the shooting capability of the camera of the electronic device and the computing capability of its processor to convert the image of the indication action into a skeleton point map that the display device can recognize, which saves the camera cost and the processor and memory performance cost of the display device, greatly saves resources, greatly reduces the cost of the display device, gives display devices without their own cameras the function and capability of air gesture control, and solves the technical problem that the cost of human-computer interaction of a display device is relatively high.
In addition, when the electronic device cooperates with the display device to perform air gesture control, the placement position of the electronic device can be moved and adjusted at any time according to the user's position, and can also be closer to the user, so that the body movements and gestures can be captured more clearly and the recognition accuracy is higher. Since the electronic device sends the skeleton point map to the display device after recognition, rather than the user image screen, and the connection between the electronic device and the display device can be disconnected after use, privacy and security concerns are effectively avoided.
Exemplarily, FIG. 6 is a schematic flowchart of yet another interaction control method based on image recognition provided by an embodiment of the present application, and FIG. 7 is a schematic flowchart of yet another interaction control method based on image recognition provided by an embodiment of the present application, where the control instruction database includes the preset correspondence between skeleton point maps and control instructions as well as multiple control instructions.
FIG. 8 is a schematic flowchart of still another interaction control method based on image recognition provided by an embodiment of the present application, applied to a display device, where the display device is communicatively connected to an electronic device; as shown in FIG. 8, the method includes:
401. Receive a skeleton point map sent by the electronic device, where the skeleton point map is obtained by performing recognition processing on indication images, and the indication images are obtained by photographing an indication action of the user through the camera of the electronic device while the electronic device is in the gesture image conversion mode.
402. Based on the preset correspondence between skeleton point maps and control instructions, perform recognition processing on the skeleton point map, determine the target control instruction corresponding to the skeleton point map, and execute the target control instruction.
图9为本申请实施例提供的其他一种基于图像识别的交互控制方法的流程示意图,应用于显示设备,显示设备与电子设备通信连接;如图9所示,该方法包括:FIG9 is a flow chart of another interactive control method based on image recognition provided in an embodiment of the present application, which is applied to a display device, and the display device is communicatively connected with an electronic device; as shown in FIG9 , the method includes:
501、接收电子设备发送的骨骼点图;其中,骨骼点图是对指示图像进行识别处理得到的,指示图像是在电子设备处于手势图像转换模式下,通过电子设备的摄像头对用户的指示动作进行拍摄得到的。501. Receive a skeleton point diagram sent by an electronic device; wherein the skeleton point diagram is obtained by identifying and processing an indication image, and the indication image is obtained by photographing the user's indication action through a camera of the electronic device when the electronic device is in a gesture image conversion mode.
一个示例中,与指示图像对应的骨骼点图是根据第一帧指示图像的初始的骨骼点图与最后一帧指示图像的初始的骨骼点图确定的,或者,与指示图像对应的骨骼点图是根据初始的骨骼点图确定的,其中,初始的骨骼点图是对每一帧指示图像进行识别处理得到的。In one example, the skeleton point map corresponding to the indication image is determined based on the initial skeleton point map of the first frame indication image and the initial skeleton point map of the last frame indication image, or the skeleton point map corresponding to the indication image is determined based on the initial skeleton point map, wherein the initial skeleton point map is obtained by performing recognition processing on each frame indication image.
一个示例中,骨骼点图的发送顺序是根据骨骼点图的生成时间确定的。 In one example, the order in which the skeleton point graphs are sent is determined according to the generation time of the skeleton point graphs.
一个示例中,指示动作包括手势动作和/或肢体动作。In one example, the indicating action includes a gesture action and/or a body movement.
一个示例中，电子设备与显示设备之间的连接方式包括有线连接和无线连接，其中，有线连接是电子设备的充电接口和显示设备的通用串行总线接口确定的；无线连接包括蓝牙无线技术、局域网协议、近场通信、或广域网服务器。In one example, the connection between the electronic device and the display device includes a wired connection and a wireless connection, wherein the wired connection is established between the charging interface of the electronic device and the universal serial bus interface of the display device; the wireless connection includes Bluetooth wireless technology, a local area network protocol, near field communication, or a wide area network server.
一个示例中,电子设备与显示设备之间的通信连接,是根据针对手势图像转换模式的选择操作断开的。In one example, the communication connection between the electronic device and the display device is disconnected according to a selection operation for the gesture image conversion mode.
502、基于预设的骨骼点图与控制指令之间的对应关系,通过显示设备中已开启的骨骼点图识别模式对骨骼点图进行识别处理,确定骨骼点图对应的目标控制指令。502. Based on the correspondence between the preset skeleton point diagram and the control instruction, the skeleton point diagram is identified and processed by using the skeleton point diagram identification mode enabled in the display device to determine the target control instruction corresponding to the skeleton point diagram.
503、执行目标控制指令。503. Execute the target control instruction.
图10为本申请实施例提供的一种基于图像识别的交互控制装置的结构示意图,应用于电子设备,电子设备与显示设备通信连接;如图10所示,该装置包括:FIG10 is a schematic diagram of the structure of an interactive control device based on image recognition provided in an embodiment of the present application, which is applied to an electronic device, and the electronic device is communicatively connected with a display device; as shown in FIG10 , the device includes:
拍摄单元61,用于若确定电子设备处于手势图像转换模式下,则通过电子设备的摄像头对用户的指示动作进行拍摄,得到多帧指示图像。The shooting unit 61 is used to shoot the user's indicating action through the camera of the electronic device to obtain multiple frames of indicating images if it is determined that the electronic device is in the gesture image conversion mode.
识别单元62,用于对多帧指示图像进行识别处理,得到与指示图像对应的骨骼点图;其中,骨骼点图包括与指示图像中的指示动作对应的指示骨骼信息。The recognition unit 62 is used to perform recognition processing on multiple frames of indication images to obtain a skeleton point map corresponding to the indication image; wherein the skeleton point map includes indication skeleton information corresponding to the indication action in the indication image.
发送单元63,用于将骨骼点图发送至显示设备,以使显示设备基于预设的骨骼点图与控制指令之间的对应关系,对骨骼点图进行识别处理,确定骨骼点图对应的目标控制指令,并执行目标控制指令。The sending unit 63 is used to send the skeleton point diagram to the display device, so that the display device recognizes the skeleton point diagram based on the preset correspondence between the skeleton point diagram and the control instruction, determines the target control instruction corresponding to the skeleton point diagram, and executes the target control instruction.
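Purely as a structural sketch, the apparatus of FIG. 10 can be pictured with the three units as injected callables; the camera, recognizer and transport stubs used below are assumptions made so the example runs, and they do not correspond to any concrete implementation in the present application.

```python
# Sketch of the electronic-device-side apparatus: shooting unit 61,
# recognition unit 62 and sending unit 63 wired together as callables.

class GestureConversionApparatus:
    def __init__(self, capture_frames, recognize, send_to_display):
        self.capture_frames = capture_frames      # shooting unit 61
        self.recognize = recognize                # recognition unit 62
        self.send_to_display = send_to_display    # sending unit 63

    def run_once(self, in_conversion_mode):
        if not in_conversion_mode:                # only act in gesture image conversion mode
            return
        frames = self.capture_frames()            # multi-frame indication images
        skeleton_diagrams = self.recognize(frames)
        for diagram in skeleton_diagrams:
            self.send_to_display(diagram)         # the display device maps it to an instruction

# Usage with trivial stubs:
apparatus = GestureConversionApparatus(
    capture_frames=lambda: ["frame0", "frame1"],
    recognize=lambda frames: [{"label": "swipe_left"}],
    send_to_display=print,
)
apparatus.run_once(in_conversion_mode=True)
```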
本实施例的装置,可以执行上述方法中的技术方案,其具体实现过程和技术原理相同,此处不再赘述。The device of this embodiment can execute the technical solution in the above method. Its specific implementation process and technical principles are the same and will not be repeated here.
图11为本申请实施例提供的另一种基于图像识别的交互控制装置的结构示意图,在图10所示实施例的基础上,如图11所示,识别单元62,包括:FIG11 is a schematic diagram of the structure of another interactive control device based on image recognition provided in an embodiment of the present application. Based on the embodiment shown in FIG10 , as shown in FIG11 , the recognition unit 62 includes:
识别模块621,用于对多帧指示图像进行识别处理,得到与每一帧指示图像对应的初始的骨骼点图。The recognition module 621 is used to perform recognition processing on multiple frames of indication images to obtain an initial skeleton point map corresponding to each frame of indication image.
第一确定模块622,用于若确定每一帧指示图像对应的初始的骨骼点图均相同,则确定初始的骨骼点图为与指示图像对应的骨骼点图。The first determination module 622 is used to determine that the initial skeleton point map is the skeleton point map corresponding to the indication image if it is determined that the initial skeleton point map corresponding to each frame of the indication image is the same.
第二确定模块623,用于若确定每一帧指示图像对应的初始的骨骼点图均不相同,则确定第一帧指示图像的初始的骨骼点图与最后一帧指示图像的初始的骨骼点图均为与指示图像对应的骨骼点图。The second determination module 623 is used to determine that the initial skeleton point map of the first frame of indication image and the initial skeleton point map of the last frame of indication image are both skeleton point maps corresponding to the indication image if it is determined that the initial skeleton point map corresponding to each frame of indication image is different.
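A minimal sketch of the rule implemented by modules 621 to 623 follows; dictionary equality stands in for whatever similarity test a real recognizer would apply, and the dictionary representation of a skeleton point diagram is an assumption of the sketch.

```python
# Collapse per-frame initial skeleton point diagrams to the diagram(s) that are sent.

def select_skeleton_diagrams(initial_diagrams):
    if not initial_diagrams:
        return []
    if all(d == initial_diagrams[0] for d in initial_diagrams):
        # Every frame produced the same diagram: a static gesture, keep one copy.
        return [initial_diagrams[0]]
    # Frames differ: a moving gesture, keep the first and last diagrams so the
    # display device can recover the start and end poses.
    return [initial_diagrams[0], initial_diagrams[-1]]

assert select_skeleton_diagrams([{"p": 1}, {"p": 1}]) == [{"p": 1}]
assert select_skeleton_diagrams([{"p": 1}, {"p": 2}]) == [{"p": 1}, {"p": 2}]
```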
一个示例中,骨骼点图对应的目标控制指令,是基于预设的骨骼点图与控制指令之间的对应关系,通过显示设备中已开启的骨骼点图识别模式对骨骼点图进行识别处理得到的。In one example, the target control instruction corresponding to the skeleton point diagram is obtained by identifying and processing the skeleton point diagram through a skeleton point diagram recognition mode enabled in a display device based on a preset correspondence between the skeleton point diagram and the control instruction.
一个示例中,发送单元63,具体用于:In one example, the sending unit 63 is specifically configured to:
若存在多个骨骼点图,则根据骨骼点图的生成时间,依次将骨骼点图发送至显示设备。If there are multiple skeleton point graphs, the skeleton point graphs are sent to the display device in sequence according to the generation time of the skeleton point graphs.
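This sending rule can be sketched as follows, assuming each skeleton point diagram carries a generation timestamp in a field named generated_at; the field name and the injected transport are assumptions of the sketch.

```python
# Transmit multiple skeleton point diagrams in the order they were generated.

def send_in_generation_order(skeleton_diagrams, send):
    for diagram in sorted(skeleton_diagrams, key=lambda d: d["generated_at"]):
        send(diagram)

send_in_generation_order(
    [{"generated_at": 2.0, "label": "end"}, {"generated_at": 1.0, "label": "start"}],
    send=print,
)  # prints the "start" diagram before the "end" diagram
```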
一个示例中,指示动作包括手势动作和/或肢体动作。In one example, the indicating action includes a gesture action and/or a body movement.
一个示例中,电子设备与显示设备之间的连接方式包括有线连接和无线连接,其中,有线连接是电子设备的充电接口和显示设备的通用串行总线接口之间的线路连接;无线连接包括蓝牙通讯技术、局域网协议、近场通信、或广域网服务器。 In one example, the connection method between the electronic device and the display device includes wired connection and wireless connection, wherein the wired connection is a line connection between the charging interface of the electronic device and the universal serial bus interface of the display device; the wireless connection includes Bluetooth communication technology, local area network protocol, near field communication, or wide area network server.
一个示例中,该装置还包括:In one example, the device further includes:
断开单元71,用于响应于针对手势图像转换模式的选择操作,断开电子设备与显示设备之间的通信连接。The disconnection unit 71 is used to disconnect the communication connection between the electronic device and the display device in response to the selection operation for the gesture image conversion mode.
本实施例的装置,可以执行上述方法中的技术方案,其具体实现过程和技术原理相同,此处不再赘述。The device of this embodiment can execute the technical solution in the above method. Its specific implementation process and technical principles are the same and will not be repeated here.
图12为本申请实施例提供的又一种基于图像识别的交互控制装置的结构示意图,应用于显示设备,显示设备与电子设备通信连接;如图12所示,该装置包括:FIG12 is a schematic diagram of the structure of another interactive control device based on image recognition provided in an embodiment of the present application, which is applied to a display device, and the display device is communicatively connected with an electronic device; as shown in FIG12 , the device includes:
接收单元81,用于接收电子设备发送的骨骼点图;其中,骨骼点图是对指示图像进行识别处理得到的,指示图像是在电子设备处于手势图像转换模式下,通过电子设备的摄像头对用户的指示动作进行拍摄得到的。The receiving unit 81 is used to receive a skeleton point diagram sent by an electronic device; wherein the skeleton point diagram is obtained by identifying and processing an indication image, and the indication image is obtained by photographing the user's indication action through the camera of the electronic device when the electronic device is in a gesture image conversion mode.
确定单元82,用于基于预设的骨骼点图与控制指令之间的对应关系,对骨骼点图进行识别处理,确定骨骼点图对应的目标控制指令。The determination unit 82 is used to perform recognition processing on the skeleton point diagram based on the preset correspondence relationship between the skeleton point diagram and the control instruction, and determine the target control instruction corresponding to the skeleton point diagram.
执行单元83,用于执行目标控制指令。The execution unit 83 is used to execute the target control instruction.
本实施例的装置,可以执行上述方法中的技术方案,其具体实现过程和技术原理相同,此处不再赘述。The device of this embodiment can execute the technical solution in the above method. Its specific implementation process and technical principles are the same and will not be repeated here.
在一个示例中,在图12所示实施例的基础上,确定单元82,具体用于:In one example, based on the embodiment shown in FIG. 12 , the determining unit 82 is specifically configured to:
基于预设的骨骼点图与控制指令之间的对应关系,通过显示设备中已开启的骨骼点图识别模式对骨骼点图进行识别处理,确定骨骼点图对应的目标控制指令。Based on the correspondence between the preset skeleton point diagram and the control instruction, the skeleton point diagram is identified and processed through the skeleton point diagram recognition mode turned on in the display device to determine the target control instruction corresponding to the skeleton point diagram.
一个示例中,与指示图像对应的骨骼点图是根据第一帧指示图像的初始的骨骼点图与最后一帧指示图像的初始的骨骼点图确定的,或者,与指示图像对应的骨骼点图是根据初始的骨骼点图确定的,其中,初始的骨骼点图是对每一帧指示图像进行识别处理得到的。In one example, the skeleton point map corresponding to the indication image is determined based on the initial skeleton point map of the first frame indication image and the initial skeleton point map of the last frame indication image, or the skeleton point map corresponding to the indication image is determined based on the initial skeleton point map, wherein the initial skeleton point map is obtained by performing recognition processing on each frame indication image.
一个示例中,骨骼点图的发送顺序是根据骨骼点图的生成时间确定的。In one example, the order in which the skeleton point graphs are sent is determined according to the generation time of the skeleton point graphs.
一个示例中,指示动作包括手势动作和/或肢体动作。In one example, the indicating action includes a gesture action and/or a body movement.
一个示例中，电子设备与显示设备之间的连接方式包括有线连接和无线连接，其中，有线连接是电子设备的充电接口和显示设备的通用串行总线接口确定的；无线连接包括蓝牙无线技术、局域网协议、近场通信、或广域网服务器。In one example, the connection between the electronic device and the display device includes a wired connection and a wireless connection, wherein the wired connection is established between the charging interface of the electronic device and the universal serial bus interface of the display device; the wireless connection includes Bluetooth wireless technology, a local area network protocol, near field communication, or a wide area network server.
一个示例中,电子设备与显示设备之间的通信连接,是根据针对手势图像转换模式的选择操作断开的。In one example, the communication connection between the electronic device and the display device is disconnected according to a selection operation for the gesture image conversion mode.
本实施例的装置,可以执行上述方法中的技术方案,其具体实现过程和技术原理相同,此处不再赘述。The device of this embodiment can execute the technical solution in the above method. Its specific implementation process and technical principles are the same and will not be repeated here.
图13为本申请实施例提供的一种电子设备的结构示意图,电子设备可以为手机、可外挂的摄像头模组等,如图13所示,电子设备包括:存储器91,处理器92。FIG13 is a schematic diagram of the structure of an electronic device provided in an embodiment of the present application. The electronic device may be a mobile phone, an external camera module, etc. As shown in FIG13 , the electronic device includes: a memory 91 and a processor 92 .
存储器91中存储有可在处理器92上运行的计算机程序。The memory 91 stores a computer program that can be executed on the processor 92 .
处理器92被配置为执行如上述实施例提供的方法。The processor 92 is configured to execute the method provided in the above embodiment.
电子设备还包括接收器93和发送器94。接收器93用于接收外部设备发送的指令和数据,发送器94用于向外部设备发送指令和数据。The electronic device further includes a receiver 93 and a transmitter 94. The receiver 93 is used to receive instructions and data sent by an external device, and the transmitter 94 is used to send instructions and data to the external device.
图14为本申请实施例提供的一种显示设备的结构示意图，如图14所示，显示设备包括：存储器101，处理器102。FIG. 14 is a schematic diagram of the structure of a display device provided in an embodiment of the present application. As shown in FIG. 14, the display device includes a memory 101 and a processor 102.
存储器101中存储有可在处理器102上运行的计算机程序。The memory 101 stores a computer program that can be executed on the processor 102 .
处理器102被配置为执行如上述实施例提供的方法。The processor 102 is configured to execute the method provided in the above embodiment.
显示设备还包括接收器103和发送器104。接收器103用于接收外部设备发送的指令和数据,发送器104用于向外部设备发送指令和数据。The display device further includes a receiver 103 and a transmitter 104. The receiver 103 is used to receive instructions and data sent by an external device, and the transmitter 104 is used to send instructions and data to the external device.
图15是本申请实施例提供的一种电子设备的框图,该电子设备可以是移动电话,计算机,数字广播终端,消息收发设备,游戏控制台,平板设备,医疗设备,健身设备,个人数字助理等。Figure 15 is a block diagram of an electronic device provided in an embodiment of the present application. The electronic device may be a mobile phone, a computer, a digital broadcast terminal, a message sending and receiving device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, etc.
装置1100可以包括以下一个或多个组件:处理组件1102,存储器1104,电源组件1106,多媒体组件1108,音频组件1110,输入/输出(I/O)接口1112,传感器组件1114,以及通信组件1116。Device 1100 may include one or more of the following components: a processing component 1102 , a memory 1104 , a power component 1106 , a multimedia component 1108 , an audio component 1110 , an input/output (I/O) interface 1112 , a sensor component 1114 , and a communication component 1116 .
处理组件1102通常控制装置1100的整体操作,诸如与显示,电话呼叫,数据通信,相机操作和记录操作相关联的操作。处理组件1102可以包括一个或多个处理器1120来执行指令,以完成上述的方法的全部或部分步骤。此外,处理组件1102可以包括一个或多个模块,便于处理组件1102和其他组件之间的交互。例如,处理组件1102可以包括多媒体模块,以方便多媒体组件1108和处理组件1102之间的交互。The processing component 1102 generally controls the overall operation of the device 1100, such as operations associated with display, phone calls, data communications, camera operations, and recording operations. The processing component 1102 may include one or more processors 1120 to execute instructions to perform all or part of the steps of the above-described method. In addition, the processing component 1102 may include one or more modules to facilitate the interaction between the processing component 1102 and other components. For example, the processing component 1102 may include a multimedia module to facilitate the interaction between the multimedia component 1108 and the processing component 1102.
存储器1104被配置为存储各种类型的数据以支持在装置1100的操作。这些数据的示例包括用于在装置1100上操作的任何应用程序或方法的指令,联系人数据,电话簿数据,消息,图片,视频等。存储器1104可以由任何类型的易失性或非易失性存储设备或者它们的组合实现,如静态随机存取存储器(SRAM),电可擦除可编程只读存储器(EEPROM),可擦除可编程只读存储器(EPROM),可编程只读存储器(PROM),只读存储器(ROM),磁存储器,快闪存储器,磁盘或光盘。The memory 1104 is configured to store various types of data to support operations on the device 1100. Examples of such data include instructions for any application or method operating on the device 1100, contact data, phone book data, messages, pictures, videos, etc. The memory 1104 can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk or optical disk.
电源组件1106为装置1100的各种组件提供电力。电源组件1106可以包括电源管理***,一个或多个电源,及其他与为装置1100生成、管理和分配电力相关联的组件。The power supply component 1106 provides power to the various components of the device 1100. The power supply component 1106 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power to the device 1100.
多媒体组件1108包括在装置1100和用户之间的提供一个输出接口的屏幕。在一些实施例中,屏幕可以包括液晶显示器(LCD)和触摸面板(TP)。如果屏幕包括触摸面板,屏幕可以被实现为触摸屏,以接收来自用户的输入信号。触摸面板包括一个或多个触摸传感器以感测触摸、滑动和触摸面板上的手势。触摸传感器可以不仅感测触摸或滑动动作的边界,而且还检测与触摸或滑动操作相关的持续时间和压力。在一些实施例中,多媒体组件1108包括一个前置摄像头和/或后置摄像头。当装置1100处于操作模式,如拍摄模式或视频模式时,前置摄像头和/或后置摄像头可以接收外部的多媒体数据。每个前置摄像头和后置摄像头可以是一个固定的光学透镜***或具有焦距和光学变焦能力。The multimedia component 1108 includes a screen that provides an output interface between the device 1100 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundaries of the touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 1108 includes a front camera and/or a rear camera. When the device 1100 is in an operating mode, such as a shooting mode or a video mode, the front camera and/or the rear camera may receive external multimedia data. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.
音频组件1110被配置为输出和/或输入音频信号。例如,音频组件1110包括一个麦克风(MIC),当装置1100处于操作模式,如呼叫模式、记录模式和语音识别模式时,麦克风被配置为接收外部音频信号。所接收的音频信号可以被进一步存储在存储器1104或经由通信组件1116发送。在一些实施例中,音频组件1110还包括一个扬声器,用于输出音频信号。The audio component 1110 is configured to output and/or input audio signals. For example, the audio component 1110 includes a microphone (MIC), and when the device 1100 is in an operating mode, such as a call mode, a recording mode, and a speech recognition mode, the microphone is configured to receive an external audio signal. The received audio signal can be further stored in the memory 1104 or sent via the communication component 1116. In some embodiments, the audio component 1110 also includes a speaker for outputting audio signals.
I/O接口1112为处理组件1102和外围接口模块之间提供接口，上述外围接口模块可以是键盘，点击轮，按钮等。这些按钮可包括但不限于：主页按钮、音量按钮、启动按钮和锁定按钮。The I/O interface 1112 provides an interface between the processing component 1102 and peripheral interface modules, which may be a keyboard, a click wheel, buttons, and the like. These buttons may include, but are not limited to, a home button, a volume button, a start button, and a lock button.
传感器组件1114包括一个或多个传感器,用于为装置1100提供各个方面的状态评估。例如,传感器组件1114可以检测到装置1100的打开/关闭状态,组件的相对定位,例如组件为装置1100的显示器和小键盘,传感器组件1114还可以检测装置1100或装置1100一个组件的位置改变,用户与装置1100接触的存在或不存在,装置1100方位或加速/减速和装置1100的温度变化。传感器组件1114可以包括接近传感器,被配置用来在没有任何的物理接触时检测附近物体的存在。传感器组件1114还可以包括光传感器,如CMOS或CCD图像传感器,用于在成像应用中使用。在一些实施例中,该传感器组件1114还可以包括加速度传感器,陀螺仪传感器,磁传感器,压力传感器或温度传感器。The sensor assembly 1114 includes one or more sensors for providing various aspects of the status assessment of the device 1100. For example, the sensor assembly 1114 can detect the open/closed state of the device 1100, the relative positioning of components, such as the display and keypad of the device 1100, the sensor assembly 1114 can also detect the position change of the device 1100 or a component of the device 1100, the presence or absence of user contact with the device 1100, the orientation or acceleration/deceleration of the device 1100, and the temperature change of the device 1100. The sensor assembly 1114 can include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor assembly 1114 can also include an optical sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 1114 can also include an accelerometer, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
通信组件1116被配置为便于装置1100和其他设备之间有线或无线方式的通信。装置1100可以接入基于通信标准的无线网络,如WiFi,2G或3G,或它们的组合。在一个示例性实施例中,通信组件1116经由广播信道接收来自外部广播管理***的广播信号或广播相关信息。在一个示例性实施例中,通信组件1116还包括近场通信(NFC)模块,以促进短程通信。例如,在NFC模块可基于射频识别(RFID)技术,红外数据协会(IrDA)技术,超宽带(UWB)技术,蓝牙(BT)技术和其他技术来实现。The communication component 1116 is configured to facilitate wired or wireless communication between the device 1100 and other devices. The device 1100 can access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 1116 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 1116 also includes a near field communication (NFC) module to facilitate short-range communication. For example, the NFC module can be implemented based on radio frequency identification (RFID) technology, infrared data association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology and other technologies.
在示例性实施例中,装置1100可以被一个或多个应用专用集成电路(ASIC)、数字信号处理器(DSP)、数字信号处理设备(DSPD)、可编程逻辑器件(PLD)、现场可编程门阵列(FPGA)、控制器、微控制器、微处理器或其他电子元件实现,用于执行上述方法。In an exemplary embodiment, the device 1100 can be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors or other electronic components to perform the above-mentioned methods.
在示例性实施例中,还提供了一种包括指令的非临时性计算机可读存储介质,例如包括指令的存储器1104,上述指令可由装置1100的处理器1120执行以完成上述方法。例如,非临时性计算机可读存储介质可以是ROM、随机存取存储器(RAM)、CD-ROM、磁带、软盘和光数据存储设备等。In an exemplary embodiment, a non-transitory computer-readable storage medium including instructions is also provided, such as a memory 1104 including instructions, and the instructions can be executed by the processor 1120 of the device 1100 to perform the above method. For example, the non-transitory computer-readable storage medium can be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, etc.
本申请实施例还提供了一种非临时性计算机可读存储介质,当该存储介质中的指令由电子设备的处理器执行时,使得电子设备能够执行上述实施例提供的方法。An embodiment of the present application also provides a non-temporary computer-readable storage medium. When the instructions in the storage medium are executed by a processor of an electronic device, the electronic device can execute the method provided in the above embodiment.
本申请实施例还提供了一种计算机程序产品,计算机程序产品包括:计算机程序,计算机程序存储在可读存储介质中,电子设备的至少一个处理器可以从可读存储介质读取计算机程序,至少一个处理器执行计算机程序使得电子设备执行上述任一实施例提供的方案。An embodiment of the present application also provides a computer program product, which includes: a computer program, which is stored in a readable storage medium, and at least one processor of an electronic device can read the computer program from the readable storage medium, and at least one processor executes the computer program so that the electronic device executes the solution provided by any of the above embodiments.
本领域技术人员在考虑说明书及实践这里公开的发明后,将容易想到本公开的其它实施方案。本申请旨在涵盖本公开的任何变型、用途或者适应性变化,这些变型、用途或者适应性变化遵循本公开的一般性原理并包括本公开未公开的本技术领域中的公知常识或惯用技术手段。说明书和实施例仅被视为示例性的,本公开的真正范围和精神由下面的权利要求书指出。Those skilled in the art will readily appreciate other embodiments of the present disclosure after considering the specification and practicing the invention disclosed herein. This application is intended to cover any variations, uses or adaptations of the present disclosure that follow the general principles of the present disclosure and include common knowledge or customary techniques in the art that are not disclosed in the present disclosure. The specification and examples are intended to be exemplary only, and the true scope and spirit of the present disclosure are indicated by the following claims.
应当理解的是,本公开并不局限于上面已经描述并在附图中示出的精确结构,并且可以在不脱离其范围进行各种修改和改变。本公开的范围仅由所附的权利要求书来限制。 It should be understood that the present disclosure is not limited to the exact structures that have been described above and shown in the drawings, and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (26)

  1. 一种基于图像识别的交互控制方法,其特征在于,应用于电子设备,所述电子设备与显示设备通信连接;所述方法包括:An interactive control method based on image recognition, characterized in that it is applied to an electronic device, wherein the electronic device is communicatively connected with a display device; the method comprises:
    若确定所述电子设备处于手势图像转换模式下,则通过所述电子设备的摄像头对用户的指示动作进行拍摄,得到多帧指示图像;If it is determined that the electronic device is in the gesture image conversion mode, photographing the user's indicating action through a camera of the electronic device to obtain multiple frames of indicating images;
    对多帧所述指示图像进行识别处理,得到与所述指示图像对应的骨骼点图;其中,所述骨骼点图包括与所述指示图像中的指示动作对应的指示骨骼信息;Performing recognition processing on multiple frames of the indication image to obtain a skeleton point map corresponding to the indication image; wherein the skeleton point map includes indication skeleton information corresponding to the indication action in the indication image;
    将所述骨骼点图发送至所述显示设备,以使所述显示设备基于预设的骨骼点图与控制指令之间的对应关系,对所述骨骼点图进行识别处理,确定所述骨骼点图对应的目标控制指令,并执行所述目标控制指令。The skeleton point diagram is sent to the display device, so that the display device recognizes and processes the skeleton point diagram based on the preset correspondence between the skeleton point diagram and the control instruction, determines the target control instruction corresponding to the skeleton point diagram, and executes the target control instruction.
  2. 根据权利要求1所述的方法,其特征在于,对多帧所述指示图像进行识别处理,得到与所述指示图像对应的骨骼点图,包括:The method according to claim 1 is characterized in that the recognition processing is performed on multiple frames of the indication image to obtain a skeleton point map corresponding to the indication image, comprising:
    对多帧指示图像进行识别处理,得到与每一帧指示图像对应的初始的骨骼点图;Performing recognition processing on multiple frames of indication images to obtain an initial skeleton point map corresponding to each frame of indication image;
    若确定每一帧指示图像对应的初始的骨骼点图均相同,则确定初始的所述骨骼点图为与所述指示图像对应的骨骼点图;If it is determined that the initial skeleton point graphs corresponding to each frame of the indication image are the same, then the initial skeleton point graph is determined to be the skeleton point graph corresponding to the indication image;
    若确定每一帧指示图像对应的初始的骨骼点图均不相同,则确定第一帧指示图像的初始的骨骼点图与最后一帧指示图像的初始的骨骼点图均为与所述指示图像对应的骨骼点图。If it is determined that the initial skeleton point graphs corresponding to each frame of the indication image are different, it is determined that the initial skeleton point graph of the first frame of the indication image and the initial skeleton point graph of the last frame of the indication image are both skeleton point graphs corresponding to the indication image.
  3. 根据权利要求1所述的方法,其特征在于,所述骨骼点图对应的目标控制指令,是基于预设的骨骼点图与控制指令之间的对应关系,通过所述显示设备中已开启的骨骼点图识别模式对所述骨骼点图进行识别处理得到的。The method according to claim 1 is characterized in that the target control instruction corresponding to the skeleton point diagram is obtained by identifying and processing the skeleton point diagram through the skeleton point diagram recognition mode turned on in the display device based on the correspondence between the preset skeleton point diagram and the control instruction.
  4. 根据权利要求1所述的方法,其特征在于,将所述骨骼点图发送至所述显示设备,包括:The method according to claim 1, characterized in that sending the skeleton point map to the display device comprises:
    若存在多个所述骨骼点图,则根据所述骨骼点图的生成时间,依次将所述骨骼点图发送至所述显示设备。If there are a plurality of the skeleton point graphs, the skeleton point graphs are sent to the display device in sequence according to the generation time of the skeleton point graphs.
  5. 根据权利要求1所述的方法,其特征在于,所述指示动作包括手势动作和/或肢体动作。The method according to claim 1 is characterized in that the indicating action includes a gesture action and/or a body movement.
  6. 根据权利要求1所述的方法,其特征在于,所述电子设备与所述显示设备之间的连接方式包括有线连接和无线连接,其中,所述有线连接是电子设备的充电接口和显示设备的通用串行总线接口之间的线路连接;所述无线连接包括蓝牙通讯技术、局域网协议、近场通信、或广域网服务器。The method according to claim 1 is characterized in that the connection mode between the electronic device and the display device includes a wired connection and a wireless connection, wherein the wired connection is a line connection between a charging interface of the electronic device and a universal serial bus interface of the display device; the wireless connection includes Bluetooth communication technology, a local area network protocol, a near field communication, or a wide area network server.
  7. 根据权利要求1-6任一项所述的方法,其特征在于,所述方法还包括:The method according to any one of claims 1 to 6, characterized in that the method further comprises:
    响应于针对所述手势图像转换模式的选择操作,断开所述电子设备与所述显示设备之间的通信连接。In response to a selection operation for the gesture image conversion mode, the communication connection between the electronic device and the display device is disconnected.
  8. 一种基于图像识别的交互控制方法,其特征在于,应用于显示设备,所述显示设备与电子设备通信连接;所述方法包括:An interactive control method based on image recognition, characterized in that it is applied to a display device, wherein the display device is communicatively connected with an electronic device; the method comprises:
    接收所述电子设备发送的骨骼点图；其中，所述骨骼点图是对指示图像进行识别处理得到的，所述指示图像是在所述电子设备处于手势图像转换模式下，通过所述电子设备的摄像头对用户的指示动作进行拍摄得到的；Receive a skeleton point diagram sent by the electronic device; wherein the skeleton point diagram is obtained by performing recognition processing on an indication image, and the indication image is obtained by photographing the user's indication action through the camera of the electronic device when the electronic device is in the gesture image conversion mode;
    基于预设的骨骼点图与控制指令之间的对应关系,对所述骨骼点图进行识别处理,确定所述骨骼点图对应的目标控制指令,并执行所述目标控制指令。Based on the correspondence between the preset skeleton point diagram and the control instruction, the skeleton point diagram is identified and processed, the target control instruction corresponding to the skeleton point diagram is determined, and the target control instruction is executed.
  9. 根据权利要求8所述的方法,其特征在于,基于预设的骨骼点图与控制指令之间的对应关系,对所述骨骼点图进行识别处理,确定所述骨骼点图对应的目标控制指令,包括:The method according to claim 8 is characterized in that, based on the correspondence between a preset skeleton point diagram and a control instruction, the skeleton point diagram is identified and processed to determine the target control instruction corresponding to the skeleton point diagram, comprising:
    基于预设的骨骼点图与控制指令之间的对应关系,通过所述显示设备中已开启的骨骼点图识别模式对所述骨骼点图进行识别处理,确定所述骨骼点图对应的目标控制指令。Based on the correspondence between the preset skeleton point diagram and the control instruction, the skeleton point diagram is identified and processed by the skeleton point diagram identification mode turned on in the display device to determine the target control instruction corresponding to the skeleton point diagram.
  10. 根据权利要求8所述的方法,其特征在于,与所述指示图像对应的骨骼点图是根据第一帧指示图像的初始的骨骼点图与最后一帧指示图像的初始的骨骼点图确定的,或者,与所述指示图像对应的骨骼点图是根据初始的所述骨骼点图确定的,其中,初始的所述骨骼点图是对每一帧所述指示图像进行识别处理得到的。The method according to claim 8 is characterized in that the skeleton point map corresponding to the indication image is determined based on the initial skeleton point map of the first frame indication image and the initial skeleton point map of the last frame indication image, or the skeleton point map corresponding to the indication image is determined based on the initial skeleton point map, wherein the initial skeleton point map is obtained by performing recognition processing on each frame of the indication image.
  11. 根据权利要求8所述的方法,其特征在于,所述骨骼点图的发送顺序是根据所述骨骼点图的生成时间确定的。The method according to claim 8 is characterized in that the sending order of the skeleton point graph is determined according to the generation time of the skeleton point graph.
  12. 根据权利要求8所述的方法,其特征在于,所述指示动作包括手势动作和/或肢体动作。The method according to claim 8 is characterized in that the indicating action includes a gesture action and/or a body movement.
  13. 根据权利要求8所述的方法,其特征在于,所述电子设备与所述显示设备之间的连接方式包括有线连接和无线连接,其中,所述有线连接是电子设备的充电接口和显示设备的通用串行总线接口确定的;所述无线连接包括蓝牙无线技术、局域网协议、近场通信、或广域网服务器。The method according to claim 8 is characterized in that the connection mode between the electronic device and the display device includes a wired connection and a wireless connection, wherein the wired connection is determined by a charging interface of the electronic device and a universal serial bus interface of the display device; and the wireless connection includes Bluetooth wireless technology, a local area network protocol, near field communication, or a wide area network server.
  14. 根据权利要求8-13任一项所述的方法,其特征在于,所述电子设备与所述显示设备之间的通信连接,是根据针对所述手势图像转换模式的选择操作断开的。The method according to any one of claims 8 to 13 is characterized in that the communication connection between the electronic device and the display device is disconnected according to a selection operation for the gesture image conversion mode.
  15. 一种基于图像识别的交互控制装置,其特征在于,应用于电子设备,所述电子设备与显示设备通信连接;所述装置包括:An interactive control device based on image recognition, characterized in that it is applied to an electronic device, wherein the electronic device is communicatively connected with a display device; the device comprises:
    拍摄单元,用于若确定所述电子设备处于手势图像转换模式下,则通过所述电子设备的摄像头对用户的指示动作进行拍摄,得到多帧指示图像;a photographing unit, configured to photograph the user's indicating action through a camera of the electronic device to obtain multiple frames of indicating images if it is determined that the electronic device is in a gesture image conversion mode;
    识别单元,用于对多帧所述指示图像进行识别处理,得到与所述指示图像对应的骨骼点图;其中,所述骨骼点图包括与所述指示图像中的指示动作对应的指示骨骼信息;A recognition unit, used for performing recognition processing on multiple frames of the indication image to obtain a skeleton point map corresponding to the indication image; wherein the skeleton point map includes indication skeleton information corresponding to the indication action in the indication image;
    发送单元,用于将所述骨骼点图发送至所述显示设备,以使所述显示设备基于预设的骨骼点图与控制指令之间的对应关系,对所述骨骼点图进行识别处理,确定所述骨骼点图对应的目标控制指令,并执行所述目标控制指令。A sending unit is used to send the skeleton point diagram to the display device, so that the display device can identify the skeleton point diagram based on the corresponding relationship between the preset skeleton point diagram and the control instruction, determine the target control instruction corresponding to the skeleton point diagram, and execute the target control instruction.
  16. 一种基于图像识别的交互控制装置,其特征在于,应用于显示设备,所述显示设备与电子设备通信连接;所述装置包括:An interactive control device based on image recognition, characterized in that it is applied to a display device, and the display device is communicatively connected with an electronic device; the device comprises:
    接收单元,用于接收所述电子设备发送的骨骼点图;其中,所述骨骼点图是对指示图像进行识别处理得到的,所述指示图像是在所述电子设备处于手势图像转换模式下,通过所述电子设备的摄像头对用户的指示动作进行拍摄得到的; A receiving unit, configured to receive a skeleton point diagram sent by the electronic device; wherein the skeleton point diagram is obtained by performing recognition processing on an indication image, and the indication image is obtained by photographing an indication action of a user through a camera of the electronic device when the electronic device is in a gesture image conversion mode;
    确定单元,用于基于预设的骨骼点图与控制指令之间的对应关系,对所述骨骼点图进行识别处理,确定所述骨骼点图对应的目标控制指令;A determination unit, configured to perform recognition processing on the skeleton point diagram based on a preset correspondence relationship between the skeleton point diagram and the control instruction, and determine a target control instruction corresponding to the skeleton point diagram;
    执行单元,用于执行所述目标控制指令。An execution unit is used to execute the target control instruction.
  17. 根据权利要求16所述的装置,其特征在于,所述确定单元,具体用于:The device according to claim 16, characterized in that the determining unit is specifically configured to:
    基于预设的骨骼点图与控制指令之间的对应关系,通过所述显示设备中已开启的骨骼点图识别模式对所述骨骼点图进行识别处理,确定所述骨骼点图对应的目标控制指令。Based on the correspondence between the preset skeleton point diagram and the control instruction, the skeleton point diagram is identified and processed by the skeleton point diagram identification mode turned on in the display device to determine the target control instruction corresponding to the skeleton point diagram.
  18. 根据权利要求16所述的装置,其特征在于,与所述指示图像对应的骨骼点图是根据第一帧指示图像的初始的骨骼点图与最后一帧指示图像的初始的骨骼点图确定的,或者,与所述指示图像对应的骨骼点图是根据初始的所述骨骼点图确定的,其中,初始的所述骨骼点图是对每一帧所述指示图像进行识别处理得到的。The device according to claim 16 is characterized in that the skeleton point map corresponding to the indication image is determined based on the initial skeleton point map of the first frame indication image and the initial skeleton point map of the last frame indication image, or the skeleton point map corresponding to the indication image is determined based on the initial skeleton point map, wherein the initial skeleton point map is obtained by performing recognition processing on each frame of the indication image.
  19. 根据权利要求16所述的装置,其特征在于,所述骨骼点图的发送顺序是根据所述骨骼点图的生成时间确定的。The device according to claim 16 is characterized in that the sending order of the skeleton point map is determined according to the generation time of the skeleton point map.
  20. 根据权利要求16所述的装置,其特征在于,所述指示动作包括手势动作和/或肢体动作。The device according to claim 16 is characterized in that the indicating action includes a gesture action and/or a body movement.
  21. 根据权利要求16所述的装置,其特征在于,所述电子设备与所述显示设备之间的连接方式包括有线连接和无线连接,其中,所述有线连接是电子设备的充电接口和显示设备的通用串行总线接口确定的;所述无线连接包括蓝牙无线技术、局域网协议、近场通信、或广域网服务器。The device according to claim 16 is characterized in that the connection mode between the electronic device and the display device includes a wired connection and a wireless connection, wherein the wired connection is determined by a charging interface of the electronic device and a universal serial bus interface of the display device; and the wireless connection includes Bluetooth wireless technology, a local area network protocol, near field communication, or a wide area network server.
  22. 根据权利要求16-21任一项所述的装置,其特征在于,所述电子设备与所述显示设备之间的通信连接,是根据针对所述手势图像转换模式的选择操作断开的。The device according to any one of claims 16-21 is characterized in that the communication connection between the electronic device and the display device is disconnected according to a selection operation for the gesture image conversion mode.
  23. 一种电子设备,其特征在于,包括存储器、处理器,所述存储器中存储有可在所述处理器上运行的计算机程序,所述处理器执行所述计算机程序时实现上述权利要求1-7中任一项所述的方法。An electronic device, characterized in that it includes a memory and a processor, wherein the memory stores a computer program that can be run on the processor, and when the processor executes the computer program, it implements the method described in any one of claims 1 to 7.
  24. 一种显示设备,其特征在于,包括存储器、处理器,所述存储器中存储有可在所述处理器上运行的计算机程序,所述处理器执行所述计算机程序时实现上述权利要求8-14中任一项所述的方法。A display device, characterized in that it includes a memory and a processor, wherein the memory stores a computer program that can be run on the processor, and when the processor executes the computer program, it implements the method described in any one of claims 8 to 14.
  25. 一种计算机可读存储介质,其特征在于,所述计算机可读存储介质中存储有计算机执行指令,所述计算机执行指令被处理器执行时用于实现如权利要求1-7任一项所述的方法,或者,实现权利要求8-14中任一项所述的方法。A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer-executable instructions, and when the computer-executable instructions are executed by a processor, they are used to implement the method according to any one of claims 1 to 7, or to implement the method according to any one of claims 8 to 14.
  26. 一种计算机程序产品,其特征在于,包括计算机程序,该计算机程序被处理器执行时实现权利要求1-7中任一项所述的方法,或者,实现权利要求8-14中任一项所述的方法。 A computer program product, characterized in that it comprises a computer program, which, when executed by a processor, implements the method according to any one of claims 1 to 7, or implements the method according to any one of claims 8 to 14.
PCT/CN2023/121042 2022-09-27 2023-09-25 Interaction control method and apparatus based on image recognition, and device WO2024067468A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211179850.3 2022-09-27
CN202211179850.3A CN115576417A (en) 2022-09-27 2022-09-27 Interaction control method, device and equipment based on image recognition

Publications (1)

Publication Number Publication Date
WO2024067468A1 true WO2024067468A1 (en) 2024-04-04

Family

ID=84582516

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/121042 WO2024067468A1 (en) 2022-09-27 2023-09-25 Interaction control method and apparatus based on image recognition, and device

Country Status (2)

Country Link
CN (1) CN115576417A (en)
WO (1) WO2024067468A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115576417A (en) * 2022-09-27 2023-01-06 广州视琨电子科技有限公司 Interaction control method, device and equipment based on image recognition

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2013037454A (en) * 2011-08-05 2013-02-21 Ikutoku Gakuen Posture determination method, program, device, and system
CN109116987A (en) * 2018-08-13 2019-01-01 连云港易圣游网络科技有限公司 A kind of holographic display system based on Kinect gesture control
CN110956124A (en) * 2019-11-27 2020-04-03 云南电网有限责任公司电力科学研究院 Display device control method based on gestures and display device
CN110947181A (en) * 2018-09-26 2020-04-03 Oppo广东移动通信有限公司 Game picture display method, game picture display device, storage medium and electronic equipment
CN111062312A (en) * 2019-12-13 2020-04-24 RealMe重庆移动通信有限公司 Gesture recognition method, gesture control method, device, medium and terminal device
US20220137713A1 (en) * 2019-03-01 2022-05-05 Huawei Technologies Co., Ltd. Gesture Processing Method and Device
CN114724241A (en) * 2022-03-29 2022-07-08 平安科技(深圳)有限公司 Motion recognition method, device, equipment and storage medium based on skeleton point distance
CN115097936A (en) * 2022-06-16 2022-09-23 慧之安信息技术股份有限公司 Display screen control method based on gesture action deep learning
CN115576417A (en) * 2022-09-27 2023-01-06 广州视琨电子科技有限公司 Interaction control method, device and equipment based on image recognition

Also Published As

Publication number Publication date
CN115576417A (en) 2023-01-06

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23870699

Country of ref document: EP

Kind code of ref document: A1