CN111366916B

CN111366916B - Method and device for determining distance between interaction target and robot and electronic equipment

Info

Publication number: CN111366916B
Application number: CN202010096652.5A
Authority: CN
Inventors: 刘非非
Original assignee: Beijing Ruisi Aotu Intelligent Technology Co ltd; Shandong Ruisi Aotu Intelligent Technology Co ltd
Current assignee: Beijing Ruisi Aotu Intelligent Technology Co ltd; Shandong Ruisi Aotu Intelligent Technology Co ltd
Priority date: 2020-02-17
Filing date: 2020-02-17
Publication date: 2021-04-06
Anticipated expiration: 2040-02-17
Also published as: CN111366916A

Abstract

The embodiment of the specification relates to a method, a device and electronic equipment for determining the distance between an interaction target and a robot, wherein the method comprises the following steps: the method comprises the steps of firstly determining a rectangular frame area where an interactive target is located by adopting a human body detection algorithm, then extracting depth data of the rectangular frame area from a depth map, then reducing analysis complexity respectively in a line scanning or column scanning mode, filtering the depth data obtained by scanning statistics, further accurately determining the distance between the interactive target and a robot according to the occurrence frequency of the depth data obtained after filtering, and meanwhile, the method can be suitable for multiple poses and scenes while collecting and determining the distance in verification.

Description

Method and device for determining distance between interaction target and robot and electronic equipment

Technical Field

The embodiment of the specification relates to the technical field of artificial intelligence, in particular to a method and a device for determining a distance between an interaction target and a robot and electronic equipment.

Background

With the rise of artificial intelligence wave, robots are rapidly developed, more and more robots begin to enter the visual field of people, and especially, the robots which are service type robots of main artificial intelligence products occupy an important position.

In the field of artificial intelligence interaction, the distance between an interaction target for performing message interaction with an entity robot and the robot influences the interaction response of the entity robot. There are two main types of methods for obtaining the distance between an interaction target and a robot: 1) rough estimation, namely calculating the gravity center of the human body by a mathematical method according to the structure of the human body, and taking the average value of depth values in a gravity center area as the distance of the interactive person; 2) and performing fine calculation, namely accurately acquiring contour information and limb information of the human body by using a background removal or human body modeling method, and then taking the average value of the depth values of the upper trunk area as the distance of the interactive person.

However, the above method 1), normally, can obtain the distance, but is difficult to be applied to some special postures or special scenes due to the non-rigid nature of the human body, such as sitting, crossing the waist, waving the hands, etc.; the method 2) can give a distance more accurately, but the complexity is higher, and the method is not suitable for scenes with high real-time requirements.

Therefore, it is desirable to find a new method for determining the distance between the interaction target and the robot.

Disclosure of Invention

The embodiment of the specification provides a method and a device for determining a distance between an interactive target and a robot, and electronic equipment, so that the distance determination accuracy is improved while the distance between the interactive target and the robot is determined in verification.

In order to solve the above technical problem, the embodiments of the present specification adopt the following technical solutions:

in a first aspect, a method of determining a distance between an interaction target and a robot is provided, the method comprising:

acquiring an image of an interactive target in a shooting scene and a depth map aligned with the image;

extracting depth data of a rectangular frame region where the interaction target is located based on the image and the depth map;

scanning the rectangular frame region in rows or columns based on a preset first tolerance, and counting the depth data of the highest frequency in each row or each column;

filtering the depth data obtained by statistics;

scanning the depth data obtained by statistics based on a preset second tolerance, and counting the depth data with the highest frequency and the next highest frequency;

judging whether the size ratio of the rectangular frame region meets a preset human body size ratio or not;

if yes, determining the depth data with the highest frequency in the finally counted depth data as the distance between the interaction target and the robot;

otherwise, determining the depth data with the second highest frequency in the finally counted depth data as the distance between the interactive target and the robot.

In a second aspect, an apparatus for determining a distance between an interaction target and a robot is provided, the apparatus comprising:

the acquisition module is used for acquiring an image of the interactive target in a shooting scene and a depth map aligned with the image;

the extraction module is used for extracting the depth data of the rectangular frame region where the interaction target is located based on the image and the depth map;

the statistical module is used for scanning the rectangular frame region in rows or columns based on preset first tolerance and counting the depth data of the highest frequency in each row or each column;

the filtering module is used for filtering the depth data obtained by statistics;

the statistical module is further used for scanning the filtered depth data based on a preset second tolerance and counting the depth data with the highest frequency and the next highest frequency;

the judging module is used for judging whether the size ratio of the rectangular frame area meets a preset human body size ratio;

the determining module is used for determining the depth data with the highest frequency in the finally counted depth data as the distance between the interactive target and the robot when the judging result of the judging module is yes; and the number of the first and second groups,

and the depth data with the second highest frequency in the finally counted depth data is determined as the distance between the interactive target and the robot when the judgment result of the judgment module is negative.

In a third aspect, an electronic device is provided, including: a memory, a processor and a computer program stored on the memory and executable on the processor, the computer program being executed by the processor for performing the method of the first aspect.

In a fourth aspect, there is provided a computer readable storage medium storing one or more programs which, when executed by an electronic device comprising a plurality of application programs, cause the electronic device to perform the method of the first aspect.

The embodiment of the specification adopts at least one technical scheme which can achieve the following beneficial effects:

according to the technical scheme, the rectangular frame area where the interactive target is located is determined by adopting a human body detection algorithm, the depth data of the rectangular frame area is extracted from the depth map, the analysis complexity is reduced according to a line scanning mode or a column scanning mode, the depth data obtained by scanning statistics are filtered, the distance between the interactive target and the robot is accurately determined according to the occurrence frequency of the depth data obtained after filtering, and the method can be suitable for multiple poses and scenes while the distance is collected and determined in verification.

Drawings

In order to more clearly illustrate the embodiments of the present specification or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, it is obvious that the drawings in the following description are only some embodiments described in the embodiments of the present specification, and for those skilled in the art, other drawings can be obtained according to the drawings without any creative efforts.

Fig. 1 is a schematic diagram illustrating steps of a method for determining a distance between an interaction target and a robot according to an embodiment of the present disclosure;

fig. 2a is a schematic diagram of a scheme step implemented by first performing line scanning on a rectangular frame region according to an embodiment of the present disclosure;

FIG. 2b is a schematic diagram of the steps of the solution implemented by first performing a line-scan on a rectangular frame area according to an embodiment of the present disclosure;

fig. 3 is a schematic structural diagram of an apparatus 300 for determining a distance between an interaction target and a robot according to an embodiment of the present disclosure;

fig. 4 is a schematic structural diagram of an electronic device provided in an embodiment of this specification.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present disclosure more clear, the technical solutions of the embodiments of the present disclosure will be clearly and completely described below with reference to the specific embodiments of the present disclosure and the accompanying drawings. It is to be understood that the embodiments described are only a few embodiments of the present disclosure, and not all embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present specification without any creative effort belong to the protection scope of the embodiments in the present specification.

The technical solutions provided by the embodiments of the present description are described in detail below with reference to the accompanying drawings.

Example one

Referring to fig. 1, a schematic diagram of steps of a method for determining a distance between an interaction target and a robot provided in an embodiment of the present specification is shown, where the method includes the following steps:

step 102: and acquiring an image of the interactive target in a shooting scene and a depth map aligned with the image.

The interaction target may be a person performing message interaction with a robot (hereinafter referred to as a detection robot) or an interactive robot having a similar shape structure with the person, for example, a head, four limbs, a trunk, and the like.

It should be understood that a camera for acquiring images and a processor for processing the images are integrated on the inspection robot. In specific implementation, the camera can be used for acquiring the RGB image of the interactive target in the shooting scene in real time, and meanwhile, the depth map aligned with the RGB image is acquired. In a specific implementation, the existing binocular camera may be used to obtain the depth map aligned with the RGB image, and in fact, the depth map aligned with the RGB image of the interactive target may also be obtained in other ways of determining the depth map in this embodiment of the present specification.

Step 104: and extracting the depth data of the rectangular frame region where the interaction target is located based on the image and the depth map.

An implementation solution, in the step 104, when extracting the depth data of the rectangular frame region where the interaction target is located based on the image and the depth map, specifically includes the following steps:

firstly, determining a rectangular frame area corresponding to the interactive target from the image by using a human body detection algorithm.

In this embodiment of the present specification, the human body detection algorithm may be that a background modeling algorithm is used to extract a moving foreground object (i.e., an interactive object), then a classifier is used to classify the interactive object, and whether the interactive object is included is determined, which generally refers to a moving person or a moving robot. Among the commonly used background modeling algorithms are: the basic idea is to obtain a background model through frame learning, and then compare a current frame with a background frame to obtain a moving target, namely a rectangular frame region where an interactive target in an image is located.

And secondly, extracting the depth data of the rectangular frame area from the depth map aligned with the image.

After the rectangular frame region corresponding to the interaction target is determined, the depth data of the rectangular frame region where the interaction target is located can be extracted from the depth map based on the position coordinates of the rectangular frame region and the depth data of each pixel point in the depth map aligned with the RGB image. In specific implementation, each pixel point may correspond to a position coordinate, and the position coordinate includes a row coordinate, a vertical coordinate, and a depth coordinate in the depth map. Wherein, the value corresponding to the depth coordinate can be used as the depth data of the pixel point.

Therefore, the depth data of the rectangular frame region where the interaction target is located can be accurately extracted from the depth map based on a human body detection algorithm.

Step 106: and scanning the rectangular frame region in rows or columns based on a preset first tolerance, and counting the depth data of the highest frequency in each row or each column.

An implementation scheme, in step 106, when performing row or column scanning on the rectangular frame region based on a preset first tolerance, and counting the depth data of the highest frequency in each row or each column, specifically includes: based on a preset first tolerance, performing row scanning or column scanning on the rectangular frame region in sequence, and determining the frequency of depth data in each row or each column; and counting the depth data corresponding to the highest frequency in each row or column.

In specific implementation, if a line is scanned first, for each line of depth data, if the currently scanned depth data is the first depth data of the currently located line, counting one frequency for the depth data; if not, judging whether the difference between the currently scanned depth data and the depth data corresponding to the previous counting is smaller than a preset first tolerance or not, if so, accumulating a frequency for the depth data corresponding to the previous counting, and if not, counting a frequency for the depth data; until the current line is scanned; alternatively, the first and second electrodes may be,

if the column is scanned first, counting one frequency for each column of depth data if the currently scanned depth data is the first depth data of the currently located column; if not, judging whether the difference between the currently scanned depth data and the depth data corresponding to the previous counting is smaller than a preset first tolerance or not, if so, accumulating a frequency for the depth data corresponding to the previous counting, and if not, counting a frequency for the depth data; until the current column is scanned.

For example, the pixel points in the rectangular frame region may be line-scanned, the frequency of occurrence of each depth data in each line is counted, and the depth data corresponding to the highest frequency in each line is counted. Assuming that the rectangular frame area has 10 rows and 10 columns, respectively, each row is scanned, and the depth data of 1-10 columns (unit may be mm or other, and is used here as an example and not a limitation): 1713. 1731, 1750, 1761, 1767, 1782, 1765, 1745, 1795, 1789. If the tolerance is set to 2cm, then the process may be: scanning the first depth data 1713, wherein the accumulation frequency is 1; scanning a second depth data 1731, wherein the tolerance with the first depth data is within 2cm, if the tolerance condition is met, the accumulated frequency is 2, and at this time, calculating the depth data once as (1731 × 1+1713 × 1)/2 ═ 1722; scanning a third depth data 1750, wherein the tolerance with the calculated depth data 1722 is more than 2cm, and the single counting frequency is 1; scanning a fourth depth data 1761, wherein the tolerance with the third depth data 1750 is less than 2, the accumulated frequency is 2, and the corresponding depth data (1750 × 1+1761 × 1)/2 is 1755.5; the fifth depth data 1767 is scanned with a tolerance of less than 2 from the calculated depth data 1755.5, the cumulative frequency is 3, and the corresponding depth value (1755.5 x 2+1767 x 1)/3 is 1759.3, and so on, and finally the scan determines that the depth data appearing the highest frequency in the first row is 1759.3, and the frequency of appearance is 3. Scanning the remaining 2-10 rows separately in this manner, it can be finally determined: the depth data with the highest frequency of occurrence in the second row is 1726, the frequency of occurrence is 5, the depth data with the highest frequency of occurrence in the third row is 1751, the frequency of occurrence is 2, the depth data with the highest frequency of occurrence in the fourth row is 1754.5, the frequency of occurrence is 4, the depth data with the highest frequency of occurrence in the fifth row is 1756.5, the frequency of occurrence is 4, the depth data with the highest frequency of occurrence in the sixth row is 1764.5, the frequency of occurrence is 3, the depth data with the highest frequency of occurrence in the seventh row is 1774.5, the frequency of occurrence is 6, the depth data with the highest frequency of occurrence in the eighth row is 1754.5, the frequency of occurrence is 4, the depth data with the highest frequency of occurrence in the ninth row is 1756.5, the frequency of occurrence is 3, the depth data with the highest frequency of occurrence in the tenth row is 1756.5, and the frequency of occurrence is 2.

The above example is a scan-first behavior example, and the manner of scanning rows is similar, which is not described herein. It should be understood that in the statistics of the depth data, the frequency of occurrence of each depth data is counted at the same time. In addition, in the embodiment of the present specification, the depth data obtained by statistics may be scanned real depth data, or may be an average value obtained by weighting processing based on at least two pieces of scanned real depth data.

After the depth data corresponding to the highest frequency in each row or column is counted in step 106, considering that there are many dirty data due to the non-rigidity of the human body and the particularity of the human body structure, a subsequent series of analysis cannot be performed, and therefore, in order to ensure the accuracy of the scanning result, the method further includes:

step 108: and filtering the depth data obtained by statistics.

In the embodiments of the present description, the specific manner of filtering may include: filtering out depth data with the ratio of the frequency to the number of columns in the depth data obtained by counting all rows smaller than a threshold value; or filtering out the depth data with the ratio of the frequency to the row number smaller than the threshold value in the depth data obtained by counting all the columns.

For example, if the threshold is 50%, if the frequency ratio of the depth data obtained by counting the scanning results appearing in the scanning rows or columns is smaller than the threshold, the depth data is invalid and can be filtered out, so that the accuracy of the scanning results obtained by counting is ensured, the subsequent processing of invalid data is reduced, and the processing efficiency is improved.

It should be understood that after the line scanning is performed first, the counted number of depth data is the total number of scanned lines, and after the column scanning is performed first, the counted number of depth data is the total number of scanned columns.

Step 110: and scanning the filtered depth data based on a preset second tolerance, and counting the depth data with the highest frequency and the next highest frequency.

In the concrete implementation: counting a frequency for the depth data if the currently scanned depth data is the first depth data in the filtered depth data; if the currently scanned depth data is not the first depth data in the filtered depth data, judging whether the difference between the currently scanned depth data and the depth data corresponding to the last counting is smaller than a preset second tolerance, if so, accumulating a frequency for the depth data corresponding to the last counting, and if not, counting a frequency for the depth data until the depth data obtained by counting is scanned; and (5) counting the depth data with the highest frequency and the next highest frequency after the scanning.

In another embodiment of the present specification, when depth data with the highest frequency and the second highest frequency after the current scanning is counted, the depth data obtained through counting may be combined into a one-dimensional vector according to a statistical manner during the first line scanning or the first column scanning, and the depth data may be sorted according to the frequency of occurrence of each depth data in the one-dimensional vector from high to low.

Still based on the above example, the depth data with the frequency less than or equal to 2 in the depth data obtained by statistics is filtered out: [1759.3,1726,1754.5,1756.5,1764.5,1774.5,1754.5,1756.5], then, the line scan is performed again according to the above example, and assuming that the preset second tolerance of the scan is also 2cm, the calculated frequency of 1763.5 occurrence is 5 times, the frequency is the highest, the frequency of 1759.3 occurrence is 2, the frequency is the high, and the frequency of 1726 occurrence is 1, the frequency is the lowest, a one-dimensional vector [1759.3,1726,1763.5] can be formed, and then, the sequence can be performed according to the frequency, so as to obtain: f (1763.5) > f (1759.3) > f (1726), where f (x) represents frequency and x represents depth data, and where f (x) may specifically represent frequency of the depth data x.

It should be understood that the above examples are only illustrative, and in a specific embodiment, the amount of depth data is large, and the number of rows and columns is not limited to the embodiment.

Step 112: and judging whether the size ratio of the rectangular frame region meets a preset human body size ratio. If so, go to step 114; otherwise, step 116 is performed.

In specific implementation, it may be determined whether the aspect ratio of the rectangular frame region meets a preset human body aspect ratio, or whether the aspect ratio of the rectangular frame region meets the preset human body aspect ratio.

It should be understood that in the embodiments of the present specification, the preset body aspect ratio or the preset body aspect ratio may be set based on an empirical value, and specific values are not exemplified here.

Step 114: and determining the depth data with the highest frequency in the finally counted depth data as the distance between the interactive target and the robot.

In the embodiment of the present specification, if the aspect ratio of the rectangular frame region satisfies the preset human body aspect ratio, it indicates that the interaction target in the rectangular frame region is in a normal pose (standing), and there is no abnormal motion, at this time, the highest frequency depth data in the depth data obtained through statistics is taken as the actual distance, and based on the above example, the highest frequency 50 may be taken as the distance between the interaction target and the robot.

Step 116: and determining the depth data with the second highest frequency in the finally counted depth data as the distance between the interactive target and the robot.

On the contrary, if the aspect ratio of the rectangular frame region does not satisfy the preset human body aspect ratio, it indicates that the interaction target in the rectangular frame region is an abnormal pose (for example, a waist, a sitting posture, a hand stretching, a leg stretching, an arm waving, and the like), at this time, the depth data with the second highest frequency in the depth data obtained by statistics is taken as the actual distance, and based on the above example, the 52 with the second highest frequency can be taken as the distance between the interaction target and the robot.

The scheme for determining the distance between the interactive person and the robot in the specification is described in two scanning modes respectively.

Referring to fig. 2a, taking the example of performing line scanning on the rectangular frame area first, the method may include the following steps:

step 202 a: and acquiring an RGB image of the interactive person.

Step 204 a: and determining a rectangular frame area of the interactive person in the RGB image based on a human body detection algorithm.

Step 206 a: and acquiring a depth map aligned with the RGB image.

Step 208 a: and extracting the depth data of the rectangular frame area from the depth map.

Step 210 a: and scanning lines of the rectangular frame area, counting the occurrence frequency of each depth data in each line, and taking the depth data with the highest occurrence frequency as a scanning result.

Step 212 a: invalid data in the scanning result is filtered.

Step 214 a: and scanning the filtered depth data again, counting the frequency of each depth data in the scanning, and sequencing according to the frequency.

Step 216 a: the aspect ratio of the rectangular box area is calculated.

Step 218 a: and judging whether the calculated aspect ratio meets the normal aspect ratio. If so, step 220a is performed, otherwise, step 222a is performed.

Step 220 a: and taking the depth data of the highest frequency as the distance between the interactive person and the robot.

Step 222 a: and taking the depth data of the second highest frequency as the distance between the interactive person and the robot.

Referring to fig. 2b, taking the row-column scanning of the rectangular frame area as an example, the method may include the following steps:

step 202 b: and acquiring an RGB image of the interactive person.

Step 204 b: and determining a rectangular frame area of the interactive person in the RGB image based on a human body detection algorithm.

Step 206 b: and acquiring a depth map aligned with the RGB image.

Step 208 b: and extracting the depth data of the rectangular frame area from the depth map.

Step 210 b: and performing line scanning on the rectangular frame area, counting the occurrence frequency of each depth data in each line, and taking the depth data with the highest occurrence frequency as a scanning result.

Step 212 b: invalid data in the scanning result is filtered.

Step 214 b: and scanning the filtered depth data again, counting the frequency of each depth data in the scanning, and sequencing according to the frequency.

Step 216 b: the aspect ratio of the rectangular box area is calculated.

Step 218 b: and judging whether the calculated aspect ratio meets the normal aspect ratio. If so, step 220b is performed, otherwise, step 222b is performed.

Step 220 b: and taking the depth data of the highest frequency as the distance between the interactive person and the robot.

Step 222 b: and taking the depth data of the second highest frequency as the distance between the interactive person and the robot.

It should be understood that in the embodiments of the present specification, the order in which the methods are performed is not in accordance with the sequence numbers of the steps.

In the embodiment of the specification, a human body detection algorithm is adopted to determine a rectangular frame area where an interactive target is located, then depth data of the rectangular frame area is extracted from a depth map, then analysis complexity is reduced according to a line scanning mode or a column scanning mode, the depth data obtained by scanning statistics are filtered, further, the distance between the interactive target and a robot is accurately determined according to the occurrence frequency of the depth data obtained after filtering, and the method can be suitable for multiple poses and scenes while the distance is collected and determined in verification.

Example two

Referring to fig. 3, a schematic structural diagram of an apparatus 300 for determining a distance between an interaction target and a robot according to an embodiment of the present disclosure is provided, where the apparatus 300 may include the following modules:

an obtaining module 302, configured to obtain an image of an interaction target in a shooting scene and a depth map aligned with the image;

an extracting module 304, configured to extract depth data of a rectangular frame region where the interaction target is located based on the image and the depth map;

a counting module 306, configured to perform row or column scanning on the rectangular frame region based on a preset first tolerance, and count depth data of a highest frequency in each row or each column;

a filtering module 308; the depth data processing module is used for filtering the depth data obtained by statistics after the depth data corresponding to the highest frequency in each row or column is counted by the statistics module 306;

the counting module 306 is further configured to scan the filtered depth data based on a preset second tolerance, and count the depth data with the highest frequency and the next highest frequency;

a judging module 310, configured to judge whether a size ratio of the rectangular frame region meets a preset human body size ratio;

a determining module 312, configured to determine, when the determination result of the determining module is yes, the depth data with the second highest frequency in the finally counted depth data as the distance between the interaction target and the robot; and the number of the first and second groups,

Optionally, in an implementation scheme, when the extraction module 304 extracts the depth data of the rectangular frame region where the interaction target is located based on the image and the depth map, specifically configured to:

determining a rectangular frame area corresponding to the interactive target from the image by using a human body detection algorithm; extracting depth data for the rectangular-box region from a depth map aligned with the image.

In an implementation of the present specification, the statistics module 306, when performing a row or column scan on the rectangular frame region based on a preset first tolerance, and performing statistics on the depth data of the highest frequency in each row or each column, may specifically be configured to:

performing line scanning or column scanning on the rectangular frame region based on a preset first tolerance, and determining the frequency of depth data in each line or each column; and counting the depth data corresponding to the highest frequency in each row or column.

In another implementation of the present disclosure, the statistical module 306, when performing line scanning or column scanning on the rectangular frame area in sequence based on a preset first tolerance, and determining the frequency of the depth data in each line or each column, is specifically configured to:

for each line of depth data, if the currently scanned depth data is the first depth data of the currently located line, counting one frequency for the depth data; if not, judging whether the difference between the currently scanned depth data and the depth data corresponding to the previous counting is smaller than a preset first tolerance or not, if so, accumulating a frequency for the depth data corresponding to the previous counting, and if not, counting a frequency for the depth data; until the current line is scanned; alternatively, the first and second electrodes may be,

for each column of depth data, counting one frequency for the depth data if the currently scanned depth data is the first depth data of the currently located column; if not, judging whether the difference between the currently scanned depth data and the depth data corresponding to the previous counting is smaller than a preset first tolerance or not, if so, accumulating a frequency for the depth data corresponding to the previous counting, and if not, counting a frequency for the depth data; until the current column is scanned.

In another implementation solution in this specification, when the filtering module 308 filters the depth data obtained by statistics, it is specifically configured to:

filtering out depth data with the ratio of the frequency to the number of columns in the depth data obtained by counting all rows smaller than a threshold value; or filtering out the depth data with the ratio of the frequency to the row number smaller than the threshold value in the depth data obtained by counting all the columns.

In another implementation of the present specification, the statistics module 306 is specifically configured to, when scanning the filtered depth data based on a preset second tolerance and counting the depth data with the highest frequency and the next highest frequency:

counting a frequency for the depth data if the currently scanned depth data is the first depth data in the filtered depth data; if the currently scanned depth data is not the first depth data in the filtered depth data, judging whether the difference between the currently scanned depth data and the depth data corresponding to the last counting is smaller than a preset second tolerance, if so, accumulating a frequency for the depth data corresponding to the last counting, and if not, counting a frequency for the depth data until the depth data obtained by counting is scanned; and (5) counting the depth data with the highest frequency and the next highest frequency after the scanning.

In yet another implementation of this specification, the apparatus 300 further comprises: a sorting module; the sorting module is used for sorting the depth data obtained by statistics based on the frequency after the depth data obtained by statistics is filtered by the statistics module.

In another implementation solution in this specification, when determining whether the size ratio of the rectangular frame region satisfies the preset human body size ratio, the determining module 310 is specifically configured to:

and judging whether the aspect ratio of the rectangular frame region meets a preset human body aspect ratio or not, or judging whether the aspect ratio of the rectangular frame region meets the preset human body aspect ratio or not.

In the embodiment of the specification, a human body detection algorithm is adopted to determine a rectangular frame area where an interactive target is located, then depth data of the rectangular frame area is extracted from a depth map, then analysis complexity is reduced according to a line scanning mode or a column scanning mode, the depth data obtained by scanning statistics are filtered, further, the distance between the interactive target and a robot is accurately determined according to the occurrence frequency of the depth data obtained by filtering, and the method can be applied to various poses and scenes while the distance is collected and determined in verification.

EXAMPLE III

The electronic apparatus of the embodiment of the present specification is described in detail below with reference to fig. 4. Referring to fig. 4, at a hardware level, the electronic device includes a processor, and optionally further includes an internal bus, a network interface, and a memory. The Memory may include a Memory, such as a Random-Access Memory (RAM), and may further include a Non-Volatile Memory (Non-Volatile Memory), such as at least 1 disk Memory. Of course, the electronic device may also include hardware required for other services.

The processor, the network interface, and the memory may be interconnected by an internal bus, which may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended ISA (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one double-headed arrow is shown in FIG. 4, but that does not indicate only one bus or one type of bus.

And the memory is used for storing programs. In particular, the program may include program code comprising computer operating instructions. The memory may include both memory and non-volatile storage and provides instructions and data to the processor.

The processor reads the corresponding computer program from the nonvolatile memory into the memory and then runs the computer program to form a device for determining the distance between the interaction target and the robot on a logic level. And the processor executes the program stored in the memory and is particularly used for executing the method operation executed when the device for determining the distance between the interaction target and the robot is taken as an execution main body.

The method disclosed in the embodiments of fig. 1-2 b in the present specification can be applied to a processor or implemented by a processor. The processor may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in a processor or instructions in the form of software. The Processor may be a general-purpose Processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; but also Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) or other Programmable logic devices, discrete Gate or transistor logic devices, discrete hardware components. The various methods, steps and logic blocks disclosed in the embodiments of the present specification may be implemented or performed. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like. The steps of a method disclosed in connection with the embodiments of the present specification may be embodied directly in a hardware decoding processor, or in a combination of hardware and software modules in the decoding processor. The software module may be located in ram, flash memory, rom, prom, or eprom, registers, etc. storage media as is well known in the art. The storage medium is located in a memory, and a processor reads information in the memory and completes the steps of the method in combination with hardware of the processor.

The electronic device may further perform the method shown in fig. 1-2 b, and implement the function of the apparatus for determining the distance between the interaction target and the robot in the embodiment shown in fig. 1-2 b, which is not described herein again in this specification.

Of course, besides the software implementation, the electronic device of the embodiment of the present disclosure does not exclude other implementations, such as a logic device or a combination of software and hardware, and the like, that is, the execution subject of the following processing flow is not limited to each logic unit, and may also be hardware or a logic device.

Example four

The present specification embodiments also provide a computer-readable storage medium storing one or more programs that, when executed by an electronic device including a plurality of application programs, cause the electronic device to perform operations comprising:

filtering the depth data obtained by statistics;

scanning the filtered depth data based on a preset second tolerance, and counting the depth data with the highest frequency and the next highest frequency;

if so, determining the second highest depth data in the finally filtered depth data as the distance between the interactive target and the robot;

otherwise, determining the depth data with the second highest frequency in the finally filtered depth data as the distance between the interactive target and the robot.

The computer-readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk.

In short, the above description is only a preferred embodiment of the present disclosure, and is not intended to limit the scope of the present disclosure. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the embodiments of the present disclosure should be included in the protection scope of the embodiments of the present disclosure.

The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. One typical implementation device is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.

Computer-readable media, including both non-transitory and non-transitory, removable and non-removable media, may implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, a computer readable medium does not include a transitory computer readable medium such as a modulated data signal and a carrier wave.

It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.

The embodiments in the present specification are all described in a progressive manner, and the same and similar parts among the embodiments can be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the system embodiment, since it is substantially similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment.

Claims

1. A method of determining a distance between an interaction target and a robot, the method comprising:

filtering the depth data obtained by statistics;

2. The method of claim 1, wherein extracting depth data of a rectangular frame region where the interaction target is located based on the image and the depth map specifically comprises:

determining a rectangular frame area corresponding to the interactive target from the image by using a human body detection algorithm;

extracting depth data for the rectangular-box region from a depth map aligned with the image.

3. The method of claim 1, wherein scanning the rectangular frame area in rows or columns based on a preset first tolerance, and counting the depth data of the highest frequency in each row or each column specifically comprises:

based on a preset first tolerance, performing row scanning or column scanning on the rectangular frame region in sequence, and determining the frequency of depth data in each row or each column;

and counting the depth data corresponding to the highest frequency in each row or column.

4. The method of claim 3, wherein the step of sequentially scanning the rectangular frame area in rows or columns based on a preset first tolerance to determine the frequency of the depth data in each row or each column comprises:

5. The method of claim 4, wherein filtering the statistically derived depth data comprises:

filtering out depth data with the ratio of the frequency to the number of columns in the depth data obtained by counting all rows smaller than a threshold value; alternatively, the first and second electrodes may be,

and filtering out the depth data of which the ratio of the frequency to the row number in the depth data obtained by counting all the columns is less than a threshold value.

6. The method of claim 5, wherein the filtered depth data is scanned based on a predetermined second tolerance, and the statistics of the most frequent and the second most frequent depth data includes:

7. The method of claim 1, wherein determining whether the size ratio of the rectangular frame region satisfies a preset human body size ratio specifically comprises:

judging whether the aspect ratio of the rectangular frame region meets the preset human body aspect ratio or not, or,

and judging whether the aspect ratio of the rectangular frame region meets the preset human body aspect ratio.

8. An apparatus for determining a distance between an interaction target and a robot, the apparatus comprising:

9. An electronic device, comprising: memory, processor and computer program stored on the memory and executable on the processor, the computer program executing on the processor the method according to any of claims 1-7.

10. A computer readable storage medium storing one or more programs which, when executed by an electronic device including a plurality of application programs, cause the electronic device to perform the method of any of claims 1-7.