CN112146578A - Scale ratio calculation method, device, equipment and storage medium - Google Patents

Scale ratio calculation method, device, equipment and storage medium

Info

Publication number
CN112146578A
Authority
CN
China
Prior art keywords: acceleration, object images, time, calculating, frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910572929.4A
Other languages
Chinese (zh)
Other versions
CN112146578B (en)
Inventor
李元伟
陈紫荣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SF Technology Co Ltd
Shenzhen SF Taisen Holding Group Co Ltd
Original Assignee
SF Technology Co Ltd
Shenzhen SF Taisen Holding Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by SF Technology Co Ltd, Shenzhen SF Taisen Holding Group Co Ltd filed Critical SF Technology Co Ltd
Priority to CN201910572929.4A
Publication of CN112146578A
Application granted
Publication of CN112146578B
Legal status: Active

Classifications

    • G - PHYSICS
    • G01 - MEASURING; TESTING
    • G01B - MEASURING LENGTH, THICKNESS OR SIMILAR LINEAR DIMENSIONS; MEASURING ANGLES; MEASURING AREAS; MEASURING IRREGULARITIES OF SURFACES OR CONTOURS
    • G01B 11/00 - Measuring arrangements characterised by the use of optical techniques
    • G01B 11/02 - Measuring arrangements characterised by the use of optical techniques for measuring length, width or thickness
    • G01B 11/04 - Measuring arrangements specially adapted for measuring length or width of objects while moving
    • G01B 11/043 - Measuring arrangements specially adapted for measuring length of objects while moving
    • G01B 11/046 - Measuring arrangements specially adapted for measuring width of objects while moving
    • G01B 11/06 - Measuring arrangements for measuring thickness, e.g. of sheet material
    • G01B 11/0608 - Height gauges

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Image Analysis (AREA)

Abstract

The embodiments of the application disclose a method, an apparatus, a device, and a storage medium for calculating a scale ratio. Multiple frames of object images are acquired, and a first acceleration corresponding to each moment of the object images is collected; key points of the object in the multiple frames are identified, and a second acceleration is calculated from those key points and the times at which the images were acquired; the first and second accelerations are then matched based on time, and the scale ratio is calculated from the matched first and second accelerations. The accuracy of the scale ratio calculation is thereby improved.

Description

Scale ratio calculation method, device, equipment and storage medium
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a method, an apparatus, a device, and a storage medium for calculating a scale ratio.
Background
Currently, the logistics industry is shifting toward intelligent, precise operation, and obtaining accurate information more efficiently and at lower cost has become an urgent need of the industry.
In the field of logistics, the weight and volume of goods are the most important billing data. For a long time, the industry has still measured volume with the original tape-measure method. Some intelligent software solutions now exist, but during measurement they generally take the reflected-light length, width, and height signals directly as the length, width, and height of the goods, and therefore cannot calculate the error between the true side length of the goods and the measured side length.
Disclosure of Invention
The embodiment of the application provides a method, a device, equipment and a storage medium for calculating a scale ratio, which can improve the accuracy of the calculation of the scale ratio.
In a first aspect, an embodiment of the present application provides a scale ratio calculation method, including:
acquiring a plurality of frames of object images and acquiring a first acceleration corresponding to each moment of the object images;
performing key point identification on the object in the multiple frames of object images, and calculating a second acceleration based on the key points of the multiple frames of object images and the time of acquiring the object images;
and matching the first acceleration and the second acceleration based on time, and calculating a scale ratio according to the matched first acceleration and second acceleration.
In some embodiments, the matching the first acceleration and the second acceleration based on time, and calculating a scale ratio from the matched first acceleration and second acceleration, includes:
acquiring the ratio of the acquisition frequencies of the shooting module and the inertia measurement module;
performing down-sampling processing on the first acceleration according to the ratio of the acquisition frequency to obtain a down-sampled first acceleration;
and matching the first acceleration and the second acceleration after the down sampling based on time, and calculating a scale ratio according to the first acceleration and the second acceleration after the matching.
In some embodiments, the matching the down-sampled first acceleration and the second acceleration based on time, and calculating a scale ratio from the matched first acceleration and second acceleration, includes:
acquiring a first angular velocity generated by an inertial measurement module in a three-dimensional space;
performing visual tracking on the multiple frames of object images to obtain a second angular velocity;
matching the first acceleration and the second acceleration after the down sampling based on time to obtain the acceleration matched with the time;
and calculating a scale ratio according to the acceleration matched with the moment based on the first angular velocity and the second angular velocity.
In some embodiments, the matching the down-sampled first acceleration and the second acceleration based on time, to obtain time-matched accelerations, includes:
extracting the collection time corresponding to each of the down-sampled first accelerations, and each time recorded when the shooting module captured the multiple frames of object images;
and matching the timestamps of the collection times of the down-sampled first accelerations against the timestamps recorded when the shooting module captured the multiple frames of object images, to obtain the scale ratio at each matched time.
In some embodiments, the calculating a second acceleration based on the key point of the plurality of frames of object images and the time of acquiring the object images includes:
acquiring shooting moments of two adjacent frames of object images;
calculating the shooting time difference value of the two adjacent frames of object images and the movement value of the object according to the shooting time of the two adjacent frames of object images;
calculating the speed difference of the object in two adjacent frames of object images according to the movement value and the shooting time difference;
and calculating a second acceleration according to the speed difference of the object in the two adjacent frames of object images and the shooting time difference of the two adjacent frames of object images.
In some embodiments, after acquiring the plurality of frames of object images, the method further includes:
converting a multi-frame object image into a gray image to obtain a multi-frame gray image;
performing convolution operation on each frame of gray image and a preset Laplace kernel to obtain a multi-frame response image;
and calculating the variance of each frame of response image, and screening out the object images corresponding to the variance which is greater than or equal to a preset threshold value to obtain the screened multi-frame object images.
In some embodiments, after the calculating the variance of each response image and screening out the object images corresponding to the variance greater than or equal to the preset threshold to obtain the screened multiple frames of object images, the method further includes:
acquiring a preset gray value;
calculating the gray value of each pixel point of the screened multi-frame object image;
subtracting the gray value of each pixel point of the screened multi-frame object image from a preset gray value to obtain a gray difference value of each pixel point of the multi-frame object image;
taking the gray difference value of each pixel point of the multi-frame object image as the gray value of each pixel point of the multi-frame object image to obtain a processed multi-frame object image;
the identifying key points of the objects in the multiple frames of object images comprises:
and identifying key points of the objects in each frame of object image in the processed multiple frames of object images.
In a second aspect, an embodiment of the present application further provides a scale ratio calculation apparatus, including:
the acquisition unit is used for acquiring a plurality of frames of object images and acquiring a first acceleration corresponding to each moment of the object images;
the identification unit is used for identifying key points of objects in the multi-frame object images and calculating a second acceleration based on the key points of the multi-frame object images and the time for acquiring the object images;
and the calculating unit is used for matching the first acceleration and the second acceleration based on time and calculating a scale ratio according to the matched first acceleration and second acceleration.
In some embodiments, the computing unit includes:
the first acquisition subunit is used for acquiring a ratio of the acquisition frequencies of the shooting module and the inertia measurement module;
the down-sampling sub-unit is used for carrying out down-sampling processing on the first acceleration according to the ratio of the acquisition frequency to obtain the down-sampled first acceleration;
and the calculating subunit is used for matching the first acceleration and the second acceleration after the down sampling based on time, and calculating a scale ratio according to the first acceleration and the second acceleration after the matching.
In some embodiments, the calculation subunit includes:
the acquisition module is used for acquiring a first angular velocity generated by the inertia measurement module in a three-dimensional space;
the visual tracking module is used for carrying out visual tracking on the multiple frames of object images to obtain a second angular velocity;
the matching module is used for matching the first acceleration and the second acceleration after the down sampling based on time to obtain the acceleration matched with the time;
and the calculation module is used for calculating a scale ratio according to the acceleration matched with the moment based on the first angular velocity and the second angular velocity.
In some embodiments, the matching module comprises:
the extraction submodule is used for extracting the collection time corresponding to each of the down-sampled first accelerations, and each time recorded when the shooting module captured the multiple frames of object images;
and the matching submodule is used for matching the timestamps of the collection times of the down-sampled first accelerations against the timestamps recorded when the shooting module captured the multiple frames of object images, to obtain the scale ratio at each matched time.
In some embodiments, the identification unit includes:
the second acquisition subunit is used for acquiring the shooting time of two adjacent frames of object images;
the first calculating subunit is used for calculating the shooting time difference value of the two adjacent frames of object images and the movement value of the object according to the shooting time of the two adjacent frames of object images; calculating the speed difference of the object in two adjacent frames of object images according to the movement value and the shooting time difference; and calculating a second acceleration according to the speed difference of the object in the two adjacent frame object images and the shooting time difference of the two adjacent frame object images.
In some embodiments, the obtaining unit includes:
the conversion subunit is used for converting the multi-frame object image into a gray image to obtain a multi-frame gray image;
the convolution subunit is used for performing convolution operation on each frame of gray image and a preset Laplace kernel to obtain a multi-frame response image;
and the second calculating subunit is used for calculating the variance of each frame of response image, screening out the object images corresponding to the variance which is greater than or equal to the preset threshold value, and obtaining the screened multi-frame object images.
In some embodiments, the obtaining unit further includes:
the third acquisition subunit is used for acquiring a preset gray value;
the third calculation subunit is used for calculating the gray value of each pixel point of the screened multi-frame object image;
the subtraction subunit is configured to subtract the gray value of each pixel point of the screened multi-frame object image from a preset gray value to obtain a gray difference value of each pixel point of the multi-frame object image; taking the gray difference value of each pixel point of the multi-frame object image as the gray value of each pixel point of the multi-frame object image to obtain a processed multi-frame object image;
the identification unit includes:
and the identification subunit is used for identifying key points of the objects in each frame of object image in the processed multiple frames of object images.
In a third aspect, an embodiment of the present application further provides an apparatus, which includes a processor and a memory, where the memory stores program codes, and the processor executes the scale ratio calculation method as described above when calling the program codes in the memory.
In a fourth aspect, an embodiment of the present application further provides a storage medium, where the storage medium stores a plurality of instructions, and the instructions are suitable for being loaded by a processor to perform the scale ratio calculation method provided in the embodiment of the present application.
Multiple frames of object images are acquired, and a first acceleration corresponding to each moment of the object images is collected; key points of the object in the multiple frames are identified, and a second acceleration is calculated from those key points and the times at which the images were acquired; the first and second accelerations are then matched based on time, and the scale ratio is calculated from the matched accelerations. In this scheme, the second acceleration is calculated from multiple frames of object images captured continuously by the shooting module in the terminal, and comparing the calculated second acceleration with the first acceleration collected by the inertial measurement module yields the scale ratio, which improves the accuracy of the scale ratio calculation.
Drawings
In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described here are clearly only some embodiments of the present application; those skilled in the art may obtain other drawings from them without creative effort.
FIG. 1 is a schematic flow chart diagram of a scale ratio calculation method provided in an embodiment of the present application;
FIG. 2 is another schematic flow chart diagram of a scale ratio calculation method provided in an embodiment of the present application;
FIG. 3 is a schematic structural diagram of a scale ratio calculation apparatus provided in an embodiment of the present application;
fig. 4 is a schematic structural diagram of an apparatus provided in an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described below clearly and completely with reference to the drawings in the embodiments. The described embodiments are clearly only a part of the embodiments of the present application, not all of them. All other embodiments obtained by a person skilled in the art from the embodiments given herein without creative effort fall within the scope of protection of the present application.
Referring to fig. 1, fig. 1 is a schematic flow chart of a scale ratio calculation method according to an embodiment of the present application. The method may be executed by the scale ratio calculation apparatus provided in this embodiment, or by a device into which that apparatus is integrated, such as a terminal or a server; the apparatus may be implemented in hardware or in software. The device may be a smartphone, tablet computer, palmtop computer, notebook computer, stationary computer, server, or similar equipment on which a camera and an IMU (Inertial Measurement Unit) are installed. The scale ratio calculation method may include:
s101, acquiring multiple frames of object images and acquiring first acceleration corresponding to each moment of the object images.
Specifically, this embodiment can be applied to calculating the scale ratio when computing the volume of goods, boxes, or cabinets. When calculating the scale ratio of a box during volume calculation, the shooting module in the terminal can be used to acquire multiple frames of object images containing the box; the shooting module may be a camera built into the terminal. The acceleration is collected by the Inertial Measurement Unit (IMU) in the terminal to obtain the first acceleration, or by an acceleration sensor preset in the terminal. Further, to improve the accuracy of calculating the second acceleration from the multiple frames, the images containing the box can be obtained by shooting a video of the box with the shooting module and then parsing that video. The IMU measures the three-axis attitude angles (or angular rates) and the acceleration of an object, so the first acceleration at the time of shooting can be read directly from the terminal's IMU. Since what is mainly needed is the acceleration along the three orthogonal axes of the IMU coordinate system at each moment corresponding to the shooting, the acquired acceleration data can be passed through a low-pass filter to filter out redundant data.
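As an illustration of the filtering step just described, the following is a minimal sketch of one possible low-pass filter over three-axis IMU samples. It is not taken from the patent: the choice of a first-order exponential filter, the `alpha` value, and the input layout are all assumptions.

```python
import numpy as np

def low_pass_filter(accels: np.ndarray, alpha: float = 0.2) -> np.ndarray:
    """First-order low-pass filter over a sequence of 3-axis IMU accelerations.

    accels: array of shape (N, 3), one sample (ax, ay, az) per row.
    alpha:  smoothing factor in (0, 1]; smaller values filter more strongly.
    """
    filtered = np.empty_like(accels, dtype=float)
    filtered[0] = accels[0]
    for i in range(1, len(accels)):
        # Blend each new sample with the previous filtered value to
        # suppress high-frequency noise in the raw readings.
        filtered[i] = alpha * accels[i] + (1.0 - alpha) * filtered[i - 1]
    return filtered
```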
Further, step S101 may include:
acquiring a video shot by a shooting module in the terminal;
converting the analog signals in the video into image signals in a compressed state;
and decompressing the image signal in the compressed state to obtain a multi-frame object image of the video.
In the embodiment of the invention, in order to improve the accuracy of calculating the second acceleration of the object across the multiple frames of object images, shooting conditions can be set in the terminal before shooting. For example, if the shooting angle changes little after the terminal starts shooting the object or scene at a preset angle, and the terminal moves at a steady speed, the accuracy of the subsequently calculated second acceleration is higher. The shooting angle and the moving speed of the terminal during video capture can therefore be set in advance. During shooting, if the detected shooting angle or the terminal's moving speed is inconsistent with the set values, prompt information can be sent to the user to prompt an adjustment.
Since continuous multi-frame images need to be analyzed to obtain the second acceleration of the object, an analog video signal is used when the video is captured. After the video shot by the shooting module in the terminal is obtained, the analog signal in the video is converted from an electrical signal into an image signal in a compressed state. In the scheme of this embodiment, the method for converting the analog video signal into an image signal can be chosen according to the actual situation; the specific conversion method and algorithm are not limited. This conversion yields the image signal of the video, but in a compressed state; the compressed image signal must be further decompressed to obtain the multiple frames of object images of the video. The specific decompression method and algorithm are likewise not limited.
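The patent does not fix how the compressed image signal is decoded. As one concrete possibility, the sketch below uses OpenCV's `VideoCapture`, which performs the decompression internally, to obtain the frames and their capture timestamps; the reliance on OpenCV and the millisecond timestamp source are assumptions.

```python
import cv2

def video_to_frames(video_path: str):
    """Decode a captured video into individual frames with timestamps.

    Returns the decoded frames and, for each frame, its position in the
    stream in milliseconds, which later serves as the capture time.
    """
    capture = cv2.VideoCapture(video_path)
    frames, timestamps_ms = [], []
    while True:
        ok, frame = capture.read()
        if not ok:
            break  # end of stream
        timestamps_ms.append(capture.get(cv2.CAP_PROP_POS_MSEC))
        frames.append(frame)
    capture.release()
    return frames, timestamps_ms
```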
After the multiple frames of object images are obtained, in order to improve the accuracy of the scale ratio calculation, the collected object images may be screened in advance so that only the sharper images are processed further. Specifically, after step S101, the method includes:
converting a multi-frame object image into a gray image to obtain a multi-frame gray image;
performing convolution operation on each frame of gray image and a preset Laplace kernel to obtain a multi-frame response image;
and calculating the variance of each frame of response image, and screening out the object images corresponding to the variance which is greater than or equal to a preset threshold value to obtain the screened multi-frame object images.
First, the multiple object images are converted from three-primary-color (RGB) images into grayscale images, giving multiple grayscale images. A Laplacian transform is then applied to each grayscale image: for example, each grayscale image can be convolved with a preset Laplacian kernel to obtain the response images. The preset kernel can be set flexibly according to actual needs, for example a 3 × 3 Laplacian kernel.
The variance of each response image can then be calculated, and the object images whose variance is greater than or equal to a preset threshold are kept as valid images. The threshold can be set flexibly according to actual needs: a variance at or above the threshold indicates a sharper image, while a variance below it indicates a blurred image, so the blurred object images with variance below the threshold are removed, yielding the screened multiple frames of object images.
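A minimal sketch of this screening step follows, assuming OpenCV is available; the threshold value of 100.0 is an illustrative assumption, since the text only says the threshold can be set flexibly.

```python
import cv2

def screen_sharp_frames(frames, threshold: float = 100.0):
    """Keep only frames whose Laplacian-response variance meets the threshold.

    A low variance of the Laplacian response means few sharp edges, i.e. a
    blurred image, which is discarded.
    """
    kept = []
    for frame in frames:
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)  # color -> grayscale
        response = cv2.Laplacian(gray, cv2.CV_64F)      # 3x3 Laplacian aperture
        if response.var() >= threshold:
            kept.append(frame)
    return kept
```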
Further, after obtaining the multiple frames of object images after the screening, in order to further save the calculation amount, the processing may be performed on the multiple frames of object images after the screening, and then the key point identification is performed, specifically, after obtaining the multiple frames of object images after the screening, the method further includes:
acquiring a preset gray value;
calculating the gray value of each pixel point of the screened multi-frame object image;
subtracting the gray value of each pixel point of the screened multi-frame object image from a preset gray value to obtain a gray difference value of each pixel point of the multi-frame object image;
taking the gray difference value of each pixel point of the multi-frame object image as the gray value of each pixel point of the multi-frame object image to obtain a processed multi-frame object image;
the identifying key points of the objects in the multiple frames of object images comprises:
and identifying key points of the objects in each frame of object image in the processed multiple frames of object images.
First, a preset gray value is obtained; it may be 128, and the specific value can also be set according to the actual images, which is not limited here. The gray value of each pixel of the screened multi-frame object images is then calculated and subtracted from the preset gray value to obtain a gray difference for each pixel; that difference is taken as the pixel's new gray value, giving the processed multi-frame object images. This reduces the computation on the images and improves the efficiency of key point identification when the object key points are identified in each processed frame.
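The following sketch illustrates this preprocessing under stated assumptions: the text does not say how negative differences are handled, so the absolute difference is taken here, and the preset value 128 follows the example above.

```python
import numpy as np

def offset_gray(gray_image: np.ndarray, preset: int = 128) -> np.ndarray:
    """Replace each pixel with its difference from a preset gray value.

    gray_image: 2-D uint8 grayscale image.
    preset:     preset gray value; 128 follows the example in the text.
    The absolute difference keeps the result in the valid gray range
    (an assumption; the text leaves the sign convention open).
    """
    diff = np.abs(gray_image.astype(np.int16) - preset)
    return diff.astype(np.uint8)
```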
And S102, performing key point identification on the object in the multiple frames of object images, and calculating a second acceleration based on the key points of the multiple frames of object images and the time for acquiring the object images.
Key points of the object contained in each of the obtained frames are extracted to obtain the key points corresponding to each object image. Specifically, the key points can be extracted from each frame by trained networks, such as a target detection network and a pose estimation network, which are trained on sample images with the object key points labeled in order: the target detection network performs convolution operations on the object image to obtain a feature map, and the pose estimation network extracts the object key points from the feature map.
The target detection network and the pose estimation network can be trained in the terminal, or the terminal can directly obtain networks that were trained on other equipment. When training in the terminal, the terminal acquires multiple object sample images in which the two-dimensional coordinate positions of the object vertices are labeled; these labeled positions are the true two-dimensional coordinate positions. A Tiny-DSOD network and a CPM network then compute the two-dimensional coordinate positions of the object vertices in each sample image, giving predicted positions. The true and predicted positions can then be converged: by adjusting the parameters of the Tiny-DSOD network and the CPM network to appropriate values, the error between the true and predicted values is reduced, yielding the trained Tiny-DSOD and CPM networks. The Tiny-DSOD network and the CPM network can be connected in series to form a supervised learning network.
The terminal can process the screened object images through the trained Tiny-DSOD (Tiny Deeply Supervised Object Detection) network and the trained CPM (Convolutional Pose Machine) network. For example, the object image is convolved by the Tiny-DSOD network to obtain a feature map, e.g. through the network's seven depthwise separable convolution layers, and the CPM network then extracts the two-dimensional coordinate positions of the box vertices from the feature map.
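As a structural illustration only, the sketch below shows the two-stage inference described above with the networks as black-box callables; `detector` and `pose_estimator` are hypothetical placeholders standing in for the trained Tiny-DSOD and CPM networks, not real library APIs.

```python
import numpy as np

def extract_keypoints(frames, detector, pose_estimator):
    """Two-stage keypoint extraction: detection network, then pose network.

    detector:       callable mapping a frame to a feature map
                    (placeholder for the trained Tiny-DSOD network).
    pose_estimator: callable mapping a feature map to a (K, 2) array of
                    2-D vertex coordinates (placeholder for the CPM network).
    """
    keypoints_per_frame = []
    for frame in frames:
        feature_map = detector(frame)           # convolutional feature map
        vertices = pose_estimator(feature_map)  # predicted 2-D vertex positions
        keypoints_per_frame.append(np.asarray(vertices))
    return keypoints_per_frame
```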
After the key points of the multiple frames are obtained, the displacement between every two adjacent frames can be calculated from the key points of each frame. Specifically, matching points between every two object images are selected from the key points, for example the matching points between object image A and object image B together with their pixel coordinates, where the pixel coordinates are obtained by establishing a coordinate system in the object images. A numerical matrix of the matching points is then calculated from the matching points and their pixel coordinates; the movement value of the matching points is calculated from that matrix; and the second acceleration is calculated from the movement value and the difference in shooting time between every two adjacent frames.
That is, step S102 may specifically include:
acquiring shooting moments of two adjacent frames of object images;
calculating the shooting time difference value of the two adjacent frames of object images and the movement value of the object according to the shooting time of the two adjacent frames of object images;
calculating the speed difference of the object in two adjacent frames of object images according to the movement value and the shooting time difference;
and calculating a second acceleration according to the speed difference of the object in the two adjacent frames of object images and the shooting time difference of the two adjacent frames of object images.
After the movement values are obtained, each time at which the shooting device captured an image can be read, and the images are associated with their capture times; the time difference between two adjacent frames is then calculated from the capture times, and the second acceleration of the image is calculated from the time differences and the movement values. For example, suppose the shooting module is at position S1 at time t1, at position S2 at time t2, and at position S3 at time t3. The movement value L1 between S1 and S2 and the movement value L2 between S2 and S3 can be calculated; the time difference T1 is calculated from t1 and t2, and the time difference T2 from t2 and t3 (the rotation value may be 0). The moving speed V1 is calculated from L1 and T1, the moving speed V2 from L2 and T2, and the acceleration of the shooting module at time t2 is then calculated from T1, T2, V1, and V2.
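A minimal sketch of the worked example above, generalized to a whole sequence. The choice to divide the speed change by the mean of the two intervals is an assumption; the text gives the inputs (T1, T2, V1, V2) but not the exact formula.

```python
def second_accelerations(displacements, timestamps):
    """Estimate image-space accelerations from per-frame movement values.

    displacements: movement values L_i between frames i and i+1 (e.g. pixels).
    timestamps:    capture time of each frame in seconds; len(timestamps)
                   must equal len(displacements) + 1.
    Returns one acceleration estimate per interior frame.
    """
    accelerations = []
    for i in range(len(displacements) - 1):
        t1 = timestamps[i + 1] - timestamps[i]      # interval T1
        t2 = timestamps[i + 2] - timestamps[i + 1]  # interval T2
        v1 = displacements[i] / t1                  # speed V1 over T1
        v2 = displacements[i + 1] / t2              # speed V2 over T2
        # Speed change over the mean interval approximates the acceleration
        # at the middle frame's capture time.
        accelerations.append((v2 - v1) / ((t1 + t2) / 2.0))
    return accelerations
```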
S103, matching the first acceleration and the second acceleration based on time, and calculating a scale ratio according to the matched first acceleration and second acceleration.
Specifically, after the second acceleration is obtained, each time at which the IMU collected the first acceleration is also obtained. Each time corresponding to the second acceleration is then matched against each IMU collection time; at matched times, the first acceleration and the second acceleration belonging to the same moment are divided, which gives the scale ratio at the current time. That is, the collection time corresponding to each first acceleration and each time at which the shooting module captured the multiple frames of object images are extracted; the timestamps of the first-acceleration collection times are matched against the timestamps of the shooting times, and the first acceleration at a given moment is divided with the second acceleration at the same moment to obtain the acceleration ratio, namely the scale ratio, at each time.
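As an illustration of the matching-and-division step, here is a hedged sketch: nearest-timestamp matching with a tolerance is an assumption, since the text only requires that the matched times coincide, and the ratio direction (first acceleration over second) follows the order in which the text names them.

```python
import numpy as np

def scale_ratios(imu_times, imu_accels, frame_times, frame_accels,
                 tolerance: float = 0.005):
    """Match first (IMU) and second (image) accelerations by timestamp
    and divide them to obtain the scale ratio at each matched moment.

    Times are in seconds; tolerance is the maximum timestamp gap (s)
    still treated as "the same moment" (an assumed value).
    """
    imu_times = np.asarray(imu_times)
    ratios = {}
    for t, a_img in zip(frame_times, frame_accels):
        idx = int(np.argmin(np.abs(imu_times - t)))  # closest IMU sample
        if abs(imu_times[idx] - t) <= tolerance and a_img != 0:
            ratios[t] = imu_accels[idx] / a_img
    return ratios
```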
In a specific implementation process, the acquisition frequencies of the shooting module and the IMU differ; generally, the acquisition frequency of the shooting module is lower than that of the IMU. Therefore, in order to match the first acceleration collected by the IMU with the second acceleration of the object images, the first acceleration can be down-sampled. That is, step S103 includes:
acquiring the ratio of the acquisition frequencies of the shooting module and the inertia measurement module;
performing down-sampling processing on the first acceleration according to the ratio of the acquisition frequency to obtain a down-sampled first acceleration;
and calculating an acceleration ratio according to the down-sampled first acceleration and the second acceleration.
Specifically, the acquisition frequencies of the shooting module and the inertial measurement module are obtained and compared, which gives the ratio of the two acquisition frequencies. The first acceleration is then down-sampled according to that frequency ratio to obtain the down-sampled first acceleration, which is divided with the second acceleration to obtain the acceleration ratio.
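A minimal sketch of the down-sampling, assuming plain decimation by the rounded frequency ratio; the text fixes only that the ratio of the two acquisition frequencies drives the down-sampling, not the exact scheme.

```python
import numpy as np

def downsample_first_acceleration(imu_accels, imu_freq: float, cam_freq: float):
    """Down-sample IMU accelerations to the camera's (lower) frame rate.

    imu_freq, cam_freq: acquisition frequencies in Hz (e.g. 200 and 30).
    Keeps every step-th sample, where step is the rounded frequency ratio.
    """
    step = max(1, round(imu_freq / cam_freq))
    return np.asarray(imu_accels)[::step]
```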
Further, in order to improve the accuracy of the scale ratio calculation, the angular velocity may also be obtained during the calculation and used to validate the acceleration calculation. Specifically, the step of matching the down-sampled first acceleration and the second acceleration based on time, and calculating the scale ratio from the matched first and second accelerations, includes:
acquiring a first angular velocity generated by an inertial measurement module in a three-dimensional space;
performing visual tracking on the multiple frames of object images to obtain a second angular velocity;
matching the first acceleration and the second acceleration after the down sampling based on time to obtain the acceleration matched with the time;
and calculating a scale ratio according to the acceleration matched with the moment based on the first angular velocity and the second angular velocity.
Specifically, the inertial measurement module is an IMU (Inertial Measurement Unit), a device that measures an object's three-axis attitude angles (or angular velocities) and acceleration, so the first angular velocity at the moment the terminal's shooting module captures the box can be read directly from the terminal's IMU. The multiple frames of object images are then visually tracked to obtain the second angular velocity. In particular, the tracking can use the ORB-SLAM algorithm: its real-time matching and pose estimation form the tracking thread, while local object-image maintenance and mapping run in parallel as the local keyframe thread.
It can be understood that, in actual operation, the angle of the shooting module at any given moment is consistent with the angle of the terminal's IMU. After the first angular velocity and the second angular velocity are obtained, they can therefore be compared to ensure the accuracy of the data, and the scale ratio is calculated from the time-matched accelerations only when the first angular velocity is determined to be consistent with the second angular velocity.
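The text does not define "consistent" numerically; the sketch below assumes a simple norm test with a tolerance, both of which are illustrative choices.

```python
import numpy as np

def angular_velocities_consistent(w_imu, w_visual, tolerance: float = 0.05) -> bool:
    """Check that the IMU and visually tracked angular velocities agree.

    w_imu, w_visual: 3-component angular velocities (rad/s) at the same moment.
    tolerance:       maximum allowed difference norm (assumed value).
    Only when this check passes is the scale ratio computed from the
    time-matched accelerations.
    """
    return float(np.linalg.norm(np.asarray(w_imu) - np.asarray(w_visual))) <= tolerance
```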
Further, after the first angular velocity is determined to be consistent with the second angular velocity, the ratio between the shooting module's acceleration and the IMU's acceleration at each corresponding moment of the shooting process needs to be calculated. Specifically, the step of calculating an acceleration ratio from the down-sampled first acceleration and the second acceleration includes:
extracting the collection time corresponding to each of the down-sampled first accelerations, and each time recorded when the shooting module captured the multiple frames of object images;
and matching the timestamps of the collection times of the down-sampled first accelerations against the timestamps recorded when the shooting module captured the multiple frames of object images, to obtain the scale ratio at each matched time.
That is, the collection time corresponding to each of the down-sampled first accelerations and each time at which the shooting module captured the multiple frames of object images are extracted; the two sets of timestamps are matched, and the first acceleration at a given moment is divided with the second acceleration at the same moment, which yields the acceleration ratio at each time.
In this embodiment, multiple frames of object images are obtained and a first acceleration corresponding to each moment of the object images is collected; key points of the object in the multiple frames are identified, and a second acceleration is calculated from those key points and the times at which the images were acquired; the first and second accelerations are matched based on time, and the scale ratio is calculated from the matched accelerations. In this scheme, the second acceleration is calculated from multiple frames of object images captured continuously by the shooting module in the terminal, and comparing the calculated second acceleration with the first acceleration collected by the inertial measurement module yields the scale ratio, improving the accuracy of the scale ratio calculation.
The scale ratio calculation method described in the above embodiment is explained in further detail below, taking the calculation of a box's volume as the scenario, with the terminal being a mobile phone and the object being a box.
Referring to fig. 2, fig. 2 is a flowchart of a scale ratio calculation method according to an embodiment of the present application.
S201, acquiring a box video shot by a camera in the mobile phone.
In the embodiment of the invention, in order to improve the accuracy of calculating the second acceleration of the box across the multiple frames of box images, shooting conditions can be set in the mobile phone before the box is shot. For example, if the shooting angle changes little after the phone starts shooting the box or the box scene at a preset angle, and the phone moves at a steady speed, the accuracy of the subsequently calculated second acceleration of the box images is higher; the shooting angle and the moving speed of the terminal during video capture can therefore be set in advance. During shooting, if the detected shooting angle or moving speed is inconsistent with the set values, prompt information can be sent to the user to prompt an adjustment. Since continuous multi-frame images need to be analyzed to obtain the second acceleration of the box, an analog video signal is used when the video is captured.
S202, converting the analog signals in the box video into image signals in a compressed state.
S203, decompressing the image signal in the compressed state to obtain a multi-frame box image of the video.
After the box video shot by the camera in the mobile phone is obtained, the analog signal in the box video is converted from an electrical signal into an image signal in a compressed state. This conversion yields the image signal of the video, but in a compressed state; the compressed image signal must be further decompressed to obtain the multiple frames of box images of the video.
And S204, converting the multi-frame box body image into a gray image to obtain a multi-frame gray image.
And S205, performing convolution operation on each frame of gray image and a preset Laplace kernel to obtain a multi-frame response image.
S206, calculating the variance of each frame of response image, and screening out the box body images corresponding to the variance which is greater than or equal to the preset threshold value to obtain the screened multi-frame box body images.
First, the box images are converted from three-primary-color (RGB, Red Green Blue) images into grayscale images, giving multiple grayscale images. A Laplacian transform is then applied to each grayscale image: for example, each grayscale image can be convolved with a preset Laplacian kernel to obtain the response images. The preset kernel can be set flexibly according to actual needs, for example a 3 × 3 Laplacian kernel.
The variance of each response image can then be calculated, and the box images whose variance is greater than or equal to a preset threshold are kept as valid images. The threshold can be set flexibly according to actual needs: a variance at or above the threshold indicates a sharper box image, while a variance below it indicates a blurred one, so the blurred box images with variance below the threshold are removed, yielding the screened multiple frames of box images.
And S207, acquiring a preset gray value.
And S208, calculating the gray value of each pixel point of the screened multi-frame box body image.
And S209, subtracting the gray value of each pixel point of the screened multi-frame box image from a preset gray value to obtain a gray difference value of each pixel point of the multi-frame box image.
S210, taking the gray difference value of each pixel point of the multi-frame box image as the gray value of each pixel point of the multi-frame box image, and obtaining the processed multi-frame box image.
First, a preset gray value is obtained; it may be 128, and the specific value can also be set according to the actual images, which is not limited here. The gray value of each pixel of the screened multi-frame box images is then calculated and subtracted from the preset gray value to obtain a gray difference for each pixel; that difference is taken as the pixel's new gray value, giving the processed multi-frame box images. This reduces the computation on the images and improves the efficiency of key point identification when the box key points are identified in each processed frame.
S211, identifying key points of the box body in each frame of box body image in the processed multi-frame box body image, and calculating a second acceleration based on the key points of the multi-frame box body image and the time of acquiring the box body image.
Key points of the box contained in each frame of the obtained multi-frame box images are extracted to obtain the key points corresponding to each box image. A trained target detection network and pose estimation network, trained on sample images with the box key points labeled in order, extract the key points of the box from each frame: the target detection network performs convolution operations on the box image to obtain a feature map, and the pose estimation network extracts the box key points from the feature map.
Specifically, matching points between every two frames of box images in the multi-frame box images are selected according to the key points, for example, matching points between the A box image and the B box image and pixel coordinates of the matching points are selected, wherein the pixel coordinates can be obtained by establishing a coordinate system in the box images. Then, calculating a numerical matrix of the matching points according to the matching points and the pixel coordinates; calculating the moving value of the matching point according to the numerical matrix of the matching point; and calculating a second acceleration according to the movement value and the difference value of the shooting time between every two adjacent frames of box body images.
S212, acquiring the ratio of the acquisition frequencies of the camera and the inertial measurement module.
S213, performing down-sampling processing on the first acceleration according to the ratio of the acquisition frequency to obtain the down-sampled first acceleration.
S214, matching the first acceleration and the second acceleration after the down sampling based on time, and calculating a scale ratio according to the first acceleration and the second acceleration after the matching.
In the specific implementation process, the acquisition frequencies of the shooting module and the IMU differ; in general, the shooting module's acquisition frequency is lower than the IMU's. To make it convenient to match the first acceleration collected by the IMU with the second acceleration of the box images, the first acceleration can be down-sampled. The specific processing is as follows: obtain the acquisition frequencies of the camera and the inertial measurement module and compare them, which gives the ratio of the two acquisition frequencies; down-sample the first acceleration according to that ratio to obtain the down-sampled first acceleration; and divide the down-sampled first acceleration with the second acceleration to obtain the acceleration ratio.
In this scheme, a second acceleration is calculated from multiple frames of box images captured continuously by the camera in the mobile phone, and comparing the calculated second acceleration with the first acceleration collected by the inertial measurement module yields the scale ratio, thereby improving the accuracy of the scale ratio calculation.
In order to better implement the scale ratio calculation method provided by the embodiments of the present application, an embodiment of the present application further provides a scale ratio calculation apparatus based on the above method. The terms have the same meanings as in the scale ratio calculation method above; for implementation details, refer to the description in the method embodiments.
Referring to fig. 3, fig. 3 is a schematic structural diagram of a scale ratio calculation apparatus according to an embodiment of the present disclosure, where the scale ratio calculation apparatus may include an obtaining unit 301, a recognition unit 302, a calculation unit 303, and the like.
Specifically, the scale ratio calculation means includes:
the acquisition unit is used for acquiring a plurality of frames of object images and acquiring a first acceleration corresponding to each moment of the object images;
the identification unit is used for identifying key points of objects in the multi-frame object images and calculating a second acceleration based on the key points of the multi-frame object images and the time for acquiring the object images;
and the calculating unit is used for matching the first acceleration and the second acceleration based on time and calculating a scale ratio according to the matched first acceleration and second acceleration.
In some embodiments, the computing unit includes:
the first acquisition subunit is used for acquiring a ratio of the acquisition frequencies of the shooting module and the inertia measurement module;
the down-sampling sub-unit is used for carrying out down-sampling processing on the first acceleration according to the ratio of the acquisition frequency to obtain the down-sampled first acceleration;
and the calculating subunit is used for matching the first acceleration and the second acceleration after the down sampling based on time, and calculating a scale ratio according to the first acceleration and the second acceleration after the matching.
In some embodiments, the calculation subunit includes:
the acquisition module is used for acquiring a first angular velocity generated by the inertia measurement module in a three-dimensional space;
the visual tracking module is used for carrying out visual tracking on the multiple frames of object images to obtain a second angular velocity;
the matching module is used for matching the first acceleration and the second acceleration after the down sampling based on time to obtain the acceleration matched with the time;
and the calculation module is used for calculating a scale ratio according to the acceleration matched with the moment based on the first angular velocity and the second angular velocity.
In some embodiments, the matching module comprises:
the extraction submodule is used for extracting the collection time corresponding to each of the down-sampled first accelerations, and each time recorded when the shooting module captured the multiple frames of object images;
and the matching submodule is used for matching the timestamps of the collection times of the down-sampled first accelerations against the timestamps recorded when the shooting module captured the multiple frames of object images, to obtain the scale ratio at each matched time.
In some embodiments, the identification unit includes:
the second acquisition subunit is used for acquiring the shooting time of two adjacent frames of object images;
the first calculating subunit is used for calculating the shooting time difference value of the two adjacent frames of object images and the movement value of the object according to the shooting time of the two adjacent frames of object images; calculating the speed difference of the object in two adjacent frames of object images according to the movement value and the shooting time difference; and calculating a second acceleration according to the speed difference of the object in the two adjacent frame object images and the shooting time difference of the two adjacent frame object images.
In some embodiments, the obtaining unit includes:
the conversion subunit is used for converting the multiple frames of object images into grayscale images to obtain multiple frames of grayscale images;
the convolution subunit is used for convolving each frame of grayscale image with a preset Laplacian kernel to obtain multiple frames of response images;
and the second calculating subunit is used for calculating the variance of each frame of response image and screening out the object images whose variance is greater than or equal to a preset threshold, obtaining the screened multiple frames of object images.
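This is the classic variance-of-Laplacian sharpness test: a blurred frame yields a flat Laplacian response and hence a low variance. A minimal OpenCV sketch, where the 3x3 kernel is one common choice of Laplacian kernel and the threshold of 100.0 is an assumed value:

    import cv2
    import numpy as np

    LAPLACIAN_KERNEL = np.array([[0,  1, 0],
                                 [1, -4, 1],
                                 [0,  1, 0]], dtype=np.float32)

    def screen_sharp_frames(frames_bgr, threshold=100.0):
        kept = []
        for img in frames_bgr:
            gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)        # to grayscale
            resp = cv2.filter2D(gray.astype(np.float32), -1, LAPLACIAN_KERNEL)
            if resp.var() >= threshold:                         # keep sharp frames
                kept.append(img)
        return kept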
In some embodiments, the obtaining unit further includes:
the third acquisition subunit is used for acquiring a preset gray value;
the third calculation subunit is used for calculating the gray value of each pixel of the screened multiple frames of object images;
the subtraction subunit is configured to subtract the gray value of each pixel of the screened multiple frames of object images from the preset gray value, obtaining a gray difference value for each pixel; the gray difference value of each pixel is then taken as the new gray value of that pixel, yielding the processed multiple frames of object images;
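Subtracting every pixel from a preset gray value inverts the image (with a preset of 255, dark objects become bright), which can make key points on dark objects easier to detect. A minimal sketch; the preset of 255 is an assumption:

    import numpy as np

    def invert_gray(gray_frames, preset=255):
        # New pixel value = preset gray value - original gray value.
        return [np.clip(preset - g.astype(np.int16), 0, 255).astype(np.uint8)
                for g in gray_frames]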
the identification unit includes:
and the identification subunit is used for identifying key points of the objects in each frame of the processed multiple frames of object images.
For the specific implementation of the above operations, reference can be made to the foregoing embodiments; details are not repeated here.
Fig. 4 shows a specific structural block diagram of an apparatus provided in an embodiment of the present invention, which may be used to implement the scale ratio calculation method provided in the above embodiment. The device 400 may be a smartphone or tablet computer, etc.
As shown in fig. 4, the apparatus 400 may include an RF (Radio Frequency) circuit 110, a memory 120 including one or more computer-readable storage media (only one shown), an input unit 130, a display unit 140, a transmission module 170, a processor 180 including one or more processing cores (only one shown), and a power supply 190. Those skilled in the art will appreciate that the configuration shown in fig. 4 does not limit the apparatus 400, which may include more or fewer components than shown, combine certain components, or arrange the components differently. Wherein:
the RF circuit 110 is used for receiving and transmitting electromagnetic waves and for converting between electromagnetic waves and electrical signals, so as to communicate with a communication network or other devices. The RF circuit 110 may include various existing circuit elements for performing these functions, such as an antenna, a radio frequency transceiver, a digital signal processor, an encryption/decryption chip, a Subscriber Identity Module (SIM) card, memory, and so on. The RF circuit 110 may communicate with various networks, such as the Internet, an intranet, or a wireless network, or may communicate with other devices over a wireless network. The wireless network may comprise a cellular telephone network, a wireless local area network, or a metropolitan area network. The wireless network may use various communication standards, protocols and technologies, including but not limited to Global System for Mobile Communication (GSM), Enhanced Data GSM Environment (EDGE), Wideband Code Division Multiple Access (WCDMA), Code Division Multiple Access (CDMA), Time Division Multiple Access (TDMA), Wireless Fidelity (Wi-Fi) (e.g., IEEE 802.11a, IEEE 802.11b, IEEE 802.11g and/or IEEE 802.11n), Voice over Internet Protocol (VoIP), Worldwide Interoperability for Microwave Access (WiMAX), other protocols for e-mail, instant messaging and Short Message Service (SMS), any other suitable communication protocol, and even protocols that have not yet been developed.
The memory 120 may be used to store software programs and modules, such as the program instructions/modules corresponding to the scale ratio calculation method in the above embodiments; by running the software programs and modules stored in the memory 120, the processor 180 executes various functional applications and performs data processing, i.e., realizes the scale ratio calculation. The memory 120 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some instances, the memory 120 may further include memory located remotely from the processor 180, which may be connected to the device 400 via a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input unit 130 may be used to receive input numeric or character information and to generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function control. In particular, the input unit 130 may include a touch-sensitive surface 131 as well as other input devices 132. The touch-sensitive surface 131, also referred to as a touch display screen or a touch pad, collects touch operations by a user on or near it (e.g., operations by a user on or near the touch-sensitive surface 131 using a finger, a stylus, or any other suitable object or attachment) and drives the corresponding connection device according to a predetermined program. Optionally, the touch-sensitive surface 131 may comprise two parts: a touch detection device and a touch controller. The touch detection device detects the position and orientation of the user's touch, detects the signal generated by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into touch point coordinates, sends them to the processor 180, and receives and executes commands sent from the processor 180. The touch-sensitive surface 131 may be implemented in various types, such as resistive, capacitive, infrared, and surface acoustic wave. The other input devices 132 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys and switch keys), a trackball, a mouse, a joystick, and the like.
The display unit 140 may be used to display information input by or provided to a user and various graphical user interfaces of the device 400, which may be made up of graphics, text, icons, video, and any combination thereof. The Display unit 140 may include a Display panel 141, and optionally, the Display panel 141 may be configured in the form of an LCD (Liquid Crystal Display), an OLED (Organic Light-Emitting Diode), or the like. Further, the touch-sensitive surface 131 may cover the display panel 141, and when a touch operation is detected on or near the touch-sensitive surface 131, the touch operation is transmitted to the processor 180 to determine the type of the touch event, and then the processor 180 provides a corresponding visual output on the display panel 141 according to the type of the touch event. Although in FIG. 4, touch-sensitive surface 131 and display panel 141 are shown as two separate components to implement input and output functions, in some embodiments, touch-sensitive surface 131 may be integrated with display panel 141 to implement input and output functions.
Via the transmission module 170 (e.g., a Wi-Fi module), the device 400 can help the user send and receive e-mail, browse web pages, access streaming media, and so on; the module provides the user with wireless broadband Internet access. Although fig. 4 shows the transmission module 170, it is understood that it is not an essential component of the apparatus 400 and may be omitted entirely as needed within a scope that does not change the essence of the invention.
The processor 180 is the control center of the device 400. It connects the various parts of the entire device using various interfaces and lines, and performs the various functions of the device 400 and processes data by running or executing the software programs and/or modules stored in the memory 120 and calling the data stored in the memory 120, thereby monitoring the device as a whole. Optionally, the processor 180 may include one or more processing cores; in some embodiments, the processor 180 may integrate an application processor, which mainly handles the operating system, user interface, and applications, and a modem processor, which mainly handles wireless communications. It will be appreciated that the modem processor may also not be integrated into the processor 180.
The device 400 also includes a power supply 190 (e.g., a battery) for powering the various components. In some embodiments, the power supply may be logically coupled to the processor 180 via a power management system, so that charging, discharging, and power-consumption management are handled through the power management system. The power supply 190 may also include one or more of a DC or AC power source, a recharging system, a power failure detection circuit, a power converter or inverter, a power status indicator, and other similar components.
Specifically, in this embodiment, the display unit 140 of the apparatus 400 is a touch screen display, and the apparatus 400 further includes a memory 120 and one or more programs, wherein the one or more programs are stored in the memory 120 and are configured to be executed by the one or more processors 180, the one or more programs including instructions for:
acquiring a plurality of frames of object images and acquiring a first acceleration corresponding to each moment of the object images;
performing key point identification on the object in the multiple frames of object images, and calculating a second acceleration based on the key points of the multiple frames of object images and the time of acquiring the object images;
and matching the first acceleration and the second acceleration based on time, and calculating a scale ratio according to the matched first acceleration and second acceleration.
In the above embodiments, each embodiment is described with its own emphasis; for any part not described in detail in one embodiment, reference can be made to the detailed description of the scale ratio calculation method above, which is not repeated here.
It will be understood by those skilled in the art that all or part of the steps in the methods of the above embodiments may be completed by instructions, or by instructions controlling the associated hardware; the instructions may be stored in a computer-readable storage medium and loaded and executed by a processor.
To this end, embodiments of the present application provide a storage medium, in which a plurality of instructions are stored, and the instructions can be loaded by a processor to execute any of the steps in the scale ratio calculation method provided in the embodiments of the present application. For example, the instructions may perform the steps of:
acquiring a plurality of frames of object images and acquiring a first acceleration corresponding to each moment of the object images;
performing key point identification on the object in the multiple frames of object images, and calculating a second acceleration based on the key points of the multiple frames of object images and the time of acquiring the object images;
and matching the first acceleration and the second acceleration based on time, and calculating a scale ratio according to the matched first acceleration and second acceleration.
Specific implementations of the above operations can be found in the foregoing embodiments and are not described in detail here.
Wherein the storage medium may include: read Only Memory (ROM), Random Access Memory (RAM), magnetic or optical disks, and the like.
Because the instructions stored in the storage medium can execute the steps of any scale ratio calculation method provided in the embodiments of the present application, they can achieve the beneficial effects achievable by any such method; see the foregoing embodiments for details, which are not repeated here.
The above is a detailed description of the scale ratio calculation method, apparatus, device, and storage medium provided in the embodiments of the present application. Specific examples have been used herein to explain the principles and implementation of the present application; the description of the above embodiments is intended only to help in understanding the method and its core idea. At the same time, those skilled in the art may, following the idea of the present application, make changes to the specific embodiments and the scope of application. In view of the above, the content of this specification should not be construed as limiting the present application.

Claims (10)

1. A scale ratio calculation method, comprising:
acquiring a plurality of frames of object images and acquiring a first acceleration corresponding to each moment of the object images;
performing key point identification on the object in the multiple frames of object images, and calculating a second acceleration based on the key points of the multiple frames of object images and the time of acquiring the object images;
and matching the first acceleration and the second acceleration based on time, and calculating a scale ratio according to the matched first acceleration and second acceleration.
2. The scale ratio calculation method according to claim 1, wherein the plurality of frames of object images are captured by a shooting module and the first acceleration is acquired by an inertial measurement module, and wherein the matching the first acceleration and the second acceleration based on time and calculating the scale ratio according to the matched first acceleration and second acceleration comprises:
acquiring the ratio of the acquisition frequencies of the shooting module and the inertial measurement module;
performing down-sampling processing on the first acceleration according to the ratio of the acquisition frequencies to obtain a down-sampled first acceleration;
and matching the down-sampled first acceleration and the second acceleration based on time, and calculating the scale ratio according to the matched first acceleration and second acceleration.
3. The scale ratio calculation method according to claim 2, wherein the matching the down-sampled first acceleration and the second acceleration based on time and calculating the scale ratio according to the matched first acceleration and second acceleration comprises:
acquiring a first angular velocity generated by the inertial measurement module in three-dimensional space;
performing visual tracking on the plurality of frames of object images to obtain a second angular velocity;
matching the down-sampled first acceleration and the second acceleration based on time to obtain time-matched accelerations;
and calculating the scale ratio according to the time-matched accelerations based on the first angular velocity and the second angular velocity.
4. The scale ratio calculation method according to claim 3, wherein the matching the down-sampled first acceleration and the second acceleration based on time to obtain the time-matched accelerations comprises:
extracting the acquisition time of each first acceleration in the down-sampled first accelerations and the time recorded when the shooting module captured each of the plurality of frames of object images;
and matching, by timestamp, each down-sampled first acceleration with the object image captured at the corresponding time, so as to obtain the scale ratio at each moment.
5. The scale ratio calculation method according to any one of claims 1 to 4, wherein calculating the second acceleration based on the key points of the plurality of frames of object images and the time at which the object images are acquired comprises:
acquiring the shooting times of two adjacent frames of object images;
calculating the shooting time difference between the two adjacent frames of object images and the movement value of the object according to the shooting times of the two adjacent frames;
calculating the speed difference of the object between the two adjacent frames of object images according to the movement value and the shooting time difference;
and calculating the second acceleration according to the speed difference of the object in the two adjacent frames of object images and the shooting time difference between the two adjacent frames.
6. The scale ratio calculation method according to claim 1, wherein after acquiring the plurality of frames of object images, the method further comprises:
converting the plurality of frames of object images into grayscale images to obtain a plurality of frames of grayscale images;
performing a convolution operation on each frame of grayscale image with a preset Laplacian kernel to obtain a plurality of frames of response images;
and calculating the variance of each frame of response image, and screening out the object images whose variance is greater than or equal to a preset threshold to obtain the screened plurality of frames of object images.
7. The scale ratio calculation method according to claim 6, wherein after the calculating the variance of each frame of response image and screening out the object images whose variance is greater than or equal to the preset threshold to obtain the screened plurality of frames of object images, the method further comprises:
acquiring a preset gray value;
calculating the gray value of each pixel of the screened plurality of frames of object images;
subtracting the gray value of each pixel of the screened plurality of frames of object images from the preset gray value to obtain a gray difference value for each pixel of the plurality of frames of object images;
and taking the gray difference value of each pixel as the new gray value of that pixel to obtain the processed plurality of frames of object images;
the identifying key points of the objects in the multiple frames of object images comprises:
and identifying key points of the objects in each frame of the processed plurality of frames of object images.
8. A scale ratio calculation apparatus, comprising:
the acquisition unit is used for acquiring a plurality of frames of object images and acquiring a first acceleration corresponding to each moment of the object images;
the identification unit is used for identifying key points of the objects in the plurality of frames of object images and calculating a second acceleration based on the key points of the plurality of frames of object images and the times at which the object images are acquired;
and the calculating unit is used for matching the first acceleration and the second acceleration based on time and calculating a scale ratio according to the matched first acceleration and second acceleration.
9. A scale ratio calculation apparatus comprising a processor and a memory, the memory having stored therein program code, the processor executing the scale ratio calculation method according to any one of claims 1 to 7 when calling the program code in the memory.
10. A storage medium storing a plurality of instructions adapted to be loaded by a processor to perform the scaling ratio calculation method according to any one of claims 1 to 7.
CN201910572929.4A 2019-06-28 2019-06-28 Scale ratio calculation method, device, equipment and storage medium Active CN112146578B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910572929.4A CN112146578B (en) 2019-06-28 2019-06-28 Scale ratio calculation method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112146578A true CN112146578A (en) 2020-12-29
CN112146578B CN112146578B (en) 2022-07-05

Family

ID=73869114

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910572929.4A Active CN112146578B (en) 2019-06-28 2019-06-28 Scale ratio calculation method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112146578B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6286656B1 (en) * 1996-11-26 2001-09-11 United Parcel Service Of America, Inc. Apparatus for measuring length of accumulated packages
US20140063258A1 (en) * 2012-08-28 2014-03-06 Xerox Corporation Distance estimation in camera-based systems utilizing motion measurement and compression attributes
JP2014078071A (en) * 2012-10-09 2014-05-01 Denso Corp Image storage device for mobile object
TW201504595A (en) * 2013-07-17 2015-02-01 Inventec Appliances Corp Object dimension measure system and method thereof
CN108470453A (en) * 2018-03-16 2018-08-31 长安大学 A kind of speed computational methods of vehicle straight trip
WO2018205803A1 (en) * 2017-05-09 2018-11-15 北京京东尚科信息技术有限公司 Pose estimation method and apparatus
CN109127440A (en) * 2018-08-20 2019-01-04 顺丰科技有限公司 Sort suggestion device, sorting system, method for sorting, equipment and storage medium
WO2019084804A1 (en) * 2017-10-31 2019-05-09 深圳市大疆创新科技有限公司 Visual odometry and implementation method therefor
CN110378276A (en) * 2019-07-16 2019-10-25 顺丰科技有限公司 Vehicle-state acquisition methods, device, equipment and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
AN Chaochen: "Methods of reducing error in image-based measurement", Journal of Sichuan Sports Science *
FU Wenshu et al.: "Integrated design of a strapdown IMU signal acquisition ***", Industrial Instrumentation & Automation *

Also Published As

Publication number Publication date
CN112146578B (en) 2022-07-05

Similar Documents

Publication Publication Date Title
EP3579192B1 (en) Method, apparatus and device for determining camera posture information, and storage medium
CN108875451B (en) Method, device, storage medium and program product for positioning image
CN106296617B (en) The processing method and processing device of facial image
CN108259758B (en) Image processing method, image processing apparatus, storage medium, and electronic device
EP3308536B1 (en) Determination of exposure time for an image frame
CN109002787B (en) Image processing method and device, storage medium and electronic equipment
CN107749046B (en) Image processing method and mobile terminal
CN107209556B (en) System and method for processing depth images capturing interaction of an object relative to an interaction plane
CN113507558B (en) Method, device, terminal equipment and storage medium for removing image glare
CN112488914A (en) Image splicing method, device, terminal and computer readable storage medium
CN112541489A (en) Image detection method and device, mobile terminal and storage medium
US8610831B2 (en) Method and apparatus for determining motion
EP3633600A1 (en) Method and apparatus for denoising processing, storage medium and terminal
CN111062258B (en) Text region identification method, device, terminal equipment and readable storage medium
CN112146578B (en) Scale ratio calculation method, device, equipment and storage medium
CN112489082A (en) Position detection method, position detection device, electronic equipment and readable storage medium
CN111583329A (en) Augmented reality glasses display method and device, electronic equipment and storage medium
US9952671B2 (en) Method and apparatus for determining motion
CN113705309A (en) Scene type judgment method and device, electronic equipment and storage medium
CN112150533A (en) Object volume calculation method, device, equipment and storage medium
CN112396264A (en) Logistics loading rate measurement modeling method, device, equipment and storage medium
CN111046215A (en) Image processing method and device, storage medium and mobile terminal
CN111981975B (en) Object volume measuring method, device, measuring equipment and storage medium
CN110996030A (en) Video generation method and device, storage medium and terminal equipment
CN112733573A (en) Table detection method and device, mobile terminal and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant