CN117315756A - Image recognition method, device, electronic equipment and storage medium


Info

Publication number
CN117315756A
Authority
CN
China
Prior art keywords
image
size information
information
entity
reference object
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311316239.5A
Other languages
Chinese (zh)
Inventor
龙景盛
方忻建
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Vivo Mobile Communication Co Ltd
Original Assignee
Vivo Mobile Communication Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Vivo Mobile Communication Co Ltd
Priority to CN202311316239.5A
Publication of CN117315756A
Legal status: Pending

Links

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/20 - Image preprocessing
    • G06V 10/25 - Determination of region of interest [ROI] or a volume of interest [VOI]
    • G06V 10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/74 - Image or video pattern matching; Proximity measures in feature spaces
    • G06V 10/75 - Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V 10/77 - Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/7715 - Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods
    • G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G06V 40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/161 - Detection; Localisation; Normalisation
    • G06V 40/166 - Detection; Localisation; Normalisation using acquisition arrangements
    • G06V 40/168 - Feature extraction; Face representation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses an image recognition method and apparatus, an electronic device, and a storage medium, and belongs to the technical field of image processing. The method includes: recognizing a first image to obtain size information of each entity object in the first image, where the first image includes at least two entity objects and the at least two entity objects include a reference object; determining a first mapping ratio between the reference object and a first entity object based on first size information of the reference object in the first image and second size information of the first entity object in the first image, where the size information of each entity object in the first image includes the first size information and the second size information; and determining fourth size information of the first entity object based on the first mapping ratio and third size information of the reference object, where the first entity object is an entity object other than the reference object among the at least two entity objects.

Description

Image recognition method, device, electronic equipment and storage medium
Technical Field
The application belongs to the technical field of image processing, and particularly relates to an image identification method, an image identification device, electronic equipment and a storage medium.
Background
Currently, when an electronic device identifies an image, the electronic device can identify an object in the image through an image identification algorithm. In the related art, an electronic device may identify an object in an image according to image feature information of the object in the image. For example, taking a person as an example, the electronic device may extract feature information of a person in the image, such as feature information of eyes, nose, and mouth, and further determine whether an object in the image is an object required by the user according to similarity between the feature information and feature information pre-stored by the image recognition algorithm.
However, as can be seen from the above method, the image recognition algorithm in the related art mainly focuses on the facial features of the object. If the user needs an object whose face is similar and whose height is also similar to that of the object in the image, the algorithm cannot provide an accurate recognition result, because its output only reflects the similarity of facial features and does not reflect the similarity of body size or height. As a result, the electronic device cannot obtain an accurate output result, the applicability of obtaining an accurate recognition result through the image recognition algorithm is poor, and the accuracy with which the electronic device recognizes the image is therefore poor.
Disclosure of Invention
An object of the embodiments of the present application is to provide an image recognition method, an image recognition device, an electronic device, and a storage medium, which can improve accuracy of image recognition performed by the electronic device.
In a first aspect, an embodiment of the present application provides an image recognition method, including: recognizing a first image to obtain size information of each entity object in the first image, where the first image includes at least two entity objects and the at least two entity objects include a reference object; determining a first mapping ratio between the reference object and a first entity object based on first size information of the reference object in the first image and second size information of the first entity object in the first image, where the size information of each entity object in the first image includes the first size information and the second size information; and determining fourth size information of the first entity object based on the first mapping ratio and third size information of the reference object, where the first entity object is an entity object other than the reference object among the at least two entity objects.
In a second aspect, an embodiment of the present application provides an image recognition apparatus, including: an identification module and a determination module. The identification module is used for identifying the first image to obtain the size information of each entity object in the first image, wherein the first image comprises at least two entity objects, and the at least two entity objects comprise reference objects. A determining module, configured to determine a first mapping ratio between a reference object and a first entity object based on first size information of the reference object in a first image and second size information of the first entity object in the first image, where the size information of each entity object in the first image includes: first size information and second size information; determining fourth size information of the first entity object based on the first mapping proportion and the third size information of the reference object; the first entity object is other entity objects except the reference object in the at least two entity objects.
In a third aspect, embodiments of the present application provide an electronic device comprising a processor and a memory storing a program or instructions executable on the processor, which when executed by the processor, implement the steps of the method as described in the first aspect.
In a fourth aspect, embodiments of the present application provide a readable storage medium having stored thereon a program or instructions which when executed by a processor implement the steps of the method according to the first aspect.
In a fifth aspect, embodiments of the present application provide a chip, where the chip includes a processor and a communication interface, where the communication interface is coupled to the processor, and where the processor is configured to execute a program or instructions to implement a method according to the first aspect.
In a sixth aspect, embodiments of the present application provide a computer program product stored in a storage medium, the program product being executable by at least one processor to implement the method according to the first aspect.
In the embodiment of the application, the electronic device may recognize a first image to obtain size information of each entity object in the first image, where the first image includes at least two entity objects and the at least two entity objects include a reference object; determine a first mapping ratio between the reference object and a first entity object based on first size information of the reference object in the first image and second size information of the first entity object in the first image, where the size information of each entity object in the first image includes the first size information and the second size information; and determine fourth size information of the first entity object based on the first mapping ratio and third size information of the reference object, where the first entity object is an entity object other than the reference object among the at least two entity objects. In this scheme, the electronic device can obtain the fourth size information of the first entity object from the third size information of the reference object and the first mapping ratio between the reference object and the first entity object. Therefore, during image recognition, the electronic device not only checks whether the face of the recognized first entity object is correct, but also ensures that the size of the recognized first entity object matches the size implied by the first image. By considering both the face and the size, the electronic device can obtain a recognition result that better fits the user's requirements, which improves the accuracy of image recognition performed by the electronic device.
Drawings
FIG. 1 is one of the flowcharts of an image recognition method provided in an embodiment of the present application;
FIG. 2 is a second flowchart of an image recognition method provided in an embodiment of the present application;
FIG. 3 is a third flowchart of an image recognition method provided in an embodiment of the present application;
FIG. 4 is a fourth flowchart of an image recognition method provided in an embodiment of the present application;
FIG. 5 is a fifth flowchart of an image recognition method provided in an embodiment of the present application;
fig. 6 is a schematic structural diagram of an image recognition device according to an embodiment of the present application;
fig. 7 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the present application;
fig. 8 is a second schematic diagram of a hardware structure of an electronic device according to an embodiment of the present application.
Detailed Description
Technical solutions in the embodiments of the present application will be clearly described below with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application are within the scope of the protection of the present application.
The terms "first," "second," and the like in the description and in the claims are used to distinguish between similar objects and are not necessarily used to describe a particular sequential or chronological order. It is to be understood that the terms so used are interchangeable under appropriate circumstances, so that the embodiments of the application can be implemented in sequences other than those illustrated or described herein. Objects identified by "first," "second," and the like are generally of one type, and the number of objects is not limited; for example, the first object may be one or more. Furthermore, in the description and claims, "and/or" means at least one of the connected objects, and the character "/" generally indicates an "or" relationship between the associated objects.
The terms "at least one," "at least one," and the like in the description and in the claims of the present application mean that they encompass any one, any two, or a combination of two or more of the objects. For example, at least one of a, b, c (item) may represent: "a", "b", "c", "a and b", "a and c", "b and c" and "a, b and c", wherein a, b, c may be single or plural. Similarly, the term "at least two" means two or more, and the meaning of the expression is similar to the term "at least one".
The image recognition method, the device, the electronic equipment and the storage medium provided by the embodiment of the application are described in detail below through specific embodiments and application scenes thereof with reference to the accompanying drawings.
Currently, with the continuous development of computer vision and machine learning technologies, image recognition has become one of the important technologies in the field of artificial intelligence. In daily life and work, electronic devices often use an image as input to match or identify images of the same type. In the prior art, a user may input an image into an image recognition application through an electronic device. The image recognition application may extract all feature information from the image to determine an object, for example a person, and then extract feature information of the person, such as feature information of the eyes, nose, and mouth. The application then compares this feature information with feature information stored in a database corresponding to the image recognition application, so as to obtain a similarity between the feature information of the person in the image and the feature information stored in the database. When the similarity is greater than or equal to a preset threshold, the electronic device may determine that the object is the recognition object required by the user.
However, as can be seen from the above method, the main focus of the image recognition algorithm is on the facial features of the object; that is, the electronic device only needs to determine whether the feature information of the object's face matches the face feature information stored in the database in order to determine the recognition object required by the user. If the user needs an object that is similar in both appearance and height, the electronic device cannot obtain an accurate recognition result through this image recognition algorithm, so the accuracy of image recognition by the electronic device is poor.
With the image recognition method and apparatus, electronic device, and storage medium provided in the embodiments of the application, the electronic device can obtain the fourth size information of the first entity object from the third size information of the reference object and the mapping ratio between the reference object and the first entity object. Therefore, during image recognition, the electronic device not only checks whether the face of the recognized first entity object is correct, but also ensures that the size of the recognized first entity object matches the size implied by the first image. By considering both the face and the size, the electronic device can obtain a recognition result that better fits the user's requirements, which improves the accuracy of image recognition performed by the electronic device.
The execution subject of the image recognition method provided in the embodiment of the present application may be an image recognition device, and the image recognition device may be an electronic device or a functional module in the electronic device. The technical solution provided in the embodiments of the present application will be described below by taking an electronic device as an example.
An embodiment of the application provides an image recognition method, and fig. 1 shows a flowchart of the image recognition method provided by the embodiment of the application. As shown in fig. 1, the image recognition method provided in the embodiment of the present application may include the following steps 201 to 203.
Step 201, the electronic device identifies the first image, and obtains size information of each entity object in the first image.
In this embodiment of the present application, the first image includes at least two physical objects, where the at least two physical objects include a reference object.
Optionally, in the embodiment of the present application, the first image may be an image captured by the electronic device, an image stored in the electronic device, or an image downloaded by the electronic device through a third-party application.
The third party application may include, for example, a browser application, an instant messaging type application, or a video type application.
It should be noted that, in the case that the third party application program is a video application program, the first image may be any frame of video frame intercepted by the electronic device in the video playing interface; in the case that the third party application is an instant messaging application, the first image may be an image downloaded by the electronic device in a session interface, that is, the image is a session message.
Optionally, in an embodiment of the present application, each of the at least two entity objects may be any one of the following: a person, animal or object.
Optionally, in an embodiment of the present application, the object types of the at least two entity objects may be the same or different.
Illustratively, a first entity object of the at least two entity objects may be a person, and a second entity object of the at least two entity objects may be an animal or an object; alternatively, a first physical object of the at least two physical objects may be an animal, and a second physical object of the at least two physical objects may be a human or an object.
It should be noted that an entity object is a real person, animal, or object photographed by the electronic device, not an object contained in an image synthesized by artificial intelligence.
Alternatively, in the embodiment of the present application, the number of the reference objects may be one or more.
Optionally, in an embodiment of the present application, the size information of each physical object in the first image may include a width and a height or a length and a height of each physical object in the first image.
Optionally, in an embodiment of the present application, the electronic device may identify the first image through a first algorithm, so as to obtain size information of each of the at least two physical objects in the first image.
The first algorithm may be an artificial intelligence (Artificial Intelligence, AI) algorithm or a neural network, for example.
For example, taking a first image that includes two entity objects, the electronic device may recognize the two entity objects in the first image through an AI algorithm. The electronic device may then label the first entity object with a bounding box to obtain 4 first vertex coordinates of the bounding box of the first entity object; these 4 first vertex coordinates are the size information of the first entity object. Similarly, the electronic device may label the second entity object with a bounding box to obtain 4 second vertex coordinates of the bounding box of the second entity object; these 4 second vertex coordinates are the size information of the second entity object. The 4 first vertex coordinates include two first X-axis coordinates and two first Y-axis coordinates: the difference between the two first X-axis coordinates is the width of the first entity object in the first image, and the difference between the two first Y-axis coordinates is the length of the first entity object in the first image. Similarly, the 4 second vertex coordinates include two second X-axis coordinates and two second Y-axis coordinates: the difference between the two second X-axis coordinates is the width of the second entity object in the first image, and the difference between the two second Y-axis coordinates is the length of the second entity object in the first image.
Illustratively, the bounding box may be any polygon; preferably, the bounding box may be a rectangular box.
For example, the electronic device may obtain the coordinate parameters of the rectangular bounding box of each entity object. Suppose the top and bottom Y-axis coordinates of the first entity object are (y11, y12) and its left and right X-axis coordinates are (x11, x12), and the top and bottom Y-axis coordinates of the second entity object are (y21, y22) and its left and right X-axis coordinates are (x21, x22). Then the size of the first entity object in the image is (y11 - y12) × (x12 - x11), and the size of the second entity object in the image is (y21 - y22) × (x22 - x21).
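As a concrete illustration of the bounding-box arithmetic above, the following Python sketch derives the in-image width, length, and size of an entity object from the four vertex coordinates of its rectangular bounding box (the function name and the sample coordinates are illustrative only and are not part of the patent):

    def in_image_size(x_left, x_right, y_top, y_bottom):
        """Return (width, length, area) of one entity object inside the first image, in pixels.

        Assumes y_top is the larger of the two Y-axis vertex coordinates.
        """
        width = x_right - x_left    # difference of the two X-axis vertex coordinates
        length = y_top - y_bottom   # difference of the two Y-axis vertex coordinates
        return width, length, width * length

    # Reference object with box (x11, x12, y11, y12) and first entity object with box (x21, x22, y21, y22).
    ref_width, ref_length, ref_size = in_image_size(x_left=100, x_right=180, y_top=400, y_bottom=40)
    obj_width, obj_length, obj_size = in_image_size(x_left=300, x_right=420, y_top=380, y_bottom=20)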
Step 202, the electronic device determines a first mapping ratio between the reference object and the first entity object based on the first size information of the reference object in the first image and the second size information of the first entity object in the first image.
In this embodiment of the present application, the size information of each of the at least two physical objects in the first image includes: first size information and second size information.
In this embodiment of the present application, the first entity object is another entity object except the reference object in the at least two entity objects.
In this embodiment of the present application, after obtaining the first size information of the reference object in the first image and the second size information of the first entity object in the first image, the electronic device may obtain a first mapping ratio between the reference object and the first entity object through a quotient between the first size information and the second size information.
Step 203, the electronic device determines fourth size information of the first physical object based on the first mapping proportion and the third size information of the reference object.
Alternatively, in the embodiment of the present application, the third size information may be real size information of the reference object; alternatively, the third size information may be the first size information.
It should be noted that, the real size information of the reference object is a preset value, and the preset value is generally stored in a database.
Optionally, in an embodiment of the present application, the real size information of the reference object includes the real width and real length, or the real length and real height, of the reference object.
The above-described real size information is illustratively a proportional value between the real width and the real length of the reference object.
In this embodiment of the present application, the electronic device may obtain fourth size information of the first physical object according to the mapping proportion and a proportion value between the real width and the real length of the reference object.
Optionally, in the embodiment of the present application, if the real size information of the reference object is not stored in the database, the electronic device may obtain a mapping ratio between the reference object and the resolution of the first image and a mapping ratio between the first entity object and the reference object, so that the electronic device can still obtain the fourth size information of the first entity object.
In the image recognition method provided by the embodiment of the application, the electronic device can recognize a first image to obtain size information of each entity object in the first image, where the first image includes at least two entity objects and the at least two entity objects include a reference object; determine a first mapping ratio between the reference object and a first entity object based on first size information of the reference object in the first image and second size information of the first entity object in the first image, where the size information of each entity object in the first image includes the first size information and the second size information; and determine fourth size information of the first entity object based on the first mapping ratio and third size information of the reference object, where the first entity object is an entity object other than the reference object among the at least two entity objects. In this scheme, the electronic device can obtain the fourth size information of the first entity object from the third size information of the reference object and the mapping ratio between the reference object and the first entity object. Therefore, during image recognition, the electronic device not only checks whether the face of the recognized first entity object is correct, but also ensures that the size of the recognized first entity object matches the size implied by the first image. By considering both the face and the size, the electronic device can obtain a recognition result that better fits the user's requirements, which improves the accuracy of image recognition performed by the electronic device.
Optionally, in an embodiment of the present application, as shown in fig. 2 in conjunction with fig. 1, before step 202 described above, the steps provided in the embodiment of the present application further include the following steps 301 and 302.
Step 301, the electronic device determines an image area where the first entity object is located in the first image.
In this embodiment of the present application, the electronic device may perform bounding box labeling on the first entity object, so as to determine an image area where the first entity object is located in the first image.
For example, after recognizing the first entity object, the electronic device may label the boundary of the first entity object with a region-of-interest (ROI) box, so as to determine the image area where the first entity object is located in the first image.
Step 302, the electronic device uses vertex coordinate information of the image area as second size information of the first physical object.
Alternatively, in the embodiment of the present application, for the reference object, the electronic device may obtain the first size information of the reference object through the steps 301 and 302.
Optionally, in an embodiment of the present application, the first size information includes first vertex coordinate information corresponding to the region of the reference object extracted from the first image, and the second size information includes second vertex coordinate information corresponding to the region of the first entity object extracted from the first image. The mapping ratio between the reference object and the first entity object includes an X-axis mapping ratio between the reference object and the first entity object and a Y-axis mapping ratio between the reference object and the first entity object.
In this embodiment of the present application, the electronic device may acquire X-axis vertex coordinate information in the first vertex coordinate information and Y-axis vertex coordinate information in the first vertex coordinate information, as the first size information of the reference object.
It will be understood that the number of X-axis first vertex coordinate information in the first vertex coordinate information is two, and the number of Y-axis first vertex coordinate information in the first vertex coordinate information is also two.
In this embodiment of the present application, after obtaining the 4 first vertex coordinate information of the bounding box corresponding to the reference object, the electronic device may directly extract, for the X axis, two X-axis first vertex coordinate information located on the X axis from the 4 first vertex coordinate information; for the Y-axis, the electronic device may directly extract two Y-axis first vertex coordinate information located on the Y-axis from the above 4 first vertex coordinate information.
In this embodiment of the present application, the electronic device may acquire the X-axis vertex coordinate information in the second vertex coordinate information and the Y-axis vertex coordinate information in the second vertex coordinate information as the second size information of the first entity object.
It will be appreciated that the number of X-axis second vertex coordinate information in the second vertex coordinate information is two, and the number of Y-axis second vertex coordinate information in the second vertex coordinate information is also two.
In this embodiment of the present application, after obtaining the 4 pieces of second vertex coordinate information of the bounding box corresponding to the first entity object, the electronic device may directly extract, for the X axis, two pieces of X-axis second vertex coordinate information located on the X axis from the 4 pieces of second vertex coordinate information; for the Y-axis, the electronic device may directly extract two Y-axis second vertex coordinate information located on the Y-axis from the above 4 second vertex coordinate information.
In the embodiment of the application, the electronic device can determine the second size information of the first entity object based on the vertex coordinate information of the first entity object in the first image, so that the flexibility of the electronic device in determining the size information of the first entity object is improved.
Optionally, in this embodiment of the present application, the first size information includes N pieces of first vertex coordinate information, each piece of first vertex coordinate information includes first coordinate information corresponding to each coordinate axis of a first coordinate system, N is an integer greater than 2, and the first coordinate system is the coordinate system corresponding to the first image; the second size information includes M pieces of second vertex coordinate information, each piece of second vertex coordinate information includes second coordinate information corresponding to each coordinate axis of the first coordinate system, and M is an integer greater than 2.
Illustratively, in connection with FIG. 1, as shown in FIG. 3, the above-described step 202 may be implemented by the following steps 202a and 202 b.
Step 202a, the electronic device calculates a mapping ratio between the reference object and the first entity object under the first coordinate axis based on the first coordinate information corresponding to the first coordinate axis under the first coordinate system in each first vertex coordinate information and the second coordinate information corresponding to the first coordinate axis in each second vertex coordinate information.
In this embodiment of the present application, the electronic device determines an X-axis mapping ratio between the reference object and the first physical object based on the X-axis vertex coordinate information in the first vertex coordinate information and the X-axis vertex coordinate information in the second vertex coordinate information.
In this embodiment of the present application, the X-axis mapping ratio is used to indicate a width ratio between the reference object and the first physical object.
For example, if the first size information of the reference object is (y11 - y12) × (x12 - x11) and the second size information of the first entity object is (y21 - y22) × (x22 - x21), the X-axis mapping ratio may be obtained by the following formula 1:
(x22 - x21) / (x12 - x11) (formula 1)
where x22 is the first coordinate of the first entity object on the X-axis, x21 is the second coordinate of the first entity object on the X-axis, x12 is the first coordinate of the reference object on the X-axis, and x11 is the second coordinate of the reference object on the X-axis.
In this embodiment of the present application, the electronic device determines a Y-axis mapping ratio between the reference object and the first physical object based on the Y-axis vertex coordinate information in the first vertex coordinate information and the Y-axis vertex coordinate information in the second vertex coordinate information.
In this embodiment of the present application, the Y-axis mapping ratio is used to indicate a length ratio between the reference object and the first physical object.
For example, if the first size information of the reference object is (y11 - y12) × (x12 - x11) and the second size information of the first entity object is (y21 - y22) × (x22 - x21), the Y-axis mapping ratio may be obtained by the following formula 2:
(y21 - y22) / (y11 - y12) (formula 2)
where y21 is the first coordinate of the first entity object on the Y-axis, y22 is the second coordinate of the first entity object on the Y-axis, y11 is the first coordinate of the reference object on the Y-axis, and y12 is the second coordinate of the reference object on the Y-axis.
Step 202b, the electronic device obtains a first mapping proportion based on the mapping proportion between the reference object and the first entity object under each coordinate axis under the first coordinate system.
In this embodiment of the present application, the first coordinate axis is one of the coordinate axes in the first coordinate system.
Illustratively, after obtaining the X-axis mapping ratio and the Y-axis mapping ratio, the electronic device may take the quotient of the Y-axis mapping ratio and the X-axis mapping ratio as the first mapping ratio, which may be implemented by the following formula 3:
[(y21 - y22) / (y11 - y12)] / [(x22 - x21) / (x12 - x11)] (formula 3)
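The following Python sketch restates formulas 1 to 3 under the coordinate convention above (x11, x12, y11, y12 for the reference object; x21, x22, y21, y22 for the first entity object); the function names are illustrative only:

    def x_axis_mapping_ratio(x11, x12, x21, x22):
        # Formula 1: width ratio of the first entity object to the reference object.
        return (x22 - x21) / (x12 - x11)

    def y_axis_mapping_ratio(y11, y12, y21, y22):
        # Formula 2: length ratio of the first entity object to the reference object.
        return (y21 - y22) / (y11 - y12)

    def first_mapping_ratio(x11, x12, y11, y12, x21, x22, y21, y22):
        # Formula 3: quotient of the Y-axis mapping ratio and the X-axis mapping ratio.
        return (y_axis_mapping_ratio(y11, y12, y21, y22)
                / x_axis_mapping_ratio(x11, x12, x21, x22))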
In the embodiment of the application, the electronic device can calculate the X-axis mapping proportion and the Y-axis mapping proportion between the reference object and the first entity object, so that the electronic device can obtain the real size information of the first entity object according to the X-axis mapping proportion and the Y-axis mapping proportion, and further the accuracy of the electronic device in obtaining the real size information of the first entity object is ensured.
Optionally, in an embodiment of the present application, the third size information of the reference object includes parameters of at least two first size elements of the reference object.
Illustratively, in connection with FIG. 1, as shown in FIG. 4, the above-described step 203 may be implemented by specifically the following step 203 a.
In step 203a, the electronic device determines fourth size information of the first size element of the first entity object based on the mapping ratio between the reference object and the first entity object in the first coordinate axis and the parameter of the first size element corresponding to the first coordinate axis.
In this embodiment of the present application, the first size element is one of at least two size elements, and the fourth size information of the first entity object includes real parameters of at least two first size elements of the first entity object; the dimension element includes at least one of: length, height, and width.
In the embodiment of the application, the electronic device determines the actual width of the first physical object based on the X-axis mapping proportion between the reference object and the first physical object and the width information corresponding to the reference object.
In the embodiment of the application, the electronic device may obtain the real width corresponding to the reference object from the database.
Illustratively, the real width of the first entity object may be obtained by the following formula 4:
(x22 - x21) / (x12 - x11) × b1 (formula 4)
where (x22 - x21) / (x12 - x11) is the X-axis mapping ratio between the reference object and the first entity object, and b1 is the real width of the reference object.
In the embodiment of the application, the electronic device determines the real length of the first entity object based on the Y-axis mapping ratio between the reference object and the first entity object and the real length corresponding to the reference object.
Illustratively, the real length of the first entity object may be obtained by the following formula 5:
(y21 - y22) / (y11 - y12) × a1 (formula 5)
where (y21 - y22) / (y11 - y12) is the Y-axis mapping ratio between the reference object and the first entity object, and a1 is the real length of the reference object.
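A minimal sketch of formulas 4 and 5, assuming the real width b1 and real length a1 of the reference object have already been retrieved from the database (the function and variable names are assumptions for illustration):

    def real_size_from_reference(x_ratio, y_ratio, b1, a1):
        """x_ratio, y_ratio: results of formulas 1 and 2; b1, a1: real width and length of the reference object."""
        real_width = x_ratio * b1    # formula 4
        real_length = y_ratio * a1   # formula 5
        return real_width, real_length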
In the embodiment of the application, the electronic device can obtain the real length of the first entity object from the Y-axis mapping ratio between the reference object and the first entity object, and obtain the real width of the first entity object from the X-axis mapping ratio between the reference object and the first entity object. The electronic device can thus obtain the real size of the first entity object and a recognition result that better fits the user's requirements, which improves the accuracy of image recognition performed by the electronic device.
The following describes the case in which the real size information of the reference object is not stored in the database, in which the electronic device can still obtain the real size of the first entity object.
For example, the above step 203 may be implemented by the following steps 203c to 203e.
Step 203c, the electronic device obtains the resolution of the first image.
It will be appreciated that the resolution of the first image described above includes a length value and a width value. For example, the resolution of the first image may be 500px by 600px, where px is the pixel unit.
In this embodiment of the present application, after the electronic device acquires the first image, the electronic device may acquire the resolution of the first image from the attribute information of the first image.
Step 203d, the electronic device calculates a first mapping ratio between the first size information of the reference object in the first image and the resolution of the first image and a second mapping ratio between the first physical object and the reference object, respectively.
Optionally, in an embodiment of the present application, the first mapping ratio includes a mapping ratio between an X axis of the reference object and an X axis of the resolution, and a mapping ratio between a Y axis of the reference object and a Y axis of the resolution.
In this embodiment of the present application, based on the foregoing embodiment, the first size information of the reference object in the first image is (y11 - y12) × (x12 - x11). The first mapping ratio between the first size information of the reference object in the first image and the resolution of the first image may then be obtained by the following formula 6 and formula 7:
(y11 - y12) / m (formula 6)
where y11 is the first coordinate point of the reference object on the Y-axis, y12 is the second coordinate point of the reference object on the Y-axis, and m is the Y-axis length corresponding to the resolution of the first image.
(x12 - x11) / n (formula 7)
where x12 is the first coordinate point of the reference object on the X-axis, x11 is the second coordinate point of the reference object on the X-axis, and n is the X-axis length corresponding to the resolution of the first image.
It should be noted that, the second mapping ratio between the first entity object and the reference object may be described in detail in the above embodiment, and in order to avoid repetition, the description is omitted here.
Step 203e, the electronic device determines the real size information of the first entity object based on the second mapping ratio and the first mapping ratio between the first entity object and the reference object.
Optionally, in an embodiment of the present application, the real size information of the first physical object includes a real length and a real width.
Illustratively, the real width of the first entity object may be obtained by the following formula 8:
(x22 - x21) / (x12 - x11) × n (formula 8)
where (x22 - x21) / (x12 - x11) is the X-axis mapping ratio between the reference object and the first entity object, and n is the X-axis length corresponding to the resolution of the first image.
Illustratively, the real length of the first entity object may be obtained by the following formula 9:
(y21 - y22) / (y11 - y12) × m (formula 9)
where (y21 - y22) / (y11 - y12) is the Y-axis mapping ratio between the reference object and the first entity object, and m is the Y-axis length corresponding to the resolution of the first image.
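A sketch of the resolution-based fallback described by formulas 6 to 9, assuming the resolution of the first image is n px wide and m px high; this is one reading of the formulas above, with illustrative names:

    def fallback_real_size(x11, x12, y11, y12, x21, x22, y21, y22, n, m):
        # Formulas 6 and 7: mapping ratios between the reference object and the image resolution.
        ref_to_image_y = (y11 - y12) / m
        ref_to_image_x = (x12 - x11) / n
        # Formulas 8 and 9: size of the first entity object derived from the mapping ratio and the resolution.
        real_width = (x22 - x21) / (x12 - x11) * n
        real_length = (y21 - y22) / (y11 - y12) * m
        return real_width, real_length, (ref_to_image_x, ref_to_image_y)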
In the embodiment of the application, the electronic device can obtain the real size of the first entity object through the resolution of the image, which improves the flexibility of the electronic device in acquiring the real size of the first entity object.
Optionally, in the embodiment of the present application, before the step 203, the image identifying method provided in the embodiment of the present application further includes a step 301 described below, and the step 203 may be specifically implemented by a step 203c described below.
Step 301, the electronic device determines a confidence level of the reference object and the first entity object in the first image based on text information of the reference object in the first image, object feature information of the reference object in the first image, and a difference value between a pre-stored mapping ratio between the reference object and the first entity object and the first mapping ratio.
In this embodiment of the present application, the confidence is used to characterize whether the reference object and the first entity object in the first image are real objects.
In this embodiment of the present application, the text information of the reference object may be text in the first image and information attached to the first image, for example, description information of the reference object.
In an embodiment of the present application, the feature information of the reference object includes at least one of the following: facial information and limb information.
Illustratively, the above-mentioned face information may include at least one of: nose, eyes, ears, and lips.
For example, the limb information may be motion information of a reference object in the first image. Such as running, jumping and swimming.
It should be noted that, the mapping ratio of the first entity object and the reference object in the first image may be specifically obtained through the above embodiment, and in order to avoid repetition, details are not repeated here.
In this embodiment of the present application, the mapping ratio of the pre-stored first entity object to the reference object is a preset value pre-stored in a database.
Optionally, in the embodiment of the present application, after obtaining the text information of the reference object of the first image, the electronic device may score the text information, where the score is used to characterize the credibility of the text information belonging to the reference object.
For example, after acquiring the text information of the reference object, the electronic device may crawl other information associated with the text information, such as an image of the reference object, through crawler technology, and then score the text information to obtain the credibility that the text information belongs to the reference object.
Optionally, in the embodiment of the present application, after obtaining the feature information of the reference object, the electronic device may score the feature information of the reference object, where the score is used to characterize whether the reference object is a real object.
After the electronic device obtains the feature information of the reference object, for example a face, the electronic device may calculate the distance ratios between key parts in the facial feature information through an AI algorithm and compare them with the distance ratios between the corresponding key parts of a typical face retrieved from the database, so as to obtain the score corresponding to the reference object.
Optionally, in the embodiment of the present application, after obtaining the mapping ratio of the first entity object to the reference object in the first image, the electronic device may score the mapping ratio of the first entity object to the reference object, where the score is used to characterize whether the mapping ratio of the first entity object to the reference object is normal.
For example, assuming that the first entity object is a pet dog (a Shiba Inu) and the reference object is a person, the electronic device may retrieve the length and width of a Shiba Inu and the length and width of a person from the database, calculate the aspect ratio between the Shiba Inu and the person retrieved from the database, and record it as (f11, f12). The aspect ratio between the pet dog and the person in the first image is (f21, f22), and the electronic device may then score according to the size difference between the two.
The smaller the degree of difference, the higher the score.
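The patent does not give an explicit scoring formula for the mapping-ratio difference, so the Python sketch below only assumes a score that decreases as the in-image ratios (f21, f22) drift away from the database ratios (f11, f12):

    def ratio_difference_score(f11, f12, f21, f22, max_score=100.0):
        # Relative deviation of the in-image aspect ratios from the database aspect ratios.
        deviation = abs(f21 - f11) / max(abs(f11), 1e-6) + abs(f22 - f12) / max(abs(f12), 1e-6)
        # Smaller deviation -> higher score, clamped to [0, max_score].
        return max(0.0, max_score * (1.0 - min(deviation, 1.0)))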
In this embodiment of the present application, the confidence is used to determine whether the reference object and the first entity object in the first image are real objects.
In this embodiment of the present application, the electronic device may obtain the confidence coefficient of the first image according to the score corresponding to the text information of the reference object of the first image, the score corresponding to the feature information of the reference object, and the score corresponding to the mapping proportion difference value.
For example, the electronic device may obtain the confidence level of the first image through the following formula 10, which is specifically:
w1 × s1 + w2 × s2 + w3 × s3 (formula 10)
Wherein s1 is a score corresponding to text information of a reference object of the first image, s2 is a score corresponding to feature information of the reference object, s3 is a score corresponding to a mapping proportion difference value, w1 is a weight corresponding to text information of the reference object of the first image, w2 is a weight corresponding to feature information of the reference object, and w3 is a weight corresponding to the mapping proportion difference value.
It should be noted that, w1, w2 and w3 are all preset by the electronic device, and the weight values of the three weights are different.
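Formula 10 as a short sketch; the weight values below are placeholders, since the patent only states that w1, w2, and w3 are preset and mutually different:

    def confidence(s1, s2, s3, w1=0.3, w2=0.4, w3=0.3):
        # Formula 10: weighted sum of the text score, the feature score, and the ratio-difference score.
        return w1 * s1 + w2 * s2 + w3 * s3

    # Example check against the first threshold (e.g. 80, as described below).
    objects_are_real = confidence(s1=85, s2=90, s3=70) >= 80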
In step 203c, the electronic device determines fourth size information of the first entity object based on the first mapping proportion and the third size information of the reference object if the confidence level is greater than or equal to the first threshold.
In this embodiment of the present application, when the confidence coefficient is greater than the first threshold, the electronic device considers that at least two objects in the first image are both entity objects, and may calculate the size, otherwise, the size cannot be calculated.
Illustratively, the value of the first threshold may be 80 or 90. The specific value may be determined according to actual use requirements, and is not limited in the embodiments of the application.
It should be noted that, the above "the electronic device determines the real size information of the first entity object based on the first mapping proportion and the real size information of the reference object" may be specifically described in the above embodiment, and in order to avoid repetition, details are not repeated here.
In the embodiment of the application, the electronic device can judge whether the portrait in the picture is a real person picture based on the image recognition algorithm, so that the influence of the proportional distortion of the non-real person is eliminated, and the accuracy of the electronic device in acquiring the size of the first entity object is further ensured.
Alternatively, in the embodiment of the present application, as shown in fig. 5 in conjunction with fig. 1, the above step 201 may be specifically implemented by the following step 201 a; after the step 203, the image recognition method provided in the embodiment of the present application further includes the following step 401.
Step 201a, the electronic device inputs a first image in an application interface of a first application to identify the first image.
In this embodiment of the present application, the first application may be any application with a search function in the electronic device.
The application interface may be an image recognition interface, for example.
Optionally, in the embodiment of the present application, before the electronic device inputs the first image in the application interface of the first application to identify the first image, the electronic device may start an image identification function of the first application in the setting application program.
Step 401, the electronic device retrieves the first entity object in the first application based on the fourth size information of the first entity object, and displays the retrieval result.
In the embodiment of the present application, when the electronic device performs retrieval in the first application (for example, a shopping application), the retrieval request needs to carry the fourth size information of the first entity object, so that the first entity object is retrieved in the first application and the retrieval result is displayed.
Optionally, in the embodiment of the present application, if the electronic device retrieves a retrieval result satisfying the fourth size information of the first entity object in the first application, the retrieval result is directly displayed.
Optionally, in the embodiment of the present application, if the electronic device does not retrieve a retrieval result satisfying the fourth size information of the first entity object in the first application, the electronic device determines, from the current retrieval results, the retrieval result closest to the fourth size information of the first entity object, and displays that retrieval result.
For example, the electronic device may sort the retrieval results by priority according to how close they are to the fourth size information of the first entity object, and display them in order of priority; the priority is determined by size, that is, the closer a retrieval result is to the fourth size information of the first entity object, the higher its display position.
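A sketch of the size-based ranking of retrieval results; the result structure and the closeness metric are assumptions, since the patent only states that results closer to the fourth size information are displayed first:

    def rank_results_by_size(results, target_width, target_length):
        """results: list of dicts with 'width' and 'length' keys in the same units as the target size."""
        def closeness(item):
            return abs(item["width"] - target_width) + abs(item["length"] - target_length)
        # Closest result first, i.e. highest display priority.
        return sorted(results, key=closeness)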
In the embodiment of the application, the electronic device can directly identify the first image in the first application, and further the electronic device can directly search the first entity object based on the identification result to obtain the search result, so that the efficiency of the electronic device in searching the first entity object is improved.
It should be noted that, in the image recognition method provided in the embodiment of the present application, the execution subject may be an image recognition apparatus, or an electronic device, or may also be a functional module or entity in the electronic device. In the embodiment of the present application, an image recognition device provided in the embodiment of the present application will be described by taking an example in which the image recognition device performs an image recognition method.
Fig. 6 shows a schematic diagram of one possible configuration of an image recognition apparatus involved in the embodiment of the present application. As shown in fig. 6, the image recognition apparatus 70 may include: an identification module 71 and a determination module 72.
The identification module is used for recognizing the first image to obtain size information of each entity object in the first image, where the first image includes at least two entity objects and the at least two entity objects include a reference object. The determining module is configured to determine a first mapping ratio between the reference object and a first entity object based on first size information of the reference object in the first image and second size information of the first entity object in the first image, where the size information of each entity object in the first image includes the first size information and the second size information; and to determine fourth size information of the first entity object based on the first mapping ratio and third size information of the reference object, where the first entity object is an entity object other than the reference object among the at least two entity objects.
In a possible implementation manner, the determining module is further configured to determine an image area where the first entity object is located in the first image before the first mapping ratio between the reference object and the first entity object is determined based on the first size information of the reference object in the first image and the second size information of the first entity object in the first image, and to take the vertex coordinate information of the image area as the second size information of the first entity object.
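As a hedged illustration of this step, the sketch below expresses a detected image region as vertex coordinates in the image coordinate system and uses them as the second size information; the axis-aligned rectangular region (x_min, y_min, x_max, y_max) is an assumed detector output format, since the patent does not prescribe a specific detector.

```python
# Hypothetical sketch: express a detected image region as vertex coordinates in
# the image coordinate system and use them as the second size information of the
# first entity object.
from typing import List, Tuple

Vertex = Tuple[float, float]

def region_to_vertices(x_min: float, y_min: float,
                       x_max: float, y_max: float) -> List[Vertex]:
    """Return the four corner points of the image area, clockwise from top-left."""
    return [(x_min, y_min), (x_max, y_min), (x_max, y_max), (x_min, y_max)]

# Example: the image area occupied by the first entity object in the first image.
first_object_vertices = region_to_vertices(420.0, 310.0, 980.0, 1270.0)
print(first_object_vertices)
```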
In one possible implementation manner, the first size information includes N pieces of first vertex coordinate information, the first vertex coordinate information includes first coordinate information corresponding to each coordinate axis in a first coordinate system, N is an integer greater than 2, and the first coordinate system is the coordinate system corresponding to the first image; the second size information includes M pieces of second vertex coordinate information, the second vertex coordinate information includes second coordinate information corresponding to each coordinate axis in the first coordinate system, and M is an integer greater than 2. The determining module is specifically configured to calculate a mapping ratio between the reference object and the first entity object under a first coordinate axis based on the first coordinate information corresponding to the first coordinate axis of the first coordinate system in each piece of first vertex coordinate information and the second coordinate information corresponding to the first coordinate axis in each piece of second vertex coordinate information, and to obtain the first mapping ratio based on the mapping ratios between the reference object and the first entity object under the coordinate axes of the first coordinate system, where the first coordinate axis is one coordinate axis of the first coordinate system.
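A possible way to compute such a per-axis mapping ratio is sketched below; taking the pixel extent (maximum minus minimum coordinate) of each object along an axis and dividing the two extents is an assumption made for the example, as the patent leaves the exact calculation open.

```python
# Hypothetical sketch: compute a per-axis mapping ratio between the reference
# object and the first entity object from their vertex coordinates.
from typing import List, Tuple

Vertex = Tuple[float, float]

def axis_extent(vertices: List[Vertex], axis: int) -> float:
    """Pixel extent of a vertex set along one coordinate axis (0 = x, 1 = y)."""
    values = [v[axis] for v in vertices]
    return max(values) - min(values)

def mapping_ratio(reference_vertices: List[Vertex],
                  object_vertices: List[Vertex], axis: int) -> float:
    """Ratio of the first entity object's extent to the reference object's extent."""
    return axis_extent(object_vertices, axis) / axis_extent(reference_vertices, axis)

# Example: reference object (e.g. a card of known size) vs. the first entity object.
reference = [(100.0, 100.0), (186.0, 100.0), (186.0, 154.0), (100.0, 154.0)]
target = [(420.0, 310.0), (980.0, 310.0), (980.0, 1270.0), (420.0, 1270.0)]
ratio_x = mapping_ratio(reference, target, axis=0)  # about 6.51
ratio_y = mapping_ratio(reference, target, axis=1)  # about 17.78
print(ratio_x, ratio_y)
```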
In one possible implementation manner, the third size information of the reference object includes parameters of at least two first size elements of the reference object. The determining module is specifically configured to determine fourth size information of a first size element of the first entity object based on the mapping ratio between the reference object and the first entity object under the first coordinate axis and the real parameter of the first size element corresponding to the first coordinate axis; wherein the first size element is one of the at least two first size elements, the fourth size information of the first entity object includes real parameters of at least two first size elements of the first entity object, and a size element includes at least one of the following: length, height, and width.
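The sketch below illustrates this scaling step under stated assumptions: a standard ID-1 card (85.6 mm x 54.0 mm) is assumed as the reference object, and the pairing of the x-axis ratio with width and the y-axis ratio with height is chosen for the example only.

```python
# Hypothetical sketch: obtain the fourth size information by scaling the reference
# object's known real parameters with the per-axis mapping ratios.
from typing import Dict

def estimate_real_size(ratio_x: float, ratio_y: float,
                       ref_width_mm: float = 85.6,
                       ref_height_mm: float = 54.0) -> Dict[str, float]:
    """Return the estimated real width/height of the first entity object in mm."""
    return {
        "width_mm": ratio_x * ref_width_mm,    # x-axis ratio paired with real width
        "height_mm": ratio_y * ref_height_mm,  # y-axis ratio paired with real height
    }

# Example usage with the ratios from the previous sketch.
print(estimate_real_size(ratio_x=6.51, ratio_y=17.78))
# -> roughly {'width_mm': 557.3, 'height_mm': 960.1}
```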
In one possible implementation manner, the determining module is further configured to determine, before the fourth size information of the first entity object is determined based on the first mapping ratio and the third size information of the reference object, a confidence level of the reference object and the first entity object in the first image based on text information of the reference object in the first image, object feature information of the reference object in the first image, and a difference value between a pre-stored mapping ratio between the reference object and the first entity object and the first mapping ratio, where the confidence level is used to characterize whether the reference object and the first entity object in the first image are real objects. The determining module is specifically configured to determine the fourth size information of the first entity object based on the first mapping ratio and the third size information of the reference object when the confidence level is greater than or equal to a first threshold.
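One hypothetical way to combine these signals into a single confidence level and gate the size computation on a first threshold is sketched below; the individual scores, the weights, and the threshold value are assumptions, since the patent names the inputs but not how they are combined.

```python
# Hypothetical sketch: combine the three signals named above into a confidence
# level and only compute the fourth size information when it reaches the threshold.
def confidence_level(text_score: float,      # agreement of recognized text with the reference object
                     feature_score: float,   # agreement of visual features with the reference object
                     stored_ratio: float,    # pre-stored mapping ratio for this object pair
                     computed_ratio: float,  # first mapping ratio computed from the first image
                     weights=(0.4, 0.4, 0.2)) -> float:
    ratio_score = 1.0 / (1.0 + abs(stored_ratio - computed_ratio))
    w_text, w_feat, w_ratio = weights
    return w_text * text_score + w_feat * feature_score + w_ratio * ratio_score

FIRST_THRESHOLD = 0.7  # assumed value for illustration

conf = confidence_level(text_score=0.9, feature_score=0.8,
                        stored_ratio=6.3, computed_ratio=6.5)
if conf >= FIRST_THRESHOLD:
    print(f"confidence {conf:.2f}: compute the fourth size information")
else:
    print(f"confidence {conf:.2f}: objects treated as unreliable, skip")
```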
In one possible implementation manner, the image recognition device further includes a processing module. The identification module is specifically configured to input the first image in an application interface of a first application, so as to identify the first image. The processing module is configured to, after the fourth size information of the first entity object is determined based on the first mapping ratio and the third size information of the reference object, retrieve the first entity object in the first application based on the fourth size information of the first entity object, and display a retrieval result.
The embodiment of the application provides an image recognition device. Because the image recognition device can obtain the fourth size information of the first entity object from the third size information of the reference object and the first mapping ratio between the reference object and the first entity object, during image recognition the electronic device not only pays attention to whether the appearance of the recognized first entity object is accurate, but also ensures that the size of the recognized first entity object matches the size in the first image. In this way, the electronic device can obtain a recognition result that better fits the requirements of the user in terms of both appearance and size, thereby improving the accuracy of image recognition by the image recognition device.
The image recognition device in the embodiment of the application may be an electronic device, or may be a component in the electronic device, for example, an integrated circuit or a chip. The electronic device may be a terminal, or may be a device other than a terminal. By way of example, the mobile electronic device may be a mobile phone, a tablet computer, a notebook computer, a palmtop computer, a vehicle-mounted electronic device, a mobile internet device (Mobile Internet Device, MID), an augmented reality (augmented reality, AR)/virtual reality (virtual reality, VR) device, a robot, a wearable device, an ultra-mobile personal computer (ultra-mobile personal computer, UMPC), a netbook, or a personal digital assistant (personal digital assistant, PDA); the electronic device may also be a server, a network attached storage (Network Attached Storage, NAS), a personal computer (personal computer, PC), a television (TV), a teller machine, a self-service machine, or the like, and the embodiments of the present application are not specifically limited thereto.
The image recognition device in the embodiment of the present application may be a device having an operating system. The operating system may be an Android operating system, an iOS operating system, or other possible operating systems, which are not specifically limited in the embodiments of the present application.
The image recognition device provided in the embodiment of the present application can implement each process implemented by the embodiment of the method, and in order to avoid repetition, details are not repeated here.
Optionally, as shown in fig. 7, the embodiment of the present application further provides an electronic device 90, including a processor 91 and a memory 92. The memory 92 stores a program or instructions capable of running on the processor 91, and the program or instructions, when executed by the processor 91, implement each step of the above image recognition method embodiment and can achieve the same technical effects; to avoid repetition, details are not repeated here.
The electronic device in the embodiment of the application includes the mobile electronic device and the non-mobile electronic device described above.
Fig. 8 is a schematic diagram of a hardware structure of an electronic device implementing an embodiment of the present application.
The electronic device 100 includes, but is not limited to: radio frequency unit 101, network module 102, audio output unit 103, input unit 104, sensor 105, display unit 106, user input unit 107, interface unit 108, memory 109, and processor 110.
Those skilled in the art will appreciate that the electronic device 100 may further include a power source (e.g., a battery) for supplying power to the various components, and the power source may be logically connected to the processor 110 through a power management system, so that functions such as charging management, discharging management, and power consumption management are implemented through the power management system. The electronic device structure shown in fig. 8 does not constitute a limitation of the electronic device; the electronic device may include more or fewer components than shown, combine certain components, or have a different arrangement of components, which is not described in detail herein.
The processor 110 is configured to identify a first image to obtain size information of each entity object in the first image, where the first image includes at least two entity objects, and the at least two entity objects include a reference object; determine a first mapping ratio between the reference object and a first entity object based on first size information of the reference object in the first image and second size information of the first entity object in the first image, where the size information of each entity object in the first image includes the first size information and the second size information; and determine fourth size information of the first entity object based on the first mapping ratio and third size information of the reference object; where the first entity object is an entity object other than the reference object among the at least two entity objects.
The embodiment of the application provides an electronic device. Because the electronic device can obtain the fourth size information of the first entity object from the third size information of the reference object and the first mapping ratio between the reference object and the first entity object, during image recognition the electronic device not only pays attention to whether the appearance of the recognized first entity object is accurate, but also ensures that the size of the recognized first entity object matches the size in the first image. In this way, the electronic device can obtain a recognition result that better fits the requirements of the user in terms of both appearance and size, thereby improving the accuracy of image recognition by the electronic device.
Optionally, in this embodiment of the present application, the processor 110 is further configured to determine an image area where the first entity object is located in the first image before the first mapping ratio between the reference object and the first entity object is determined based on the first size information of the reference object in the first image and the second size information of the first entity object in the first image, and to use the vertex coordinate information of the image area as the second size information of the first entity object.
Optionally, in this embodiment of the present application, the first size information includes N pieces of first vertex coordinate information, the first vertex coordinate information includes first coordinate information corresponding to each coordinate axis in a first coordinate system, N is an integer greater than 2, and the first coordinate system is the coordinate system corresponding to the first image; the second size information includes M pieces of second vertex coordinate information, the second vertex coordinate information includes second coordinate information corresponding to each coordinate axis in the first coordinate system, and M is an integer greater than 2. The processor 110 is specifically configured to calculate a mapping ratio between the reference object and the first entity object under a first coordinate axis based on the first coordinate information corresponding to the first coordinate axis of the first coordinate system in each piece of first vertex coordinate information and the second coordinate information corresponding to the first coordinate axis in each piece of second vertex coordinate information, and to obtain the first mapping ratio based on the mapping ratios between the reference object and the first entity object under the coordinate axes of the first coordinate system, where the first coordinate axis is one coordinate axis of the first coordinate system.
Optionally, in an embodiment of the present application, the third size information of the reference object includes parameters of at least two first size elements of the reference object. The processor 110 is specifically configured to determine fourth size information of a first size element of the first entity object based on the mapping ratio between the reference object and the first entity object under the first coordinate axis and the parameter of the first size element corresponding to the first coordinate axis; wherein the first size element is one of the at least two first size elements, the fourth size information of the first entity object includes parameters of at least two first size elements of the first entity object, and a size element includes at least one of the following: length, height, and width.
Optionally, in this embodiment of the present application, the processor 110 is further configured to determine, before the fourth size information of the first entity object is determined based on the first mapping ratio and the third size information of the reference object, a confidence level of the reference object and the first entity object in the first image based on text information of the reference object in the first image, object feature information of the reference object in the first image, and a difference value between a pre-stored mapping ratio between the reference object and the first entity object and the first mapping ratio, where the confidence level is used to characterize whether the reference object and the first entity object in the first image are real objects; and to determine the fourth size information of the first entity object based on the first mapping ratio and the third size information of the reference object when the confidence level is greater than or equal to a first threshold.
Optionally, in an embodiment of the present application, the processor 110 is specifically configured to input the first image in an application interface of a first application, so as to identify the first image. The processor 110 is further configured to retrieve the first entity object in the first application based on the fourth size information of the first entity object, and display a retrieval result.
The electronic device provided in the embodiment of the present application can implement each process implemented by the above method embodiment, and can achieve the same technical effects, so that repetition is avoided, and details are not repeated here.
The beneficial effects of the various implementation manners in this embodiment may be specifically referred to the beneficial effects of the corresponding implementation manners in the foregoing method embodiment, and in order to avoid repetition, the description is omitted here.
It should be appreciated that in embodiments of the present application, the input unit 104 may include a graphics processor (Graphics Processing Unit, GPU) 1041 and a microphone 1042, the graphics processor 1041 processing image data of still pictures or video obtained by an image capturing device (e.g., a camera) in a video capturing mode or an image capturing mode. The display unit 106 may include a display panel 1061, and the display panel 1061 may be configured in the form of a liquid crystal display, an organic light emitting diode, or the like. The user input unit 107 includes at least one of a touch panel 1071 and other input devices 1072. The touch panel 1071 is also referred to as a touch screen. The touch panel 1071 may include two parts of a touch detection device and a touch controller. Other input devices 1072 may include, but are not limited to, a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, and so forth, which are not described in detail herein.
Memory 109 may be used to store software programs as well as various data. The memory 109 may mainly include a first memory area storing programs or instructions and a second memory area storing data, where the first memory area may store an operating system and an application program or instructions required for at least one function (such as a sound playing function or an image playing function), and the like. Further, the memory 109 may include a volatile memory or a nonvolatile memory, or the memory 109 may include both volatile and nonvolatile memory. The nonvolatile memory may be a read-only memory (Read-Only Memory, ROM), a programmable ROM (Programmable ROM, PROM), an erasable PROM (Erasable PROM, EPROM), an electrically erasable PROM (Electrically EPROM, EEPROM), or a flash memory. The volatile memory may be a random access memory (Random Access Memory, RAM), a static RAM (Static RAM, SRAM), a dynamic RAM (Dynamic RAM, DRAM), a synchronous DRAM (Synchronous DRAM, SDRAM), a double data rate SDRAM (Double Data Rate SDRAM, DDR SDRAM), an enhanced SDRAM (Enhanced SDRAM, ESDRAM), a synchlink DRAM (Synchlink DRAM, SLDRAM), or a direct rambus RAM (Direct Rambus RAM, DRRAM). The memory 109 in embodiments of the present application includes, but is not limited to, these and any other suitable types of memory.
Processor 110 may include one or more processing units. Optionally, the processor 110 integrates an application processor and a modem processor, where the application processor primarily handles operations involving the operating system, the user interface, application programs, and the like, and the modem processor, such as a baseband processor, primarily handles wireless communication signals. It will be appreciated that the modem processor may alternatively not be integrated into the processor 110.
The embodiment of the present application further provides a readable storage medium on which a program or instructions are stored. When the program or instructions are executed by a processor, each process of the above method embodiment is implemented and the same technical effects can be achieved; to avoid repetition, details are not repeated here.
The processor is the processor in the electronic device described in the above embodiment. The readable storage medium includes a computer-readable storage medium, such as a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The embodiment of the application further provides a chip. The chip includes a processor and a communication interface, the communication interface is coupled to the processor, and the processor is configured to run a program or instructions to implement each process of the above method embodiment and achieve the same technical effects; to avoid repetition, details are not repeated here.
It should be understood that the chip referred to in the embodiments of the present application may also be referred to as a system-on-chip, a chip system, a system-on-a-chip, or the like.
The embodiments of the present application provide a computer program product stored in a storage medium. The program product is executed by at least one processor to implement the respective processes of the above image recognition method embodiments and can achieve the same technical effects, which are not repeated herein.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element. Furthermore, it should be noted that the scope of the methods and apparatus in the embodiments of the present application is not limited to performing the functions in the order shown or discussed; depending on the functions involved, the functions may also be performed in a substantially simultaneous manner or in the reverse order. For example, the described methods may be performed in an order different from that described, and various steps may be added, omitted, or combined. In addition, features described with reference to certain examples may be combined in other examples.
From the above description of the embodiments, it will be clear to those skilled in the art that the methods of the above embodiments may be implemented by means of software plus a necessary general hardware platform, or of course by hardware, although in many cases the former is the preferred implementation. Based on such an understanding, the technical solutions of the present application, in essence or in the part contributing to the prior art, may be embodied in the form of a computer software product stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disk), including several instructions for causing a terminal (which may be a mobile phone, a computer, a server, a network device, or the like) to perform the methods described in the embodiments of the present application.
The embodiments of the present application have been described above with reference to the accompanying drawings, but the present application is not limited to the above-described embodiments, which are merely illustrative and not restrictive. In light of the present application, those of ordinary skill in the art may make many other forms without departing from the spirit of the present application and the scope of the claims, all of which fall within the protection of the present application.

Claims (14)

1. An image recognition method, the method comprising:
identifying a first image to obtain the size information of each entity object in the first image, wherein the first image comprises at least two entity objects, and the at least two entity objects comprise reference objects;
determining a first mapping ratio between the reference object and a first entity object based on first size information of the reference object in the first image and second size information of the first entity object in the first image, wherein the size information of each entity object in the first image comprises: the first size information and the second size information;
determining fourth size information of the first physical object based on the first mapping proportion and third size information of the reference object;
wherein the first entity object is other entity objects except the reference object in the at least two entity objects.
2. The method of claim 1, wherein prior to determining the first mapping scale between the reference object and the first physical object based on the first size information of the reference object in the first image and the second size information of the first physical object in the first image, the method further comprises:
determining an image area where the first entity object is located in the first image;
and taking the vertex coordinate information of the image area as second size information of the first entity object.
3. The method of claim 2, wherein the first size information includes N pieces of first vertex coordinate information, the first vertex coordinate information including first coordinate information corresponding to each coordinate axis in a first coordinate system, N being an integer greater than 2, the first coordinate system being a coordinate system corresponding to the first image;
the second size information comprises M pieces of second vertex coordinate information, the second vertex coordinate information comprises second coordinate information corresponding to each coordinate axis in the first coordinate system, and M is an integer greater than 2;
the determining a first mapping ratio between the reference object and the first entity object based on the first size information of the reference object in the first image and the second size information of the first entity object in the first image includes:
calculating a mapping proportion between the reference object and the first entity object under the first coordinate axis based on first coordinate information corresponding to the first coordinate axis under the first coordinate system in each first vertex coordinate information and second coordinate information corresponding to the first coordinate axis in each second vertex coordinate information;
and obtaining the first mapping proportion based on the mapping proportion between the reference object and the first entity object under each coordinate axis of the first coordinate system, wherein the first coordinate axis is one coordinate axis in the first coordinate system.
4. A method according to claim 3, wherein the third size information of the reference object comprises parameters of at least two first size elements of the reference object;
determining fourth size information of the first physical object based on the first mapping scale and third size information of the reference object, including:
determining fourth size information of a first size element of the first entity object based on a mapping proportion between the reference object and the first entity object under the first coordinate axis and parameters of the first size element corresponding to the first coordinate axis;
wherein the first size element is one of the at least two first size elements, the fourth size information of the first physical object includes parameters of the at least two first size elements of the first physical object, and the size element includes at least one of: length, height, and width.
5. The method of claim 1, wherein prior to determining fourth size information for the first physical object based on the first mapping scale and third size information for the reference object, the method further comprises:
determining a confidence level of the reference object and the first entity object in the first image based on text information of the reference object in the first image, object feature information of the reference object in the first image, and a difference value between a pre-stored mapping proportion between the reference object and the first entity object and the first mapping proportion, wherein the confidence level is used for representing whether the reference object and the first entity object in the first image are real objects;
the determining fourth size information of the first entity object based on the first mapping proportion and the third size information of the reference object includes:
and determining fourth size information of the first entity object based on the first mapping proportion and the third size information of the reference object under the condition that the confidence is larger than or equal to a first threshold value.
6. The method of claim 1, wherein the identifying the first image comprises:
inputting the first image in an application interface of a first application to identify the first image;
after determining fourth size information of the first physical object based on the first mapping scale and third size information of the reference object, the method further comprises:
and based on the fourth size information of the first entity object, searching the first entity object in the first application, and displaying a search result.
7. An image recognition apparatus, the apparatus comprising: an identification module and a determination module;
the identification module is used for identifying a first image to obtain the size information of each entity object in the first image, wherein the first image comprises at least two entity objects, and the at least two entity objects comprise reference objects;
the determining module is configured to determine a first mapping ratio between the reference object and the first entity object based on first size information of the reference object in the first image and second size information of the first entity object in the first image, where the size information of each entity object in the first image includes: the first size information and the second size information; and determining fourth size information of the first physical object based on the first mapping proportion and third size information of the reference object; wherein the first entity object is other entity objects except the reference object in the at least two entity objects.
8. The apparatus of claim 7, wherein the determining module is further configured to determine an image region in the first image in which the first physical object is located before determining a first mapping ratio between the reference object and the first physical object based on first size information of the reference object in the first image and second size information of the first physical object in the first image; and taking the vertex coordinate information of the image area as second size information of the first entity object.
9. The apparatus of claim 8, wherein the first size information includes N pieces of first vertex coordinate information, the first vertex coordinate information includes first coordinate information corresponding to each coordinate axis in a first coordinate system, N is an integer greater than 2, and the first coordinate system is a coordinate system corresponding to the first image; the second size information includes M pieces of second vertex coordinate information, the second vertex coordinate information includes second coordinate information corresponding to each coordinate axis in the first coordinate system, and M is an integer greater than 2;
the determining module is specifically configured to calculate a mapping ratio between the reference object and the first entity object under the first coordinate axis based on first coordinate information corresponding to the first coordinate axis under the first coordinate system in each of the first vertex coordinate information and second coordinate information corresponding to the first coordinate axis in each of the second vertex coordinate information; and obtaining the first mapping proportion based on the mapping proportion between the reference object and the first entity object under each coordinate axis of the first coordinate system, wherein the first coordinate axis is one coordinate axis in the first coordinate system.
10. The apparatus of claim 9, wherein the third size information of the reference object includes parameters of at least two first size elements of the reference object;
the determining module is specifically configured to determine fourth size information of a first size element of the first entity object based on a mapping ratio between the reference object and the first entity object under the first coordinate axis and a real parameter of the first size element corresponding to the first coordinate axis;
wherein the first size element is one of the at least two first size elements, and the fourth size information of the first physical object includes real parameters of the at least two first size elements of the first physical object; the size element includes at least one of: length, height, and width.
11. The apparatus of claim 7, wherein the determining module is further configured to determine, before the determining the fourth size information of the first physical object based on the first mapping ratio and the third size information of the reference object, a confidence level of the reference object and the first physical object in the first image based on text information of the reference object in the first image, object feature information of the reference object in the first image, a difference value between a pre-stored mapping ratio between the reference object and the first physical object and the first mapping ratio, the confidence level being used to characterize whether the reference object and the first physical object in the first image are real objects;
The determining module is specifically configured to determine fourth size information of the first entity object based on the first mapping proportion and third size information of the reference object when the confidence coefficient is greater than or equal to a first threshold.
12. The apparatus of claim 7, wherein the image recognition apparatus further comprises: a processing module;
the identification module is specifically configured to input the first image in an application interface of a first application, so as to identify the first image;
the processing module is further configured to, after the determining module determines fourth size information of the first entity object based on the first mapping proportion and the third size information of the reference object, retrieve the first entity object in the first application based on the fourth size information of the first entity object, and display a retrieval result.
13. An electronic device comprising a processor, a memory and a program or instruction stored on the memory and executable on the processor, which when executed by the processor, implements the steps of the image recognition method of any one of claims 1 to 6.
14. A readable storage medium, characterized in that the readable storage medium has stored thereon a program or instructions which, when executed by a processor, implement the steps of the image recognition method according to any one of claims 1 to 6.
CN202311316239.5A 2023-10-11 2023-10-11 Image recognition method, device, electronic equipment and storage medium Pending CN117315756A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311316239.5A CN117315756A (en) 2023-10-11 2023-10-11 Image recognition method, device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311316239.5A CN117315756A (en) 2023-10-11 2023-10-11 Image recognition method, device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117315756A true CN117315756A (en) 2023-12-29

Family

ID=89236923

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311316239.5A Pending CN117315756A (en) 2023-10-11 2023-10-11 Image recognition method, device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117315756A (en)

Similar Documents

Publication Publication Date Title
US11917288B2 (en) Image processing method and apparatus
KR102285915B1 (en) Real-time 3d gesture recognition and tracking system for mobile devices
US12001479B2 (en) Video processing method, video searching method, terminal device, and computer-readable storage medium
CN111241340A (en) Video tag determination method, device, terminal and storage medium
CN113395542A (en) Video generation method and device based on artificial intelligence, computer equipment and medium
CN112036331A (en) Training method, device and equipment of living body detection model and storage medium
CN111429338B (en) Method, apparatus, device and computer readable storage medium for processing video
CN111080747B (en) Face image processing method and electronic equipment
CN112511743B (en) Video shooting method and device
CN112818733B (en) Information processing method, device, storage medium and terminal
CN111274476B (en) House source matching method, device, equipment and storage medium based on face recognition
CN112887615A (en) Shooting method and device
US20230199297A1 (en) Selectively using sensors for contextual data
CN116563588A (en) Image clustering method and device, electronic equipment and storage medium
CN113342157A (en) Eyeball tracking processing method and related device
CN113965550B (en) Intelligent interactive remote auxiliary video system
CN117315756A (en) Image recognition method, device, electronic equipment and storage medium
CN113298593A (en) Commodity recommendation and image detection method, commodity recommendation and image detection device, commodity recommendation and image detection equipment and storage medium
CN117097982B (en) Target detection method and system
CN115097903B (en) MR glasses control method and device, MR glasses and storage medium
Enrique III et al. Integrated Visual-Based ASL Captioning in Videoconferencing Using CNN
CN116887038A (en) Image generation method, device, electronic equipment and readable storage medium
CN117336579A (en) Shooting method, shooting device, electronic equipment and storage medium
CN117453635A (en) Image deletion method, device, electronic equipment and readable storage medium
CN117528179A (en) Video generation method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination