WO2021179498A1 - Target detection method, method and apparatus for training its model, and electronic device - Google Patents
Target detection method, method and apparatus for training its model, and electronic device
- Publication number: WO2021179498A1 (application PCT/CN2020/100704)
- Authority: WIPO (PCT)
- Prior art keywords: target, area, actual, point, target detection
Classifications
- G06T7/70 — Image analysis: determining position or orientation of objects or cameras
- G06T7/11 — Image analysis: segmentation; region-based segmentation
- G06T7/62 — Image analysis: analysis of geometric attributes of area, perimeter, diameter or volume
- G06T2207/20081 — Indexing scheme for image analysis or enhancement: training; learning
Definitions
- This application relates to the field of artificial intelligence technology, and in particular to a target detection method, and a method, apparatus, and electronic device for training its model.
- Existing neural network models generally perform target detection based on anchor-based matching or anchor-free strategies.
- In actual use, these strategies still suffer from a high false detection rate.
- The embodiments of the present application provide a target detection method, and a method, apparatus, and electronic device for training its model.
- An embodiment of the application provides a method for training a target detection model, including: obtaining a sample image, where the sample image is marked with the actual position information of the actual area where the target is located; taking several points in the sample image as detection points, and determining at least one detection point as a positive sample point of the target based on the distance between each detection point and a preset point of the actual area; performing target detection on the sample image with the target detection model to obtain the predicted area information corresponding to each positive sample point; determining the loss value of the target detection model using the actual position information and the predicted area information; and adjusting the parameters of the target detection model based on the loss value of the target detection model.
- In some embodiments, the sample image contains multiple targets. Taking several points in the sample image as detection points and determining at least one detection point as a positive sample point of the target based on the distance between each detection point and the preset point of the actual area includes: down-sampling the sample image to obtain multiple feature maps with different resolutions; grouping the actual areas of the multiple targets with the multiple feature maps based on the size of each actual area, where a larger actual area and a lower-resolution feature map are placed in the same group; for a feature map and an actual area in the same group, taking each point in the feature map as a detection point; and determining at least one detection point as a positive sample point of the target based on the distance between each detection point and the preset point of the actual area.
- In some embodiments, there are m feature maps. Grouping the actual areas of the multiple targets with the multiple feature maps based on the size of each actual area includes: calculating the area of the actual area of each target; dividing the range between the maximum and minimum areas into m intervals sorted from small to large; arranging the m feature maps in descending order of resolution; and grouping the actual area of a target whose area falls in the i-th interval with the i-th feature map, where i and m are positive integers and i ranges from 1 to m.
- In some embodiments, determining at least one detection point as a positive sample point of the target includes: obtaining the distance between each detection point and the preset point of the actual area; and selecting at least one detection point whose distance to the preset point meets a preset condition as a positive sample point of the target.
- In some embodiments, determining at least one detection point whose distance to the preset point meets the preset condition as a positive sample point of the target includes: determining the first several detection points closest to the preset point as positive sample points of the target.
- In some embodiments, the predicted area information includes the predicted position information of the predicted area corresponding to the positive sample point and the prediction confidence of the predicted area. Using the actual position information and the predicted area information to determine the loss value of the target detection model includes: using the actual position information and predicted position information of each target to obtain a position loss value; using the prediction confidence to obtain a confidence loss value; and determining the loss value of the target detection model based on the position loss value and the confidence loss value.
- In some embodiments, the actual position information includes the actual area size of the actual area, and the predicted position information includes the predicted area size of the predicted area. Using the actual position information and predicted position information of each target to obtain the position loss value includes: using the actual area size and the predicted area size of each target to obtain an area size loss value; and determining the position loss value based on the area size loss value.
- In some embodiments, the actual position information also includes the preset point position of the actual area, and the predicted position information also includes the predicted offset information between the positive sample point of the predicted area and the preset point of the actual area. Using the actual position information and predicted position information of each target to obtain the position loss value further includes: calculating the actual offset information between the preset point position of the actual area of the target and the corresponding positive sample point position; and using the actual offset information and the predicted offset information to obtain an offset loss value. Determining the position loss value based on the area size loss value then includes: determining the position loss value based on the area size loss value and the offset loss value.
- In some embodiments, the method further includes taking the remaining detection points as negative sample points. Performing target detection on the sample image with the target detection model to obtain the predicted area information corresponding to each positive sample point then includes: performing target detection on the sample image with the target detection model to obtain the predicted area information corresponding to each positive sample point and each negative sample point. Using the prediction confidence to obtain the confidence loss value includes: using the prediction confidence corresponding to the positive sample points and the prediction confidence corresponding to the negative sample points to obtain the confidence loss value.
- In some embodiments, the sample image is a two-dimensional image or a three-dimensional image, the actual area is an actual bounding box, and the predicted area is a predicted bounding box.
- Setting the sample image as a two-dimensional image enables target detection on two-dimensional images, and setting it as a three-dimensional image enables target detection on three-dimensional images.
- An embodiment of the present application provides a target detection method, including: acquiring an image to be tested; and performing target detection on the image to be tested with a target detection model to obtain the target area information corresponding to the target in the image to be tested, where the target detection model is obtained by the training method of the target detection model in the above first aspect.
- the embodiment of the application provides a training device for a target detection model, including an image acquisition module, a sample selection module, a target detection module, a loss determination module, and a parameter adjustment module.
- the image acquisition module is configured to acquire sample images;
- the sample selection module is configured to take several points in the sample image as detection points, and determine at least one detection point as a positive sample point of the target based on the distance between each detection point and the preset point of the actual area;
- the target detection module is configured to perform target detection on the sample image using the target detection model to obtain the predicted area information corresponding to each positive sample point;
- the loss determination module is configured to use the actual position information and the predicted area information to determine the loss value of the target detection model;
- the parameter adjustment module is configured to adjust the parameters of the target detection model based on the loss value of the target detection model.
- the embodiment of the application provides a target detection device, which includes an image acquisition module and a target detection module.
- the image acquisition module is configured to acquire the image to be tested;
- the target detection module is configured to perform target detection on the image to be tested using the target detection model to obtain the target area information corresponding to the target in the image to be tested, where the target detection model is obtained by the above-mentioned training device for the target detection model.
- An embodiment of the present application provides an electronic device including a memory and a processor coupled to each other, and the processor is configured to execute program instructions stored in the memory to implement the above-mentioned target detection model training method or the above-mentioned target detection method.
- An embodiment of the present application provides a computer-readable storage medium storing program instructions. When the program instructions are executed by a processor, the above-mentioned training method for the target detection model, or the above-mentioned target detection method, is implemented.
- An embodiment of the present application provides a computer program including computer-readable code. When the computer-readable code runs in an electronic device, a processor in the electronic device implements the training method of the target detection model in any of the above embodiments, or the above target detection method.
- The target detection method, model training method, apparatus, and electronic device provided by the embodiments of the present application take several points in a sample image as detection points and, based on the distance between each detection point and a preset point of the actual area, determine at least one detection point as a positive sample point of the target. The target detection model then performs target detection on the sample image to obtain the predicted area information corresponding to each positive sample point, and the loss value of the target detection model is determined from the actual position information of the actual area of the target in the sample image and the predicted position information included in the predicted area information; the parameters of the target detection model are adjusted based on this loss value. Because the model is trained with the predicted position information corresponding to multiple matched positive sample points, the recall rate can be ensured without designing anchor boxes; and because the parameters are adjusted based on a loss value related to position information, the precision can be ensured, thereby improving the accuracy of target detection.
- FIG. 1 is a schematic diagram of a network architecture for training a target detection model and its application according to an embodiment of the application;
- FIG. 2 is a schematic flowchart of a method for training a target detection model provided by an embodiment of the application
- FIG. 3 is a schematic diagram of the implementation process of step S22 in the training method of the target detection model provided by the embodiment of the application;
- FIG. 4 is a schematic flowchart of a target detection method provided by an embodiment of the application.
- FIG. 5 is a schematic diagram of some predicted area information obtained by the target detection method provided by an embodiment of the application.
- FIG. 6 is a schematic flowchart of another target detection method provided by an embodiment of the application.
- FIG. 7 is a schematic structural diagram of a training device for a target detection model provided by an embodiment of the application.
- FIG. 8 is a schematic structural diagram of a target detection device provided by an embodiment of the application.
- FIG. 9 is a schematic structural diagram of an electronic device provided by an embodiment of the application.
- FIG. 10 is a schematic structural diagram of a computer-readable storage medium provided by an embodiment of the application.
- The terms "system" and "network" in this article are often used interchangeably.
- The term "and/or" in this article merely describes an association between objects and indicates that three relationships are possible; for example, "A and/or B" can mean: A alone, both A and B, or B alone.
- The character "/" in this text generally indicates an "or" relationship between the associated objects before and after it.
- "Multiple" in this document means two or more.
- FIG. 1 is a schematic diagram of a network architecture provided by an embodiment of the application.
- the network architecture includes a CT machine 11, a computer device 12, and a server 13, where the CT machine 11 is used to collect original images.
- the CT machine 11 establishes a communication connection with the computer device 12, the CT machine 11 can send the obtained original image to the computer device 12, and the computer device 12 performs processing such as marking the original image to obtain a sample image.
- the server 13 stores the sample image
- the computer device 12 also establishes a communication connection with the server 13, and the computer device 12 can directly obtain the sample image from the server 13.
- After the computer device 12 obtains the sample image, it adjusts the parameters of the target detection model based on the sample image.
- the computer device 12 receives the image to be tested, and the computer device 12 obtains target area information corresponding to the target in the image to be tested based on the target detection model.
- After the server 13 obtains the sample image, it adjusts the parameters of the target detection model stored on the server 13 based on the sample image.
- the computer device 12 sends the image to be tested to the server 13 so that the server 13 can obtain target area information corresponding to the target in the image to be tested based on the target detection model. After obtaining the target area information, the server 13 returns the target area information to the computer device 12.
- With reference to the application scenario shown in FIG. 1, the following describes various embodiments of the target detection method, its model training method, apparatus, and electronic device.
- An embodiment of the application provides a method for training a target detection model.
- the method is applied to a training device for a target detection model.
- The training device for the target detection model may be the computer device 12 shown in FIG. 1, or the server 13 in FIG. 1.
- the method provided in the embodiment of the present application may be implemented by a computer program, and when the computer program is executed, each step in the training method of the target detection model provided in the embodiment of the present application is completed.
- the computer program may be executed by the processor of the training device of the target detection model.
- FIG. 2 is a schematic flowchart of a method for training a target detection model provided by an embodiment of the application. As shown in FIG. 2, the method for training a target detection model may include the following steps:
- Step S21 Obtain a sample image.
- the sample image is marked with the actual position information of the actual area where the target is located.
- the actual area may be an actual bounding box (Bounding Box), for example, the actual bounding box of the target, and the actual bounding box may be a rectangular box, which is not limited here.
- The actual position information may include the position information of a preset point of the actual area (for example, the center point of the actual area) and the size of the actual area (for example, the length and width of the actual bounding box).
- In some implementation scenarios, in order to implement target detection on a two-dimensional image, the sample image may be a two-dimensional image; in other implementation scenarios, in order to implement target detection on a three-dimensional image, the sample image may be a three-dimensional image, which is not limited here.
- To apply target detection to the field of medical imaging, the sample image may be a medical image, and the medical image may be a CT (Computed Tomography) image or an MR (Magnetic Resonance) image, which is not limited here.
- The target in the sample image may be a biological organ, for example, the pituitary gland or the pancreas; or the target may be diseased tissue, for example, a lacunar infarction or a hematoma, which is not limited here.
- Other cases can be deduced by analogy and are not enumerated here.
- Step S22 Taking several points in the sample image as detection points, at least one detection point is determined as the positive sample point of the target based on the distance between each detection point and the preset point of the actual area.
- In some embodiments, the distance between each detection point and the preset point of the actual area can be obtained, so that at least one detection point whose distance to the preset point meets a preset condition can be determined as a positive sample point of the target.
- At least some of the detection points whose distance to the preset point is less than a preset distance threshold can be selected as positive sample points of the target; for example, detection points whose distance to the preset point is less than 5 pixels, or less than 8 pixels, which is not limited here.
- The first several detection points closest to the preset point may also be determined as positive sample points of the target.
- For example, the first 10, the first 20, or the first 30 closest detection points may be used, which is not limited here.
- Because at least one detection point is determined as a positive sample point for each target, every actual area is matched to the same number of positive sample points, which helps ensure gradient balance between targets of different sizes and, in turn, helps improve the accuracy of target detection.
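As a sketch of this matching step, selecting a fixed number of positive sample points per target can be done by ranking detection points by their distance to the preset point. The function name and the value of k below are illustrative assumptions, not part of the embodiment:

```python
import math

def select_positive_samples(detection_points, preset_point, k=3):
    # Rank the detection points by Euclidean distance to the preset point
    # (e.g. the center point of the actual area); the k closest points
    # become the target's positive sample points. Using the same k for
    # every target keeps the number of positive samples balanced across
    # targets of different sizes.
    px, py = preset_point
    ranked = sorted(
        range(len(detection_points)),
        key=lambda i: math.hypot(detection_points[i][0] - px,
                                 detection_points[i][1] - py),
    )
    return ranked[:k]
```

For instance, with detection points at (0, 0), (1, 0), (5, 5), and (0.5, 0.5) and the preset point at the origin, the two nearest points are those at indices 0 and 3.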
- Step S23 Use the target detection model to perform target detection on the sample image to obtain prediction area information corresponding to each positive sample point.
- the prediction area information corresponding to each positive sample point includes the prediction position information of the prediction area corresponding to the positive sample point.
- To clarify the range of the predicted area, the predicted area may be a predicted bounding box, and the predicted bounding box may be a rectangle, which is not limited here.
- In order to uniquely represent a predicted bounding box, the predicted area information may include the position information of a preset point of the predicted area (for example, the center point of the predicted area) and the size of the predicted area (for example, the length and width of the predicted bounding box).
- Step S24 Use the actual position information and the predicted area information to determine the loss value of the target detection model.
- The predicted area information may also include the prediction confidence of the predicted area, which indicates the reliability of the predicted area: the higher the prediction confidence, the more reliable the predicted area.
- The actual position information and predicted position information of each target are used to obtain the position loss value, the prediction confidence is used to obtain the confidence loss value, and the loss value of the target detection model is obtained based on the position loss value and the confidence loss value.
- At least one of the binary cross-entropy loss function, the mean square error loss function, and the L1 loss function may be used to calculate the loss value, which is not limited here.
- The L1 loss function, also known as Least Absolute Deviation (LAD) or Least Absolute Error (LAE), is defined as $L_1 = \sum_{i=1}^{m} \left| y^{(i)} - \hat{y}^{(i)} \right|$, where m represents the number of positive sample points, $y^{(i)}$ is the target value, and $\hat{y}^{(i)}$ is the estimated value.
- The L2 loss function, also known as Least Square Error (LSE), is defined as $L_2 = \sum_{i=1}^{m} \left( y^{(i)} - \hat{y}^{(i)} \right)^2$, with m, $y^{(i)}$, and $\hat{y}^{(i)}$ defined as above.
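As a minimal sketch (the function names are assumptions), the two loss functions above can be written directly as sums over the m positive sample points:

```python
def l1_loss(targets, estimates):
    # Least Absolute Error: sum of absolute differences between the
    # target values y(i) and the estimated values over m sample points.
    return sum(abs(y - y_hat) for y, y_hat in zip(targets, estimates))

def l2_loss(targets, estimates):
    # Least Square Error: sum of squared differences over m sample points.
    return sum((y - y_hat) ** 2 for y, y_hat in zip(targets, estimates))
```

For target values [1, 2, 3] and estimates [1, 1, 5], the L1 loss is 0 + 1 + 2 = 3 and the L2 loss is 0 + 1 + 4 = 5.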
- The actual position information may also include the actual area size of the actual area, and the predicted area information may also include the predicted area size of the predicted area. The actual area size and predicted area size of each target may then be used to obtain the area size loss value, and the position loss value is determined based on the area size loss value.
- A position loss weight corresponding to the position loss value and a confidence loss weight corresponding to the confidence loss value can be preset, and the position loss value and the confidence loss value are weighted with these weights to obtain the loss value of the target detection model.
- The actual position information may also include the preset point position of the actual area, and the predicted position information may also include the predicted offset information between the positive sample point of the predicted area and the preset point of the actual area. The actual offset information between the preset point position of the actual area of the target and the corresponding positive sample point position can then be calculated; the offset loss value is obtained using the actual offset information and the predicted offset information, and the position loss value is determined based on the area size loss value and the offset loss value.
- The actual area size and the predicted area size of each target can be compared using the IoU (Intersection over Union) loss function or the L1 loss function to obtain the area size loss value. IoU is the ratio of the intersection to the union of the actual area and the predicted area. When the L1 loss function is used, it calculates the length difference and/or the width difference between the predicted bounding box and the actual bounding box; for details, refer to the related steps above.
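The IoU used here can be sketched for axis-aligned rectangular boxes as follows; the (x1, y1, x2, y2) box layout is an assumed convention, and the corresponding IoU loss is commonly taken as 1 − IoU:

```python
def iou(box_a, box_b):
    # Boxes as (x1, y1, x2, y2). IoU is the ratio of the intersection
    # area to the union area of the actual and predicted regions.
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

For two unit-overlap 2*2 boxes, for instance (0, 0, 2, 2) and (1, 1, 3, 3), the intersection is 1 and the union is 7, giving an IoU of 1/7.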
- For example, suppose the preset point (such as the center point) of the actual area is at (38, 37.5) and the category is human; one positive sample point is at (37.5, 37.5), the size of the predicted area output by the detection model is 10*15, the predicted offset information is (offset-x, offset-y), and the prediction confidence for the category is, for example, 0.9 in one case and 0.2 in another, from which the target's confidence loss can be calculated. The actual offset information between the preset point position of the actual area and the corresponding positive sample point position is (0.5, 0.1).
- If the target is a small target, for example with an actual area size of 0.02*0.04, the above offset is larger than the size of the actual area, which leads to a large deviation in target detection. Calculating the loss on the offset and training on it can therefore make the predicted offset close or equal to the actual offset.
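The offset terms in this step can be sketched as follows; the coordinates used below are hypothetical, and the L1 form of the offset loss is one of the loss function options named earlier:

```python
def actual_offset(preset_point, sample_point):
    # Actual offset information: displacement from the positive sample
    # point to the preset (e.g. center) point of the actual area.
    return (preset_point[0] - sample_point[0],
            preset_point[1] - sample_point[1])

def offset_loss(actual, predicted):
    # L1-style offset loss between actual and predicted offsets,
    # summed over the x and y components.
    return abs(actual[0] - predicted[0]) + abs(actual[1] - predicted[1])
```

Training drives the predicted offset toward the actual offset, at which point the offset loss reaches zero.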
- To further improve the accuracy of the confidence loss value, and thereby the accuracy of target detection, detection points other than the positive sample points can be used as negative sample points. The target detection model performs target detection on the sample image to obtain the predicted area information corresponding to each positive sample point and each negative sample point, and the confidence loss value is then obtained using the prediction confidence corresponding to the positive sample points and that corresponding to the negative sample points.
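Using binary cross-entropy (one of the loss functions named earlier), the confidence loss over positive and negative sample points can be sketched as follows; the averaging and the eps clamp are implementation assumptions:

```python
import math

def confidence_loss(pos_conf, neg_conf, eps=1e-7):
    # Binary cross-entropy over predicted confidences: positive sample
    # points are pushed toward confidence 1, negative sample points
    # toward confidence 0. eps guards against log(0).
    loss = 0.0
    for p in pos_conf:
        loss -= math.log(max(p, eps))
    for q in neg_conf:
        loss -= math.log(max(1.0 - q, eps))
    return loss / max(len(pos_conf) + len(neg_conf), 1)
```

A perfectly confident model (positive confidences of 1, negative confidences of 0) yields a loss of zero.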
- Step S25 Adjust the parameters of the target detection model based on the loss value of the target detection model.
- the parameters of the target detection model can be adjusted.
- the parameters of the target detection model may include, but are not limited to: the weight of the convolutional layer of the target detection model.
- After the parameters are adjusted, step S23 and subsequent steps may be executed again until the loss value meets the preset training end condition.
- The preset training end condition may include: the loss value of the target detection model being less than a preset loss threshold, or the loss value of the target detection model no longer decreasing.
- In this way, the target detection model performs target detection on the sample image to obtain the predicted area information corresponding to each positive sample point; the loss value of the target detection model is determined from the actual position information of the actual area where the target is located in the sample image and the predicted position information included in the predicted area information; and the parameters of the target detection model are adjusted based on this loss value. Training the target detection model with the predicted position information corresponding to the multiple matched positive sample points makes it possible to ensure the recall rate without designing anchor boxes, while adjusting the parameters based on a loss value related to position information ensures the precision, thereby improving the accuracy of target detection.
- FIG. 3 is a schematic flowchart of step S22 in the method for training a target detection model provided by an embodiment of the application.
- the sample image may include multiple targets, and step S22 may be implemented through the following steps:
- Step S221 down-sampling the sample image to obtain multiple feature maps corresponding to different resolutions.
- A Feature Pyramid Network (FPN) may be used to down-sample the sample image to obtain multiple feature maps with different resolutions.
- The above-mentioned FPN may be part of the target detection model, so that inputting the sample image into the target detection model yields multiple feature maps with different resolutions. Taking a 128*128 sample image as an example, down-sampling it can produce feature maps with resolutions of 4*4, 8*8, 16*16, and so on, which is not limited here.
- Each feature point in the 4*4 feature map corresponds to a 32*32 pixel area of the sample image, each feature point in the 8*8 feature map corresponds to a 16*16 pixel area of the sample image, and each feature point in the 16*16 feature map corresponds to an 8*8 pixel area of the sample image.
- the feature maps of other resolutions can be deduced by analogy, so we will not give examples one by one here.
- Step S222: Group the actual areas of the multiple targets with the multiple feature maps based on the size of the actual area of each target.
- the actual area with a larger size and the feature map with a smaller resolution are regarded as the same group.
- for example, the actual area sizes of multiple targets in the sample image are 16*32, 11*22, 10*20, and 5*10. The actual area of size 16*32 and the feature map with a resolution of 4*4 can be divided into the same group; the actual areas of sizes 11*22 and 10*20 and the feature map with a resolution of 8*8 can be divided into the same group; and the actual area of size 5*10 and the feature map with a resolution of 16*16 can be divided into the same group, which is not limited here.
- in order to accurately group the actual areas of multiple targets with the multiple feature maps, the area of the actual area of each target can be calculated, and the range between the maximum value and the minimum value of the areas can be divided into m intervals sorted from small to large, where m is the number of feature maps. The m feature maps are arranged in order of resolution from large to small, and the actual area of a target whose area belongs to the i-th interval is divided into the same group as the i-th feature map, where i and m are positive integers and i is a value between 0 and m.
- for example, the number m of feature maps with different resolutions is 3, the actual area sizes of multiple targets in the sample image are 16*32, 11*22, 10*20, and 5*10 respectively, and their areas are 512, 242, 200, and 50 respectively; the range between the minimum value 50 and the maximum value 512 is divided into 3 intervals, namely 50~204, 204~358, and 358~512.
- the 4*4, 8*8, and 16*16 resolution feature maps are sorted in descending order of resolution as: the 16*16 resolution feature map, the 8*8 resolution feature map, and the 4*4 resolution feature map. The actual areas of the targets whose areas belong to the first interval (i.e. 50~204) are the 10*20 actual area and the 5*10 actual area, so these two are divided into the same group as the first feature map (i.e. the 16*16 resolution feature map); the actual area of the target whose area belongs to the second interval (i.e. 204~358) is the 11*22 actual area, so it is divided into the same group as the second feature map (i.e. the 8*8 resolution feature map); and the actual area of the target whose area belongs to the third interval (i.e. 358~512) is the 16*32 actual area, so it is divided into the same group as the third feature map (i.e. the 4*4 resolution feature map).
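The interval-based grouping walked through above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name is invented, NumPy is used for convenience, and boxes receive 0-based feature-map indices, with index 0 the highest-resolution map, matching the descending-resolution ordering in the text.

```python
import numpy as np

def group_boxes_by_area(box_sizes, num_feature_maps):
    """Assign each ground-truth box to a feature map by area interval.

    box_sizes: list of (w, h) sizes of the actual areas.
    num_feature_maps: m, the number of feature maps, assumed sorted
    from highest to lowest resolution, so larger areas map to
    lower-resolution (later) feature maps.
    Returns a 0-based feature-map index for each box.
    """
    areas = np.array([w * h for w, h in box_sizes], dtype=float)
    lo, hi = areas.min(), areas.max()
    # Split [min area, max area] into m equal intervals, small to large.
    edges = np.linspace(lo, hi, num_feature_maps + 1)
    # Interval i (1-based in the text) corresponds to feature map i.
    idx = np.searchsorted(edges, areas, side="left") - 1
    return np.clip(idx, 0, num_feature_maps - 1).tolist()
```

With the sizes from the example (16*32, 11*22, 10*20, 5*10 and m = 3), this reproduces the grouping above: the 512-area box lands in the third interval (the 4*4 map), the 242-area box in the second (the 8*8 map), and the two smallest in the first (the 16*16 map).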
- other sample images can be deduced by analogy.
- Step S223: For the feature map and the actual area of the target in the same group, take each point in the feature map as a detection point, and based on the distance between each detection point and the preset point of the actual area, select at least one detection point as a positive sample point of the target.
- the position coordinates of each detection point in the sample image can be determined according to the position coordinates of the detection point in the feature map and the resolution of the feature map, so as to calculate the distance between the detection point and the preset point of the actual area.
- each feature point in the 4*4 feature map is used as a detection point. Because each feature point of the feature map with a resolution of 4*4 corresponds to a 32*32 area in the 128*128 sample image, the detection point (1,1) corresponds to (16,16) in the sample image, the detection point (1,2) corresponds to (16,48), the detection point (1,3) corresponds to (16,80), and the detection point (1,4) corresponds to (16,112); likewise, the detection point (2,1) corresponds to (48,16), the detection point (2,2) corresponds to (48,48), the detection point (2,3) corresponds to (48,80), and the detection point (2,4) corresponds to (48,112).
- the Euclidean distance can be used to calculate the distance from each detection point to the preset point of the actual area; the distances of the above detection points are respectively: 16, 16, 48, 80, 35.78, 35.78, 57.69, and 86.16.
- Other detection points can be deduced by analogy, so we will not give examples one by one here.
- taking a target whose actual area size is 16*32 as an example, the positive sample points can be the feature points (1,1), (1,2), (2,1), and (2,2) in the feature map with a resolution of 4*4. Other cases can be deduced by analogy, so examples will not be given one by one here.
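Using the numbers from this example (a 4*4 feature map over a 128*128 image, with the preset point of the actual area taken as its center at (16,32)), the mapping from feature points to image coordinates and the selection of the k closest detection points as positives can be sketched as follows. The function name and the choice of top-k selection with k = 4 are illustrative; the later loss-calculation description also uses a first-k rule.

```python
import math

def select_positive_samples(feat_size, image_size, center, k=4):
    """Map each feature point to sample-image coordinates and pick
    the k detection points closest (Euclidean distance) to the
    preset point of the actual area as positive samples.

    feat_size: feature-map resolution (e.g. 4 for a 4*4 map).
    image_size: sample-image side length (e.g. 128).
    center: (row, col) preset point of the actual area.
    Returns a list of ((i, j), distance) pairs, 1-based indices.
    """
    stride = image_size // feat_size  # each point covers stride*stride pixels
    cands = []
    for i in range(1, feat_size + 1):
        for j in range(1, feat_size + 1):
            # Feature point (i, j) maps to the center of its pixel area,
            # e.g. (1,1) -> (16,16) when stride is 32.
            y = stride * (i - 1) + stride // 2
            x = stride * (j - 1) + stride // 2
            cands.append(((i, j), math.hypot(y - center[0], x - center[1])))
    cands.sort(key=lambda t: t[1])
    return cands[:k]
```

Run with the example values, this yields (1,1) and (1,2) at distance 16 and (2,1) and (2,2) at distance 35.78, matching the positive sample points named above.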
- in the above manner, multiple feature maps corresponding to different resolutions are obtained, the actual areas of the multiple targets are grouped with the multiple feature maps based on the size of the actual area of each target, and the larger actual areas and the lower-resolution feature maps are placed in the same group. For the feature map and the actual area of the target in the same group, each point of the feature map is used as a detection point for the selection of positive sample points based on the distance between each detection point and the preset point of the actual area.
- since each point in the feature map of each group can be used as a detection point for the selection of positive sample points, this helps to generate as many positive sample points as possible, which in turn helps to ensure the recall rate and improve the accuracy of target detection.
- An embodiment of the application provides a target detection method.
- the method is applied to a target detection device.
- the target detection device may be a computer device.
- the method provided in the embodiment of the application may be implemented by a computer program; when the computer program runs, each step in the target detection method provided in the embodiment of the present application is completed.
- the computer program may be executed by the processor of the target detection device.
- FIG. 4 is a schematic flow chart of the target detection method provided in an embodiment of this application. As shown in FIG. 4, the target detection method may include the following steps:
- Step S41 Obtain an image to be tested.
- in order to implement target detection on a two-dimensional image, the image to be tested may be a two-dimensional image. In other implementation scenarios, in order to implement target detection on a three-dimensional image, the image to be tested may be a three-dimensional image, which is not limited here.
- the image to be tested may be a medical image, for example, a CT (Computed Tomography) image or an MR (Magnetic Resonance) image.
- the target in the image to be tested can be a biological organ, such as the pituitary gland or pancreas; or the target in the image to be tested can also be a diseased tissue, such as a lacunar infarction or hematoma, which is not limited here.
- other cases can be deduced by analogy, so examples will not be given one by one here.
- Step S42 Use the target detection model to perform target detection on the image to be tested, and obtain target area information corresponding to the target in the image to be tested.
- the target detection model is obtained through the steps in any of the above-mentioned embodiments of the training method of the target detection model; for details, refer to the steps in any of the foregoing target detection model training method embodiments.
- the target detection model is used to perform target detection on the image to be tested to obtain the prediction area information corresponding to each detection point, where the prediction area information corresponding to each detection point includes the prediction confidence and the predicted position information of the prediction area corresponding to the detection point. Based on the prediction confidence and predicted position information of the prediction area corresponding to each detection point, non-maximum suppression (NMS) is adopted to obtain the target area information corresponding to the target in the image to be tested.
- FIG. 5 is a schematic diagram of several prediction area information obtained by the target detection method provided by an embodiment of the application.
- prediction area 01 to prediction area 05 are the prediction areas corresponding to the detection points, and the detection yields the following prediction confidences: prediction area 01 is 0.6, prediction area 02 is 0.9, prediction area 03 is 0.8, prediction area 04 is 0.9, and prediction area 05 is 0.8.
- the regions are arranged in ascending order of prediction confidence: prediction area 01, prediction area 03, prediction area 05, prediction area 02, prediction area 04. The prediction area 04 with the highest prediction confidence is selected, and the predicted position information is used to determine whether the IoU of each of prediction area 01, prediction area 03, prediction area 05, and prediction area 02 with prediction area 04 is greater than a preset IoU threshold (for example, 60%); if so, the area is discarded.
- the IoU of prediction area 05 with prediction area 04 is relatively large, assumed to be 85%, so prediction area 05 is discarded; the IoU of prediction area 01 and of prediction area 03 with prediction area 04 is 0, so both are retained.
- prediction area 04 is taken as a target area corresponding to a target. Then, from the remaining prediction areas 01 to 03, the prediction area 02 with the highest prediction confidence is selected, and based on the predicted position information it is determined whether the IoU of prediction area 01 and of prediction area 03 with prediction area 02 is greater than the preset IoU threshold (for example, 60%); if so, the area is discarded.
- assuming that the IoUs of prediction area 01 and prediction area 03 with prediction area 02 are 65% and 70% respectively, prediction area 01 and prediction area 03 are discarded, and prediction area 02 is retained as a target area corresponding to a target.
- other situations can be deduced by analogy, so examples will not be given one by one here.
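The greedy NMS procedure walked through above can be sketched as follows. This is a minimal illustration; the (x1, y1, x2, y2) box representation and the function names are assumptions, as the text does not fix a box encoding.

```python
def nms(boxes, scores, iou_thresh=0.6):
    """Greedy non-maximum suppression on axis-aligned boxes.

    boxes: list of (x1, y1, x2, y2); scores: prediction confidences.
    Returns indices of the kept boxes, highest confidence first.
    """
    def iou(a, b):
        # Intersection rectangle, clipped to zero when boxes are disjoint.
        ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
        ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
        if inter == 0:
            return 0.0
        area_a = (a[2] - a[0]) * (a[3] - a[1])
        area_b = (b[2] - b[0]) * (b[3] - b[1])
        return inter / float(area_a + area_b - inter)

    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Discard remaining boxes whose IoU with the kept box exceeds
        # the preset threshold (e.g. 60% in the example above).
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_thresh]
    return keep
```

For instance, with two heavily overlapping boxes and one disjoint box, the lower-scoring overlap is suppressed and the other two boxes are kept as target areas.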
- in the training process of the target detection model, in order to improve the accuracy of the model, and in particular the detection accuracy for small targets, the predicted position information may also include the predicted offset information between the positive sample point of the prediction area and the preset point of the actual area. In this way, the actual offset information between the preset point position of the actual area of the target and the corresponding positive sample point position can be calculated, an offset loss value can be obtained from the actual offset information and the predicted offset information, and a position loss value can then be obtained based on the area size loss value and the offset loss value; the position loss value can be used to adjust the parameters of the target detection model.
- the obtained target area information may also include the offset information (offset-x, offset-y) between the target area and the detection point (x0, y0), so the position of the target in the image to be tested can be expressed as (x0+offset-x, y0+offset-y), and the target category can be determined based on the detected category confidence. For example, if the category confidence that the target is a person is 0.9 and the category confidence that the target is a cat is 0.1, it can be determined that the detected target is a person.
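The decoding step just described can be sketched as follows; the function name and category labels are illustrative, not from the patent.

```python
def decode_detection(detection_point, offsets, class_scores, class_names):
    """Recover the target position and category from the outputs.

    detection_point: (x0, y0) of the detection point in the image.
    offsets: (offset_x, offset_y) predicted for that point.
    class_scores: per-class confidences aligned with class_names.
    """
    x0, y0 = detection_point
    # Position of the target: detection point shifted by the offsets.
    position = (x0 + offsets[0], y0 + offsets[1])
    # Category: the class with the highest category confidence.
    best = max(range(len(class_scores)), key=lambda i: class_scores[i])
    return position, class_names[best]
```

With the example numbers from the text (confidence 0.9 for "person", 0.1 for "cat"), the decoded category is "person", and the position is the detection point plus the predicted offset.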
- the target area information may also include the size (for example, length and width) of the target area.
- the target detection method provided in the embodiments of the present application can improve the accuracy of target detection by using the target detection model obtained by the target detection model training method in each of the foregoing embodiments to perform target detection on the image to be tested.
- an embodiment of the present application further provides a target detection method, the method includes:
- Step S61 Pass the acquired image to be tested through the FPN network to obtain feature maps of different resolutions.
- step S62 the feature maps of different resolutions are grouped.
- each gt box (corresponding to the actual area in each of the above embodiments) is grouped with the feature maps of different resolutions according to the size of the gt box.
- the feature maps with higher resolution are responsible for detecting small targets, and the feature maps with lower resolution are responsible for detecting large targets.
- when calculating the loss function, the detection points are first sorted for each gt box according to the distance from the detection point to the center point of the gt box, and the first k detection points are selected as the positive sample points of the gt box; the remaining points are the negative sample points of the gt box.
- the IoU loss is used to regress the height (H) and width (W) of the box according to the corresponding positive sample points, and the L1 loss function is used to regress the offsets of the corresponding positive sample points.
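The two loss terms named above can be sketched as follows. This is an assumed formulation, not the patent's exact equations: the IoU on width/height treats the predicted and gt boxes as concentric (so the intersection uses element-wise minima), the IoU loss is taken as the negative log of the IoU, and the offset term is a plain L1 distance.

```python
import math

def size_and_offset_loss(pred_wh, gt_wh, pred_offset, gt_offset):
    """IoU loss on box width/height plus L1 loss on the offset
    from the positive sample point to the gt-box center.

    pred_wh, gt_wh: (w, h) of the predicted and gt boxes.
    pred_offset, gt_offset: predicted and actual (dx, dy) offsets.
    """
    pw, ph = pred_wh
    gw, gh = gt_wh
    # Concentric boxes: intersection is min(w)*min(h).
    inter = min(pw, gw) * min(ph, gh)
    union = pw * ph + gw * gh - inter
    iou_loss = -math.log(inter / union)  # 0 when the sizes match exactly
    # L1 regression of the offset of the positive sample point.
    l1_loss = sum(abs(p - g) for p, g in zip(pred_offset, gt_offset))
    return iou_loss + l1_loss
```

When the predicted size and offset equal the ground truth, the loss is zero; halving the predicted width adds -log(0.5) to the IoU term.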
- Step S63: In the inference process, based on the grouping, an NMS operation is used to remove duplicate detection boxes.
- the method provided in the embodiments of the present application has enough positive samples to ensure the recall rate. At the same time, since each gt box matches the same number of positive samples, the gradient balance between targets of different sizes in the classification loss can be guaranteed.
- the IoU loss is used to regress the H and W of the gt box, and the L1 loss is used to calculate the offset value (offset) from the positive sample point to the actual gt box center point, so as to obtain more accurate position information.
- FIG. 6 is a schematic diagram of the processing process of an image to be tested in medical imaging based on an embodiment of this application. As shown in FIG. 6, the acquired image to be tested is passed through the FPN network to obtain feature maps of different resolutions; the feature maps of different resolutions are grouped to obtain each group 602; and based on each group 602, an NMS operation is used to remove duplicate detection boxes, obtaining an image 603 of the disease location. In this way, the detection accuracy is improved and false positives are reduced.
- FIG. 7 is a schematic structural diagram of a training device for a target detection model provided by an embodiment of the present application.
- the training device 70 for a target detection model includes: an image acquisition module 71, a sample selection module 72, a target detection module 73, a loss determination module 74, and a parameter adjustment module 75.
- the image acquisition module 71 is configured to acquire a sample image, wherein the sample image is marked with the actual position information of the actual area where the target is located;
- the sample selection module 72 is configured to take several points in the sample image as detection points, and based on the distance between each detection point and the preset point of the actual area, determine at least one detection point as a positive sample point of the target;
- the target detection module 73 is configured to use the target detection model to perform target detection on the sample image to obtain the prediction area information corresponding to each positive sample point;
- the loss determination module 74 is configured to use the actual position information and the prediction area information to determine the loss value of the target detection model;
- the parameter adjustment module 75 is configured to adjust the parameters of the target detection model based on the loss value of the target detection model.
- the training device for the target detection model takes several points in the sample image as detection points, and determines at least one detection point as a positive sample point of the target based on the distance between each detection point and the preset point of the actual area; the target detection model is used to perform target detection on the sample image to obtain the prediction area information corresponding to each positive sample point, and the actual position information of the actual area where the target is located in the sample image and the prediction area information are used to determine the loss value of the target detection model.
- since the target detection model is trained based on the predicted position information corresponding to the multiple positive sample points obtained by the matching, the recall rate can be ensured without designing anchor boxes; and since the parameters of the target detection model are adjusted based on a loss value related to the position information, the accuracy can be ensured, thereby improving the accuracy of target detection.
- the sample image contains multiple targets
- the sample selection module 72 includes a down-sampling sub-module configured to down-sample the sample image to obtain multiple feature maps corresponding to different resolutions
- the sample selection module 72 also includes a grouping sub-module configured to group the actual areas of multiple targets with multiple feature maps based on the size of the actual area of each target, where the actual area with a larger size and the feature map with a smaller resolution are regarded as the same group;
- the sample selection module 72 also includes a sample selection sub-module configured to, for the feature map and the actual area of the target in the same group, determine each point in the feature map as a detection point, and based on the distance between each detection point and the preset point of the actual area, perform the step of determining at least one detection point as a positive sample point of the target.
- in this way, the actual areas of the multiple targets are grouped with the multiple feature maps based on the size of the actual area of each target, with larger actual areas and lower-resolution feature maps placed in the same group, and for the feature map and the actual area of the target in the same group, each point of the feature map is used as a detection point for the step of selecting at least one detection point as a positive sample point of the target based on the distance between each detection point and the preset point of the actual area. On the one hand, the high-resolution feature maps can be responsible for small targets and the low-resolution feature maps for large targets, which helps to achieve multi-scale target detection; on the other hand, each point of the feature map in each group can be used as a detection point for selecting positive sample points, which helps to generate as many positive sample points as possible, thereby helping to ensure the recall rate and improve the accuracy of target detection.
- the grouping sub-module includes an interval division part configured to calculate the area of the actual area of each target and divide the range between the maximum value and the minimum value of the areas into m intervals sorted from small to large, where m is the number of feature maps.
- the grouping sub-module also includes a group division part configured to arrange the m feature maps in order of resolution from large to small and divide the actual area of a target whose area belongs to the i-th interval and the i-th feature map into the same group, where i and m are positive integers and i is a value between 0 and m.
- in this way, the range between the maximum value and the minimum value of the areas is divided into m intervals sorted from small to large, with m equal to the number of feature maps; the m feature maps are sorted in descending order of resolution, and the actual area of a target whose area belongs to the i-th interval and the i-th feature map are divided into the same group, so that larger actual areas and lower-resolution feature maps form the same group, which helps to achieve multi-scale target detection and thus improve the accuracy of target detection.
- the sample selection module 72 further includes a distance calculation sub-module configured to obtain the distance between each detection point and the preset point of the actual area, and a distance judgment sub-module configured to determine at least one detection point whose distance from the preset point meets a preset condition as a positive sample point of the target.
- the distance judgment sub-module is configured to use the first k detection points closest to the preset point as the positive sample points of the target.
- in this way, each actual area can be matched to the same number of positive sample points, which helps to ensure the gradient balance between targets of different sizes and thus improve the accuracy of target detection.
- the prediction area information includes the prediction location information of the prediction area corresponding to the positive sample point and the prediction confidence of the prediction area
- the loss determination module 74 includes a location loss value calculation sub-module configured to use the actual location information and predicted location information of each target to obtain the location loss value.
- the loss determination module 74 also includes a confidence loss value calculation sub-module configured to use the prediction confidence to obtain the confidence loss value, and a model loss value calculation sub-module configured to determine the loss value of the target detection model based on the location loss value and the confidence loss value.
- in this way, the actual location information and predicted location information of each target are used to obtain the location loss value, and the prediction confidence is used to obtain the confidence loss value, so that the loss value of the target detection model is obtained based on the location loss value and the confidence loss value, which can ensure the accuracy of the loss value calculation in the training process and thus help improve the accuracy of target detection.
- the actual location information includes the actual area size of the actual area
- the predicted location information includes the predicted area size of the predicted area
- the location loss value calculation sub-module includes an area size loss value calculation part configured to use the actual area size and the predicted area size of each target to obtain the area size loss value.
- the position loss value calculation sub-module includes a position loss value calculation part, which is configured to determine the position loss value based on the area size loss value.
- in this way, the actual area size and predicted area size of each target are used to obtain the area size loss value, and the position loss value is obtained based on the area size loss value, which can improve the accuracy of the loss value, further ensure the accuracy of the loss value calculation in the training process, and in turn help improve the accuracy of target detection.
- the actual position information further includes the preset point position of the actual area, and the predicted position information also includes the predicted offset information between the positive sample point of the prediction area and the preset point of the actual area. The area size loss value calculation part is also configured to calculate the actual offset information between the preset point position of the actual area of the target and the corresponding positive sample point position, and use the actual offset information and the predicted offset information to obtain the offset loss value.
- the position loss value calculation part is further configured to determine the position loss value based on the area size loss value and the offset loss value.
- the sample selection module 72 further includes a negative sample selection sub-module configured to use the remaining detection points as negative sample points
- the target detection module 73 is configured to perform target detection on the sample image using the target detection model.
- the confidence loss value calculation sub-module is configured to use the prediction confidence corresponding to the positive sample points and the prediction confidence corresponding to the negative sample points to obtain the confidence loss value.
- in this way, the prediction area information corresponding to each positive sample point and each negative sample point is used to obtain the confidence loss value, which helps to improve the accuracy of the confidence loss value and is thus conducive to improving the accuracy of target detection.
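A confidence loss over positive and negative sample points can be sketched with a binary cross-entropy, one common choice; the patent does not fix the exact function, and the names here are illustrative.

```python
import math

def confidence_loss(pos_conf, neg_conf):
    """Mean binary cross-entropy over positive and negative points.

    pos_conf: predicted confidences of the positive sample points.
    neg_conf: predicted confidences of the negative sample points.
    """
    eps = 1e-7  # numerical floor so log never sees zero
    loss = 0.0
    for p in pos_conf:   # positives should predict confidence -> 1
        loss -= math.log(max(p, eps))
    for p in neg_conf:   # negatives should predict confidence -> 0
        loss -= math.log(max(1.0 - p, eps))
    return loss / max(len(pos_conf) + len(neg_conf), 1)
```

The loss is zero when positives predict 1 and negatives predict 0, and grows as predictions drift toward the wrong label.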
- the sample image is a two-dimensional image or a three-dimensional image
- the actual area is an actual bounding box
- the predicted area is a predicted bounding box
- setting the sample image as a two-dimensional image can achieve target detection on the two-dimensional image
- setting the sample image as a three-dimensional image can achieve target detection on the three-dimensional image
- FIG. 8 is a schematic structural diagram of a target detection device provided by an embodiment of the application.
- the target detection device 80 includes an image acquisition module 81 and a target detection module 82.
- the image acquisition module 81 is configured to acquire an image to be tested;
- the target detection module 82 is configured to use the target detection model to perform target detection on the image to be tested and obtain the target area information corresponding to the target in the image to be tested, where the target detection model is obtained by the training device for the target detection model in any of the above-mentioned training device embodiments.
- the target detection device provided by the embodiment of the present application uses the target detection model obtained by the training device for the target detection model in the foregoing embodiments to perform target detection on the image to be tested, which can improve the accuracy of target detection.
- FIG. 9 is a schematic structural diagram of an electronic device provided by an embodiment of the application.
- the electronic device 90 includes a memory 91, a processor 92, and a communication bus 93 that are coupled to each other, and the processor 92 is configured to execute the program instructions stored in the memory 91 to implement the steps of any of the foregoing target detection model training method embodiments, or the steps of any of the foregoing target detection method embodiments.
- the electronic device 90 may include but is not limited to: a microcomputer and a server.
- the electronic device 90 may also include mobile devices such as a notebook computer and a tablet computer, which are not limited herein.
- the processor 92 is configured to control itself and the memory 91 to implement the steps of any of the foregoing target detection model training method embodiments, or implement the steps of any of the foregoing target detection method embodiments.
- the communication bus 93 is configured to connect the memory 91 and the processor 92.
- the processor 92 may also be referred to as a CPU (Central Processing Unit, central processing unit).
- the processor 92 may be an integrated circuit chip with signal processing capability.
- the processor 92 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or another programmable logic device, discrete gate or transistor logic device, or discrete hardware component.
- the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
- in addition, the processor 92 may be jointly implemented by multiple integrated circuit chips.
- the above solution can train the target detection model based on the predicted position information corresponding to the multiple positive sample points obtained by the matching, so that the recall rate can be ensured without designing anchor boxes; and adjusting the parameters of the target detection model based on the loss value can ensure accuracy, thereby improving the accuracy of target detection.
- FIG. 10 is a schematic structural diagram of a computer-readable storage medium provided by an embodiment of the application.
- the computer-readable storage medium 100 stores program instructions 101 that can be executed by a processor, and the program instructions 101 are configured to implement the steps of any of the foregoing target detection model training method embodiments or target detection method embodiments.
- the above solution can train the target detection model based on the predicted position information corresponding to the multiple positive sample points obtained by the matching, so that the recall rate can be ensured without designing anchor boxes; and adjusting the parameters of the target detection model based on the loss value can ensure accuracy, thereby improving the accuracy of target detection.
- the disclosed method and device can be implemented in other ways.
- the device implementation described above is only illustrative. For example, the division of modules or units is only a logical function division, and there may be other divisions in actual implementation; for example, units or components can be combined or integrated into another system, or some features can be ignored or not implemented.
- the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
- the units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place, or they may be distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of this embodiment.
- the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
- the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
- the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer readable storage medium.
- the technical solution of the present application, in essence, or the part that contributes to the existing technology, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions to make a computer device (which can be a personal computer, a server, or a network device, etc.) or a processor execute all or part of the steps of the methods in the various embodiments of the present application.
- the aforementioned storage media include: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other media that can store program code.
- the embodiment of the application discloses a target detection method and its model training method, device and electronic equipment.
- the training method of the target detection model includes: obtaining a sample image, wherein the sample image is annotated with actual position information of the actual area where the target is located; taking several points in the sample image as detection points, and selecting at least one detection point as a positive sample point of the target based on the distance between each detection point and a preset point of the actual area; performing target detection on the sample image using the target detection model to obtain prediction area information corresponding to each positive sample point, where the prediction area information corresponding to each positive sample point includes predicted position information of the prediction area corresponding to that positive sample point; determining the loss value of the target detection model using the actual position information and the prediction area information; and adjusting the parameters of the target detection model based on the loss value of the target detection model.
- Target detection based on this model can improve the accuracy of target detection.
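As an illustration of the training procedure summarized above, the following minimal sketch selects positive sample points by their distance to a preset point (here, the center) of the annotated actual area and computes a simple position loss. This is not the patented implementation; the function name, the top-k selection rule, and the L1 size loss are all assumptions for illustration only.

```python
import numpy as np

def select_positive_points(points, preset_point, k=3):
    """Pick the k detection points closest to the preset point
    (here the center of the annotated actual area)."""
    dists = np.linalg.norm(points - preset_point, axis=1)
    return np.argsort(dists)[:k]

# Detection points: every cell of an 8x8 feature-map grid.
ys, xs = np.meshgrid(np.arange(8), np.arange(8), indexing="ij")
points = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(float)

# Actual area annotated as (cx, cy, w, h); the preset point is its center.
box = np.array([3.2, 4.1, 2.0, 2.0])
pos_idx = select_positive_points(points, box[:2], k=3)

# Position loss at the positive sample points: L1 between predicted
# and actual area size (predictions here are random stand-ins).
rng = np.random.default_rng(0)
pred_sizes = rng.random((len(points), 2)) * 4.0
size_loss = float(np.abs(pred_sizes[pos_idx] - box[2:]).mean())
```

The remaining detection points would then serve as negative sample points for a confidence loss, in the spirit of claim 9 below.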
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Geometry (AREA)
- Image Analysis (AREA)
Abstract
Description
Claims (25)
- A training method for a target detection model, comprising: acquiring a sample image, wherein the sample image is annotated with actual position information of an actual area where a target is located; taking several points in the sample image as detection points, and determining at least one of the detection points as a positive sample point of the target based on the distance between each detection point and a preset point of the actual area; performing target detection on the sample image using the target detection model to determine prediction area information corresponding to each positive sample point; determining a loss value of the target detection model using the actual position information and the prediction area information; and adjusting parameters of the target detection model based on the loss value of the target detection model.
- The training method according to claim 1, wherein the sample image contains a plurality of the targets; the taking several points in the sample image as detection points and determining at least one of the detection points as a positive sample point of the target based on the distance between each detection point and the preset point of the actual area comprises: down-sampling the sample image to obtain a plurality of feature maps of different resolutions; grouping the actual areas of the plurality of targets with the plurality of feature maps based on the sizes of the actual areas of the targets, wherein an actual area of larger size and a feature map of lower resolution are placed in the same group; for the feature map and the actual areas of targets in the same group, determining each point in the feature map as a detection point; and determining at least one of the detection points as a positive sample point of the target based on the distance between each detection point and the preset point of the actual area.
- The training method according to claim 2, wherein there are m feature maps; the grouping the actual areas of the plurality of targets with the plurality of feature maps based on the sizes of the actual areas of the targets comprises: calculating the area of the actual area of each target, and dividing the range between the maximum and the minimum of the areas into m intervals sorted from small to large; arranging the m feature maps in descending order of resolution, and assigning the actual area of a target whose area falls within the i-th interval and the i-th feature map to the same group; wherein i and m are positive integers, and i is a value between 0 and m.
- The training method according to any one of claims 1 to 3, wherein the determining at least one of the detection points as a positive sample point of the target based on the distance between each detection point and the preset point of the actual area comprises: obtaining the distance between each detection point and the preset point of the actual area; and determining at least one detection point whose distance to the preset point satisfies a preset condition as a positive sample point of the target.
- The training method according to claim 4, wherein the determining at least one detection point whose distance to the preset point satisfies a preset condition as a positive sample point of the target comprises: determining the several detection points closest to the preset point as positive sample points of the target.
- The training method according to claim 1, wherein the prediction area information includes predicted position information of the prediction area corresponding to the positive sample point and a predicted confidence of the prediction area; the determining the loss value of the target detection model using the actual position information and the prediction area information comprises: obtaining a position loss value using the actual position information and the predicted position information of each target; obtaining a confidence loss value using the predicted confidence; and determining the loss value of the target detection model based on the position loss value and the confidence loss value.
- The training method according to claim 6, wherein the actual position information includes an actual area size of the actual area, and the predicted position information includes a predicted area size of the prediction area; the obtaining a position loss value using the actual position information and the predicted position information of each target comprises: obtaining an area size loss value using the actual area size and the predicted area size of each target; and determining the position loss value based on the area size loss value.
- The training method according to claim 7, wherein the actual position information further includes a preset point position of the actual area, and the predicted position information further includes predicted offset information between the positive sample point of the prediction area and the preset point of the actual area; the obtaining a position loss value using the actual position information and the predicted position information of each target further comprises: calculating actual offset information between the preset point position of the actual area of the target and the corresponding positive sample point position; and obtaining an offset loss value using the actual offset information and the predicted offset information; the determining the position loss value based on the area size loss value comprises: determining the position loss value based on the area size loss value and the offset loss value.
- The training method according to claim 6, further comprising, after the selecting at least one of the detection points as a positive sample point of the target based on the distance between each detection point and the preset point of the actual area: taking the remaining detection points as negative sample points; the performing target detection on the sample image using the target detection model to obtain the prediction area information corresponding to each positive sample point comprises: performing target detection on the sample image using the target detection model to obtain the prediction area information corresponding to each positive sample point and the prediction area information corresponding to each negative sample point; and the obtaining a confidence loss value using the predicted confidence comprises: obtaining the confidence loss value using the predicted confidence corresponding to the positive sample points and the predicted confidence corresponding to the negative sample points.
- The training method according to claim 1, wherein the sample image is a two-dimensional image or a three-dimensional image, the actual area is an actual bounding box, and the prediction area is a predicted bounding box.
- A target detection method, comprising: acquiring an image to be detected; and performing target detection on the image to be detected using a target detection model to obtain target area information corresponding to a target in the image to be detected; wherein the target detection model is obtained by the training method for a target detection model according to any one of claims 1 to 10.
- A training apparatus for a target detection model, comprising: an image acquisition module, configured to acquire a sample image, wherein the sample image is annotated with actual position information of an actual area where a target is located; a sample selection module, configured to take several points in the sample image as detection points and determine at least one of the detection points as a positive sample point of the target based on the distance between each detection point and a preset point of the actual area; a target detection module, configured to perform target detection on the sample image using the target detection model to determine prediction area information corresponding to each positive sample point; a loss determination module, configured to determine a loss value of the target detection model using the actual position information and the prediction area information; and a parameter adjustment module, configured to adjust parameters of the target detection model based on the loss value of the target detection model.
- The training apparatus for a target detection model according to claim 12, wherein the sample image contains a plurality of the targets; the sample selection module comprises: a down-sampling sub-module, configured to down-sample the sample image to obtain a plurality of feature maps of different resolutions; a grouping sub-module, configured to group the actual areas of the plurality of targets with the plurality of feature maps based on the sizes of the actual areas of the targets, wherein an actual area of larger size and a feature map of lower resolution are placed in the same group; and a selection sub-module, configured to, for the feature map and the actual areas of targets in the same group, determine each point in the feature map as a detection point, and determine at least one of the detection points as a positive sample point of the target based on the distance between each detection point and the preset point of the actual area.
- The training apparatus for a target detection model according to claim 13, wherein there are m feature maps; the grouping sub-module comprises: an interval division part, configured to calculate the area of the actual area of each target and divide the range between the maximum and the minimum of the areas into m intervals sorted from small to large; and a group division part, configured to arrange the m feature maps in descending order of resolution and assign the actual area of a target whose area falls within the i-th interval and the i-th feature map to the same group; wherein i and m are positive integers, and i is a value between 0 and m.
- The training apparatus for a target detection model according to any one of claims 12 to 14, wherein the sample selection module further comprises: a distance calculation sub-module, configured to obtain the distance between each detection point and the preset point of the actual area; and a distance judgment sub-module, configured to determine at least one detection point whose distance to the preset point satisfies a preset condition as a positive sample point of the target.
- The training apparatus for a target detection model according to claim 15, wherein the distance judgment sub-module is further configured to determine the several detection points closest to the preset point as positive sample points of the target.
- The training apparatus for a target detection model according to claim 12, wherein the prediction area information includes predicted position information of the prediction area corresponding to the positive sample point and a predicted confidence of the prediction area; the loss determination module comprises: a position loss value calculation sub-module, configured to obtain a position loss value using the actual position information and the predicted position information of each target; a confidence loss value calculation sub-module, configured to obtain a confidence loss value using the predicted confidence; and a model loss value calculation sub-module, configured to determine the loss value of the target detection model based on the position loss value and the confidence loss value.
- The training apparatus for a target detection model according to claim 17, wherein the actual position information includes an actual area size of the actual area, and the predicted position information includes a predicted area size of the prediction area; the position loss value calculation sub-module comprises: an area size loss value calculation part, configured to obtain an area size loss value using the actual area size and the predicted area size of each target; and a position loss value calculation part, configured to determine the position loss value based on the area size loss value.
- The training apparatus for a target detection model according to claim 18, wherein the actual position information further includes a preset point position of the actual area, and the predicted position information further includes predicted offset information between the positive sample point of the prediction area and the preset point of the actual area; the area size loss value calculation part is further configured to calculate actual offset information between the preset point position of the actual area of the target and the corresponding positive sample point position, and to obtain an offset loss value using the actual offset information and the predicted offset information; and the position loss value calculation part is further configured to determine the position loss value based on the area size loss value and the offset loss value.
- The training apparatus for a target detection model according to claim 19, wherein the sample selection module further comprises: a negative sample selection sub-module, configured to take the remaining detection points as negative sample points; the target detection module is configured to perform target detection on the sample image using the target detection model to obtain the prediction area information corresponding to each positive sample point and the prediction area information corresponding to each negative sample point; and the confidence loss value calculation sub-module is configured to obtain the confidence loss value using the predicted confidence corresponding to the positive sample points and the predicted confidence corresponding to the negative sample points.
- The training apparatus for a target detection model according to claim 12, wherein the sample image is a two-dimensional image or a three-dimensional image, the actual area is an actual bounding box, and the prediction area is a predicted bounding box.
- A target detection apparatus, comprising: an image acquisition module, configured to acquire an image to be detected; and a target detection module, configured to perform target detection on the image to be detected using a target detection model to obtain target area information corresponding to a target in the image to be detected; wherein the target detection model is obtained by the training apparatus for a target detection model according to claim 12.
- An electronic device, comprising a memory and a processor coupled to each other, wherein the processor is configured to execute program instructions stored in the memory to implement the training method for a target detection model according to any one of claims 1 to 10, or to implement the target detection method according to claim 11.
- A computer-readable storage medium having program instructions stored thereon, wherein the program instructions, when executed by a processor, implement the training method for a target detection model according to any one of claims 1 to 10, or implement the target detection method according to claim 11.
- A computer program, comprising computer-readable code, wherein when the computer-readable code runs in an electronic device, a processor in the electronic device executes operations configured to implement the training method for a target detection model according to any one of claims 1 to 10, or the target detection method according to claim 11.
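Claims 2 and 3 above describe grouping ground-truth areas with feature maps by size: the range of areas is divided into m intervals, and the i-th interval is paired with the i-th feature map when the maps are ordered from high to low resolution, so larger targets are assigned to coarser maps. The following is a minimal sketch of that grouping rule; the equal-width intervals and all names are assumptions (the claims do not fix how the interval boundaries are chosen).

```python
import numpy as np

def group_by_area(areas, m):
    """Assign each actual area to one of m feature maps.

    The range [min(areas), max(areas)] is split into m intervals
    (equal-width here, an assumption); feature maps are ordered from
    highest resolution (index 0) to lowest (index m - 1), so larger
    areas are grouped with coarser feature maps.
    """
    edges = np.linspace(min(areas), max(areas), m + 1)
    # searchsorted yields the interval index; clip keeps the
    # maximum area inside the last interval.
    idx = np.searchsorted(edges, areas, side="right") - 1
    return np.clip(idx, 0, m - 1)

areas = [4.0, 9.0, 64.0, 400.0]      # w * h of each annotated box
groups = group_by_area(areas, m=3)   # smallest areas -> feature map 0
```

With these values the three small boxes fall in the first interval (feature map 0) and the largest box in the last (feature map 2); only the points of the grouped feature map are then used as detection points for those targets.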
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020217034041A KR20210141650A (ko) | 2020-03-11 | 2020-07-07 | Target detection method and training method, apparatus and electronic device for target detection model |
JP2021563131A JP2022529838A (ja) | 2020-03-11 | 2020-07-07 | Target detection method and its model training method, apparatus and electronic device |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010167104.7A CN111508019A (zh) | 2020-03-11 | 2020-03-11 | Target detection method and its model training method, and related apparatus and device |
CN202010167104.7 | 2020-03-11 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021179498A1 true WO2021179498A1 (zh) | 2021-09-16 |
Family
ID=71863905
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2020/100704 WO2021179498A1 (zh) | Target detection method and its model training method, apparatus and electronic device | 2020-03-11 | 2020-07-07 |
Country Status (5)
Country | Link |
---|---|
JP (1) | JP2022529838A (zh) |
KR (1) | KR20210141650A (zh) |
CN (1) | CN111508019A (zh) |
TW (1) | TW202135006A (zh) |
WO (1) | WO2021179498A1 (zh) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114663731A (zh) * | 2022-05-25 | 2022-06-24 | 杭州雄迈集成电路技术股份有限公司 | Training method and *** for license plate detection model, and license plate detection method and *** |
CN115205555A (zh) * | 2022-07-12 | 2022-10-18 | 北京百度网讯科技有限公司 | Method for determining similar images, training method, information determination method and device |
Families Citing this family (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112132206A (zh) * | 2020-09-18 | 2020-12-25 | 青岛商汤科技有限公司 | Image recognition method and related model training method, and related apparatus and device |
CN112328715B (zh) * | 2020-10-16 | 2022-06-03 | 浙江商汤科技开发有限公司 | Visual positioning method and related model training method, and related apparatus and device |
CN112232431A (zh) * | 2020-10-23 | 2021-01-15 | 携程计算机技术(上海)有限公司 | Watermark detection model training method, watermark detection method, ***, device and medium |
CN112348892A (zh) * | 2020-10-29 | 2021-02-09 | 上海商汤智能科技有限公司 | Point positioning method, and related apparatus and device |
CN112669293A (zh) * | 2020-12-31 | 2021-04-16 | 上海商汤智能科技有限公司 | Image detection method and detection model training method, and related apparatus and device |
CN113435260A (zh) * | 2021-06-07 | 2021-09-24 | 上海商汤智能科技有限公司 | Image detection method and related training method, related apparatus, device and medium |
CN113256622A (zh) * | 2021-06-28 | 2021-08-13 | 北京小白世纪网络科技有限公司 | Target detection method and apparatus based on three-dimensional images, and electronic device |
CN113642431B (zh) * | 2021-07-29 | 2024-02-06 | 北京百度网讯科技有限公司 | Training method and apparatus for target detection model, electronic device and storage medium |
CN113705672B (zh) * | 2021-08-27 | 2024-03-26 | 国网浙江省电力有限公司双创中心 | Threshold selection method for image target detection, ***, apparatus and storage medium |
US11967137B2 (en) * | 2021-12-02 | 2024-04-23 | International Business Machines Corporation | Object detection considering tendency of object location |
WO2024118670A1 (en) * | 2022-11-29 | 2024-06-06 | Merck Sharp & Dohme Llc | 3d segmentation of lesions in ct images using self-supervised pretraining with augmentation |
CN117557788B (zh) * | 2024-01-12 | 2024-03-26 | 国研软件股份有限公司 | Maritime target detection method and *** based on motion prediction |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109697460A (zh) * | 2018-12-05 | 2019-04-30 | 华中科技大学 | Object detection model training method and target object detection method |
US20190294177A1 (en) * | 2018-03-20 | 2019-09-26 | Phantom AI, Inc. | Data augmentation using computer simulated objects for autonomous control systems |
CN110598764A (zh) * | 2019-08-28 | 2019-12-20 | 杭州飞步科技有限公司 | Training method and apparatus for target detection model, and electronic device |
CN110599503A (zh) * | 2019-06-18 | 2019-12-20 | 腾讯科技(深圳)有限公司 | Detection model training method and apparatus, computer device and storage medium |
CN110827253A (zh) * | 2019-10-30 | 2020-02-21 | 北京达佳互联信息技术有限公司 | Training method and apparatus for target detection model, and electronic device |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP6431302B2 (ja) * | 2014-06-30 | 2018-11-28 | キヤノン株式会社 | Image processing apparatus, image processing method and program |
JP2017059207A (ja) * | 2015-09-18 | 2017-03-23 | Panasonic Intellectual Property Corporation of America | Image recognition method |
KR101879207B1 (ko) * | 2016-11-22 | 2018-07-17 | 주식회사 루닛 | Object recognition method and apparatus using weakly supervised learning |
CN108304761A (zh) * | 2017-09-25 | 2018-07-20 | 腾讯科技(深圳)有限公司 | Text detection method, apparatus, storage medium and computer device |
CN108229307B (zh) * | 2017-11-22 | 2022-01-04 | 北京市商汤科技开发有限公司 | Method, apparatus and device for object detection |
CN108710868B (zh) * | 2018-06-05 | 2020-09-04 | 中国石油大学(华东) | Human keypoint detection *** and method for complex scenes |
CN110084253A (zh) * | 2019-05-05 | 2019-08-02 | 厦门美图之家科技有限公司 | Method for generating an object detection model |
CN110298298B (zh) * | 2019-06-26 | 2022-03-08 | 北京市商汤科技开发有限公司 | Target detection and target detection network training method, apparatus and device |
-
2020
- 2020-03-11 CN CN202010167104.7A patent/CN111508019A/zh not_active Withdrawn
- 2020-07-07 JP JP2021563131A patent/JP2022529838A/ja active Pending
- 2020-07-07 KR KR1020217034041A patent/KR20210141650A/ko active Search and Examination
- 2020-07-07 WO PCT/CN2020/100704 patent/WO2021179498A1/zh active Application Filing
-
2021
- 2021-01-29 TW TW110103579A patent/TW202135006A/zh unknown
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114663731A (zh) * | 2022-05-25 | 2022-06-24 | 杭州雄迈集成电路技术股份有限公司 | Training method and *** for license plate detection model, and license plate detection method and *** |
CN115205555A (zh) * | 2022-07-12 | 2022-10-18 | 北京百度网讯科技有限公司 | Method for determining similar images, training method, information determination method and device |
CN115205555B (zh) * | 2022-07-12 | 2023-05-26 | 北京百度网讯科技有限公司 | Method for determining similar images, training method, information determination method and device |
Also Published As
Publication number | Publication date |
---|---|
CN111508019A (zh) | 2020-08-07 |
KR20210141650A (ko) | 2021-11-23 |
TW202135006A (zh) | 2021-09-16 |
JP2022529838A (ja) | 2022-06-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2021179498A1 (zh) | Target detection method and its model training method, apparatus and electronic device | |
US11049014B2 (en) | Learning apparatus, detecting apparatus, learning method, and detecting method | |
WO2020215672A1 (zh) | Medical image lesion detection and localization method, apparatus, device and storage medium | |
WO2021128825A1 (zh) | Three-dimensional target detection and model training method and apparatus, device, and storage medium | |
WO2021000423A1 (zh) | Pig weight measurement method and apparatus | |
US20180025249A1 (en) | Object Detection System and Object Detection Method | |
US9330336B2 (en) | Systems, methods, and media for on-line boosting of a classifier | |
CN110738235B (zh) | Pulmonary tuberculosis determination method, apparatus, computer device and storage medium | |
WO2023155494A1 (zh) | Image detection and training method, related apparatus, device, medium and program product | |
CN112614133B (zh) | Anchor-free three-dimensional pulmonary nodule detection model training method and apparatus | |
WO2023138190A1 (zh) | Training method for target detection model and corresponding detection method | |
CN110610472A (zh) | Computer apparatus and method for lung nodule image classification and detection | |
KR20200062589A (ko) | Apparatus and method for dementia prediction through brain-region segmentation of brain MRI images | |
WO2022257314A1 (zh) | Image detection method and related training method, related apparatus, device and medium | |
CN109448854A (zh) | Construction method and application of a pulmonary tuberculosis detection model | |
US20160171717A1 (en) | State estimation apparatus, state estimation method, integrated circuit, and non-transitory computer-readable storage medium | |
CN110533120B (zh) | Image classification method, apparatus, terminal and storage medium for organ nodules | |
WO2023092959A1 (zh) | Image segmentation method and its model training method, related apparatus and electronic device | |
CN113240699B (zh) | Image processing method and apparatus, model training method and apparatus, electronic device | |
CN112488178B (zh) | Network model training method and apparatus, image processing method and apparatus, device | |
JP7484492B2 (ja) | Radar-based posture recognition apparatus, method and electronic device | |
CN113192085A (zh) | Three-dimensional organ image segmentation method, apparatus and computer device | |
JP7239002B2 (ja) | Object count estimation apparatus, control method, and program | |
CN116912258B (zh) | Self-efficacy estimation method for lesion parameters in lung CT images | |
WO2023226793A1 (zh) | Mitral valve opening distance detection method, electronic device and storage medium | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
ENP | Entry into the national phase |
Ref document number: 20217034041 Country of ref document: KR Kind code of ref document: A |
|
ENP | Entry into the national phase |
Ref document number: 2021563131 Country of ref document: JP Kind code of ref document: A |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 20924072 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 20924072 Country of ref document: EP Kind code of ref document: A1 |
|
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 28.03.2023) |
|