CN115359511A - Pig abnormal behavior detection method - Google Patents

Pig abnormal behavior detection method

Info

Publication number
CN115359511A
Authority
CN
China
Prior art keywords
image
abnormal
representing
pigs
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210934696.XA
Other languages
Chinese (zh)
Inventor
杨秋妹
陈淼彬
肖德琴
刘啸虎
康俊琪
黄一桂
周家鑫
刘克坚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China Agricultural University
Original Assignee
South China Agricultural University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China Agricultural University
Priority to CN202210934696.XA
Publication of CN115359511A
Legal status: Pending

Classifications

    • G06V40/10 — Recognition of human or animal bodies, e.g. vehicle occupants or pedestrians; body parts, e.g. hands
    • G06N3/02, G06N3/08 — Computing arrangements based on biological models; neural networks; learning methods
    • G06V10/762 — Image or video recognition or understanding using clustering, e.g. of similar faces in social networks
    • G06V10/764 — Image or video recognition or understanding using classification, e.g. of video objects
    • G06V10/774 — Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
    • G06V10/806 — Fusion, i.e. combining data from various sources at the sensor, preprocessing, feature extraction or classification level, of extracted features
    • G06V10/82 — Image or video recognition or understanding using neural networks
    • G06V20/40 — Scenes; scene-specific elements in video content
    • G06V2201/07 — Indexing scheme relating to image or video recognition or understanding; target detection


Abstract

The invention provides a method for detecting abnormal behaviors of pigs, which comprises the following steps: S1: extracting images frame by frame from live videos of pigs acquired in real time; S2: carrying out target detection and cutting on each extracted frame image by adopting an improved Yolov5n model to obtain a target screenshot of each pig in each frame image; S3: extracting feature vectors through a double-flow convolution automatic encoder; S4: clustering and classifying the feature vectors by adopting K-means and a classification algorithm; S5: obtaining classification scores of all targets of the current frame through a classifier, and combining all the classification scores to form an abnormal prediction image; S6: performing Gaussian-filtered temporal smoothing on the abnormal prediction image, and recording the obtained highest classification score as the abnormal score of the current frame image; S7: judging whether the abnormal score of the current frame image is a positive number; if yes, there is no abnormal behavior, otherwise there is abnormal behavior. The method solves the problem that conventional anomaly detection methods cannot realize universal detection of abnormal pig behaviors.

Description

Method for detecting abnormal behaviors of pigs
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a pig abnormal behavior detection method.
Background
In animal husbandry, particularly in pig farms with enclosed breeding environments, infectious diseases spreading between animals severely damage animal welfare, easily lead to fatal infections, and cause significant economic losses to farmers. Realizing welfare breeding of live pigs requires not only providing a good living environment for the herd, but also continuously monitoring animal behavior so that abnormalities can be discovered as early as possible for timely diagnosis and treatment, thereby maximizing benefit.
The behavior of pigs reflects the welfare condition and social interaction of the animals, and is an important basis for analyzing pig health and managing healthy breeding. Close interactions between pigs may have a negative impact on pig health and reduce animal welfare. For example, mounting can occur in both boars and sows, especially during estrus, and typically consists of a pig placing both forepaws on the body or head of another pig that is lying down or rapidly dodging; this causes bruising, limping and leg fractures, which result in severe economic losses to the animal industry. Therefore, by monitoring the different abnormal behaviors of live pigs in time, the abnormal condition of the pigs can be evaluated, so that diseases are prevented or kept from spreading, and the welfare level of pig breeding is improved.
In recent years, owing to the remarkable performance of deep learning in the field of anomaly detection, a great deal of research has been conducted using neural-network-based methods. However, implementing pig behavior monitoring in a closed pig farm breeding environment presents formidable challenges to computer vision, such as confusion between different pigs due to visual similarity, sudden movements caused by aggressive behavior, frequent occlusion, and pigs crowding against each other. The training effect of abnormal behavior detection under supervised learning is susceptible to the distribution imbalance of video surveillance data sets, and its performance depends to a great extent on the availability and quality of manually annotated training data, which makes it poorly suited to video-based abnormal behavior detection. Therefore, from the perspective of unsupervised learning, many researchers have proposed abnormal behavior detection methods that adapt to the video data without requiring training on labeled data sets, and have achieved good results on various data sets. However, existing unsupervised methods mainly design targeted algorithms to identify specific abnormal behaviors such as attacking, tail biting and climbing; the drawback is that each specially designed algorithm detects only one abnormal behavior, so universal detection of abnormal pig behavior cannot be realized.
Disclosure of Invention
The invention provides a pig abnormal behavior detection method for overcoming the technical defect that the conventional abnormal detection method cannot realize the universal pig abnormal behavior detection.
In order to solve the technical problems, the technical scheme of the invention is as follows:
a method for detecting abnormal behaviors of pigs comprises the following steps:
s1: acquiring live videos of pigs in real time, and extracting images from the live videos of the pigs frame by frame;
s2: carrying out target detection and cutting on each extracted frame image by adopting an improved Yolov5n network model to respectively obtain a target screenshot of each pig in each frame image;
the improved Yolov5n network model is as follows: a channel attention module is added after the 4th, 6th and 8th layers of the backbone feature extraction network of the existing Yolov5n network model and is spliced with the upsampling layers at the 18th, 22nd and 26th layers of the neck network, and a C3 layer and a channel attention module are added after the 11th layer of the backbone feature extraction network;
s3: constructing an end-to-end trainable double-flow convolution automatic encoder network based on an object as a center, extracting appearance characteristic vectors and motion characteristic vectors of all pigs in a target screenshot, and performing characteristic fusion to form characteristic vectors of corresponding frames;
the double-current convolution automatic encoder network only adopts images of normal behaviors of pigs for training;
s4: clustering the fusion characteristic vectors by adopting a K-means clustering algorithm, and inputting the result into a binary classifier for training to obtain a trained classifier;
s5: in each frame of image, obtaining classification scores of all target screenshots in a current frame of image through a classifier, and combining all classification scores to form an abnormal prediction image of the current frame of image;
s6: performing Gaussian filtering time sequence smoothing on the abnormal prediction image of the current frame image, and recording the obtained highest classification score as the abnormal score of the current frame image;
s7: judging whether the abnormal score of the current frame image is a positive number;
if yes, the pigs in the current frame image have no abnormal behaviors;
if not, the pigs in the current frame image have abnormal behaviors.
According to the scheme, the improved Yolov5n network model is adopted to perform target detection and cutting on each frame image to obtain the target screenshot of each pig, so that all pigs can be effectively detected in an actual pig farm environment with complex occlusion; the target screenshots are then classified by a classifier trained only on images of normal pig behavior to obtain classification scores, from which the abnormal score of each frame image is obtained, and abnormal behavior detection of the pigs is finally realized according to the abnormal score. This compensates for the lack of training data of actual abnormal behaviors and allows abnormal pig behaviors to be accurately identified.
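For orientation, the following is a minimal end-to-end sketch of steps S1-S7 in Python. The detector, feature encoder and classifier objects and their method names are illustrative assumptions introduced here for readability; they are not interfaces defined in this disclosure.

```python
# Hedged sketch of the detection pipeline (S1-S7); all object interfaces are assumed.
import cv2
import numpy as np
from scipy.ndimage import gaussian_filter1d

def detect_abnormal_frames(video_path, detector, encoder, classifiers):
    """detector: improved Yolov5n wrapper; encoder: dual-stream autoencoder wrapper;
    classifiers: the k trained binary classifiers (trained as in step S4)."""
    cap = cv2.VideoCapture(video_path)
    frame_scores, prev = [], None
    while True:
        ok, frame = cap.read()                        # S1: extract images frame by frame
        if not ok:
            break
        crops = detector.detect_and_crop(frame)       # S2: one target screenshot per pig
        target_scores = []
        for crop in crops:
            feat = encoder.extract(crop, prev)        # S3: fused appearance + motion features
            scores = [c.decision_function([feat])[0] for c in classifiers]  # S5: k scores
            target_scores.append(max(scores))         # highest score for this target
        # the per-target scores of a frame form its abnormal prediction map (S5)
        frame_scores.append(max(target_scores) if target_scores else 0.0)
        prev = frame
    cap.release()
    smoothed = gaussian_filter1d(np.array(frame_scores), sigma=2)  # S6: temporal smoothing
    return smoothed <= 0                              # S7: non-positive score -> abnormal frame
```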
Preferably, the channel attention module comprises compression, excitation and scaling operations; wherein,
the compression operation is: compressing the dimension H × W × C of the original feature layer to 1 × 1 × C using global average pooling;
the excitation operation is: fusing the feature map information of each feature channel by using two fully connected layers, and then normalizing the weights with a Sigmoid function;
the scaling operation is: mapping the weights output by the excitation operation into a set of per-channel weights, and multiplying them with the features of the original feature map, thereby recalibrating the original features in the channel dimension.
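As an illustration of the compression, excitation and scaling operations just described, a minimal SE-style channel attention block can be sketched in PyTorch as follows; the reduction ratio and layer sizes are assumptions made for the example, not values specified here.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-excitation-scale block in the style described above (sizes assumed)."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.squeeze = nn.AdaptiveAvgPool2d(1)        # compression: H x W x C -> 1 x 1 x C
        self.excite = nn.Sequential(                  # excitation: two fully connected layers
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),                             # normalise the channel weights
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.squeeze(x).view(b, c)
        w = self.excite(w).view(b, c, 1, 1)
        return x * w                                  # scaling: recalibrate features per channel
```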
Preferably, the improvement of the Yolov5n network model further comprises adding a 64-fold down-sampling detection layer, so that the scale of the output feature map is 20 × 20.
Preferably, in step S5, a target screenshot is selected from the current frame image, the feature vectors of the selected target screenshot are extracted and clustered into k clusters through step S3, then the clustering results are respectively input into k classifiers to obtain k classification scores, the highest classification score is selected as the abnormal score of the selected target screenshot, and the step is repeated until the abnormal classification scores of all target screenshots in the current frame image are obtained.
Preferably, the classifier is a binary classifier, and the ith binary classifier is defined as follows:
$$g_i(x)=\sum_{j=1}^{m} w_j\,x_j + b$$
wherein $w_j$ represents the weight vector, b represents the bias value, x represents a sample input to the binary classifier, which can be classified as a normal sample or an abnormal sample, $x_j$ represents the j-th element of the sample, and m represents the dimension of x.
Preferably, the k binary classifiers are trained by:
a1: selecting images of normal pig behaviors from the live videos of the pigs as training images;
a2: carrying out target detection and cutting on the training image by adopting an improved Yolov5n network model to respectively obtain target screenshots of all pigs in the training image;
a3: converting the target screenshot into a gray image, and subtracting a pixel value of an adjacent frame image of the training image to obtain a corresponding frame difference image;
a4: respectively taking the gray frame image and the gray frame difference image obtained in the step A3 as the input of the appearance sub-network and the action sub-network in the object-centric convolution automatic encoder network for abnormal pig behaviors, and extracting the appearance characteristic vector and the action characteristic vector of each pig in the target screenshot through the network;
the auto-encoder network comprises an appearance sub-network for extracting appearance feature vectors from the target screenshots and an action sub-network for extracting action feature vectors from the frame difference images;
a5: fusing the appearance characteristic vector and the action characteristic vector to obtain a fused characteristic vector of the training image;
a6: performing k-means clustering on the fusion characteristic vectors to obtain clustering results cluster_i, i = 1, 2, …, k;
a7: inputting the clustering results into k binary classifiers to obtain k trained binary classifiers (a training sketch follows below).
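A hedged sketch of steps A3-A7 is given below. The use of OpenCV, scikit-learn's KMeans and a linear SVM as the binary classifier, as well as the encoder interfaces, are illustrative assumptions; the text above only requires grayscale and frame-difference inputs, a dual-stream encoder, k-means clustering and k binary classifiers.

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import LinearSVC

def frame_difference(crop_t, crop_prev):
    """A3: grayscale target screenshot and its frame-difference image (absdiff assumed)."""
    gray_t = cv2.cvtColor(crop_t, cv2.COLOR_BGR2GRAY)
    gray_prev = cv2.cvtColor(crop_prev, cv2.COLOR_BGR2GRAY)
    return gray_t, cv2.absdiff(gray_t, gray_prev)

def train_classifiers(normal_crop_pairs, appearance_net, motion_net, k=10):
    """normal_crop_pairs: (current, previous) crops of normal behaviour only;
    *_net.encode(...) is assumed to return a 1-D feature vector."""
    feats = []
    for crop_t, crop_prev in normal_crop_pairs:
        gray, diff = frame_difference(crop_t, crop_prev)
        f_app = appearance_net.encode(gray)             # A4: appearance feature vector
        f_mot = motion_net.encode(diff)                 # A4: action (motion) feature vector
        feats.append(np.concatenate([f_app, f_mot]))    # A5: feature fusion
    feats = np.stack(feats)
    labels = KMeans(n_clusters=k).fit_predict(feats)    # A6: k clusters of normal behaviour
    classifiers = []
    for i in range(k):                                  # A7: one binary classifier per cluster;
        y = np.where(labels == i, 1, -1)                # other clusters act as pseudo-abnormal samples
        classifiers.append(LinearSVC().fit(feats, y))
    return classifiers
```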
Preferably, the appearance sub-network and the action sub-network both comprise an attention module and a memory module; wherein,
the calculation formula of the attention module is as follows:
$$c_t=\sum_{t'=1}^{T}\alpha_{t,t'}\,h_{t'}$$
$$\alpha_{t,t'}=\frac{\exp(u_{t,t'})}{\sum_{k=1}^{T}\exp(u_{t,k})}$$
$$u_{t,t'}=a(s_{t-1},h_{t'})$$
wherein $c_t$ represents the context vector at time t, T represents the total time length, $\alpha_{t,t'}$ represents the attention weight of the t'-th neighborhood at time t, $h_{t'}$ represents the hidden unit output at time t', $u_{t,t'}$ represents the output score of the t'-th neighborhood at time t, $u_{t,k}$ represents the output score of the k-th neighborhood at time t, $s_{t-1}$ represents the hidden state at time t-1, and a(·,·) is the scoring function;
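The formulas above describe a standard additive (score-then-softmax) attention over the hidden outputs of the recurrent layers; a compact PyTorch sketch with illustrative dimensions could look as follows.

```python
import torch
import torch.nn as nn

class AdditiveAttention(nn.Module):
    """Computes the context vector c_t from hidden outputs h_{t'} and state s_{t-1}."""
    def __init__(self, hidden_dim: int, state_dim: int, attn_dim: int = 128):
        super().__init__()
        self.w_h = nn.Linear(hidden_dim, attn_dim)   # projects the hidden outputs h_{t'}
        self.w_s = nn.Linear(state_dim, attn_dim)    # projects the previous state s_{t-1}
        self.v = nn.Linear(attn_dim, 1)              # scoring function a(., .)

    def forward(self, h: torch.Tensor, s_prev: torch.Tensor) -> torch.Tensor:
        # h: (T, hidden_dim), s_prev: (state_dim,)
        u = self.v(torch.tanh(self.w_h(h) + self.w_s(s_prev))).squeeze(-1)  # scores u_{t,t'}
        alpha = torch.softmax(u, dim=0)                                     # weights alpha_{t,t'}
        return (alpha.unsqueeze(-1) * h).sum(dim=0)                         # context vector c_t
```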
the memory storage module comprises M memory items p m M =1, \ 8230;, M, various prototype feature patterns for recording normal behavior data of pigs;
mapping for each query
Figure BDA0003783024140000043
By having corresponding weights to pairs
Figure BDA0003783024140000044
Memory term p of m Performing weighted average to read memory item and obtain characteristics
Figure BDA0003783024140000045
Figure BDA0003783024140000046
Figure BDA0003783024140000047
Wherein, the first and the second end of the pipe are connected with each other,
Figure BDA0003783024140000048
representing memory items p m′ Weight of (b), p m′ Represents the m' thA memory item;
the update formula of the memory term is as follows:
Figure BDA0003783024140000049
Figure BDA0003783024140000051
Figure BDA0003783024140000052
where ← denotes update operation, f denotes L2 norm, v t ′k,m Representing a match probability value
Figure BDA0003783024140000053
The reconstruction of (a) is performed,
Figure BDA0003783024140000054
and representing the query index set of the memory storage module.
Preferably, when the memory items are updated, if the weighted score $\varepsilon_t$ of the t-th frame image is larger than a preset threshold, the t-th frame image is regarded as an abnormal frame, and the abnormal frame is not used for updating the memory items;
the weighted score $\varepsilon_t$ is calculated by the following formulas:
$$\varepsilon_t=\sum_{i,j} W_{ij}(\hat{I}_t,I_t)\,\big\|\hat{I}_t^{\,ij}-I_t^{\,ij}\big\|_2$$
$$W_{ij}(\hat{I}_t,I_t)=\frac{1-\exp\!\big(-\|\hat{I}_t^{\,ij}-I_t^{\,ij}\|_2\big)}{\sum_{i,j}\Big(1-\exp\!\big(-\|\hat{I}_t^{\,ij}-I_t^{\,ij}\|_2\big)\Big)}$$
wherein $W_{ij}$ represents the weight value of the feature at spatial location (i, j), $\hat{I}_t$ represents the reconstructed feature in the neighborhood of time t, $I_t$ represents the feature at the t-th time, and i and j represent spatial indexes.
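A hedged sketch of the memory read, the memory update and the weighted-score gating described above is shown below; tensor shapes, the threshold value and the normalization details are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def read_memory(queries: torch.Tensor, memory: torch.Tensor) -> torch.Tensor:
    """queries: (K, C) query maps q_t^k; memory: (M, C) memory items p_m."""
    sim = F.cosine_similarity(queries.unsqueeze(1), memory.unsqueeze(0), dim=-1)  # (K, M)
    weights = torch.softmax(sim, dim=1)            # reading matching probabilities v_t^{k,m}
    return weights @ memory                        # weighted-average read features, (K, C)

def update_memory(queries, memory, weighted_score, threshold=0.01):
    """Skips the update when the frame's weighted score marks it as abnormal."""
    if weighted_score > threshold:                 # epsilon_t above the preset threshold
        return memory
    sim = F.cosine_similarity(memory.unsqueeze(1), queries.unsqueeze(0), dim=-1)  # (M, K)
    match = torch.softmax(sim, dim=1)              # write matching probabilities over queries
    nearest = sim.argmax(dim=0)                    # nearest memory item per query (defines U_t^m)
    new_memory = memory.clone()
    for m in range(memory.size(0)):
        idx = (nearest == m).nonzero(as_tuple=True)[0]
        if idx.numel() == 0:
            continue
        v = match[m, idx]
        v = v / v.max()                            # max-normalised weights v'_t^{k,m}
        new_memory[m] = F.normalize(memory[m] + (v.unsqueeze(-1) * queries[idx]).sum(0), dim=0)
    return new_memory
```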
Preferably, the loss function $\mathcal{L}$ of the automatic encoder is:
$$\mathcal{L}=\mathcal{L}_{rec}+\lambda_c\,\mathcal{L}_{compact}+\lambda_s\,\mathcal{L}_{separate}$$
wherein $\mathcal{L}_{rec}$ is the reconstruction error, $\mathcal{L}_{compact}$ is the feature compactness loss function, $\mathcal{L}_{separate}$ is the feature separateness loss function, and $\lambda_c$ and $\lambda_s$ are hyper-parameters.
Preferably,
the reconstruction error is:
$$\mathcal{L}_{rec}=\sum_{t=1}^{T}\big\|\hat{I}_t-I_t\big\|_2$$
the feature compactness loss function is:
$$\mathcal{L}_{compact}=\sum_{t=1}^{T}\sum_{k=1}^{K}\big\|q_t^k-p_p\big\|_2,\qquad p=\arg\max_{m\in\{1,\dots,M\}} v_t^{k,m}$$
the feature separateness loss function is:
$$\mathcal{L}_{separate}=\sum_{t=1}^{T}\sum_{k=1}^{K}\Big[\big\|q_t^k-p_p\big\|_2-\big\|q_t^k-p_n\big\|_2+\alpha\Big]_+,\qquad n=\arg\max_{m\in\{1,\dots,M\},\,m\neq p} v_t^{k,m}$$
where T represents the total time, t represents the time index, k represents the index of the query map, K represents the total number of query maps, $\hat{I}_t$ represents the reconstructed feature at time t, $I_t$ represents the feature at time t, $q_t^k$ represents a query map at time t, $p_p$ represents the memory item nearest to the query map $q_t^k$ and p is its index, $v_t^{k,m}$ represents the weight of the m-th memory item, m represents the index of a memory item, M represents the total number of memory items, $p_n$ represents the second-nearest memory item to the query map $q_t^k$, and α is a margin.
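A compact sketch of this training objective (reconstruction error plus feature compactness and feature separateness terms) is given below; the loss weights, the margin and the use of mean squared error for the reconstruction term are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def autoencoder_loss(recon, target, queries, memory,
                     lambda_compact=0.1, lambda_separate=0.1, margin=1.0):
    """recon/target: reconstructed and original inputs; queries: (K, C); memory: (M, C), M >= 2."""
    l_rec = F.mse_loss(recon, target)                      # reconstruction error

    dist = torch.cdist(queries, memory)                    # (K, M) L2 distances
    two_nearest, _ = dist.topk(2, dim=1, largest=False)    # nearest (p_p) and second-nearest (p_n)
    l_compact = two_nearest[:, 0].mean()                   # pull each query towards p_p

    # push the distances to p_p and p_n apart with a triplet-style margin
    l_separate = F.relu(two_nearest[:, 0] - two_nearest[:, 1] + margin).mean()

    return l_rec + lambda_compact * l_compact + lambda_separate * l_separate
```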
Compared with the prior art, the technical scheme of the invention has the beneficial effects that:
the invention provides a method for detecting abnormal behaviors of pigs, which is characterized in that target detection and cutting are carried out on each frame of image by adopting an improved Yolov5n network model to obtain a target screenshot of each pig in each frame of image, and all pigs can be effectively detected under the actual pig farm environment with complex shielding; and then constructing an end-to-end trainable double-current convolution automatic encoder network based on an object as a center, only training by adopting a video of the normal behavior of the pig, only paying attention to the pig object existing in the scene, not needing to manually extract image characteristics, accurately positioning the abnormality in each frame, and judging the size of the occurrence scale and the duration time of the abnormal behavior of the pig. Simultaneously, a memory module with the characteristics of learning and storing the prototype of the normal pig behavior and a memory updating strategy are provided; and then, solving the problem of detecting abnormal behaviors of the pigs by adopting an unsupervised two-classification method, clustering the characteristic vectors obtained by network learning of the automatic encoder, and using the obtained clustering result for training a classifier. The classification scores are obtained by classifying the target screenshots through a trained classifier, the abnormal score of each frame of image is further obtained, and finally the abnormal behavior detection of the pigs is realized according to the abnormal scores, so that the defect of training data of the actual abnormal behaviors is made up, and the abnormal behaviors of the pigs can be accurately identified.
Drawings
FIG. 1 is a flow chart of the steps for implementing the technical solution of the present invention;
FIG. 2 is a schematic structural diagram of an improved Yolov5n network model in the present invention;
FIG. 3 is a schematic flow chart of obtaining a frame difference map according to the present invention;
FIG. 4 is a schematic diagram of an object-centric pig abnormal behavior detection network according to the present invention;
FIG. 5 is a schematic diagram of a sub-network of the autoencoder network of the present invention;
FIG. 6 is a diagram illustrating the reading of memory items according to the present invention;
FIG. 7 is a diagram illustrating memory entry updating according to the present invention.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the patent;
for the purpose of better illustrating the embodiments, certain features of the drawings may be omitted, enlarged or reduced, and do not represent the size of an actual product;
it will be understood by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted.
The technical solution of the present invention is further described below with reference to the accompanying drawings and examples.
Example 1
As shown in fig. 1, a method for detecting abnormal behavior of pigs comprises the following steps:
s1: acquiring live videos of pigs in real time, and extracting images from the live videos of the pigs frame by frame;
s2: carrying out target detection and cutting on each extracted frame image by adopting an improved Yolov5n network model to respectively obtain a target screenshot of each pig in each frame image;
the improved Yolov5n network model is as follows: a channel attention module is added after the 4th, 6th and 8th layers of the backbone feature extraction network of the existing Yolov5n network model and is spliced with the upsampling layers at the 18th, 22nd and 26th layers of the neck network, and a C3 layer and a channel attention module are added after the 11th layer of the backbone feature extraction network;
s3: constructing an end-to-end trainable double-flow convolution automatic encoder network based on an object as a center, extracting appearance characteristic vectors and motion characteristic vectors of all pigs in a target screenshot, and performing characteristic fusion to form characteristic vectors of corresponding frames;
the double-current convolution automatic encoder network only adopts images of normal behaviors of pigs for training;
s4: clustering the fusion characteristic vectors by adopting a K-means clustering algorithm, and inputting the result into a binary classifier for training to obtain a trained classifier;
s5: in each frame of image, obtaining classification scores of all target screenshots in the current frame of image through a classifier, and combining all the classification scores to form an abnormal prediction image of the current frame of image;
s6: performing Gaussian filtering time sequence smoothing on the abnormal prediction image of the current frame image, and recording the obtained highest classification score as the abnormal score of the current frame image;
s7: judging whether the abnormal score of the current frame image is a positive number or not;
if yes, the pigs in the current frame image have no abnormal behaviors;
if not, the pigs in the current frame image have abnormal behaviors.
In the specific implementation process, the improved Yolov5n network model is adopted to carry out target detection and cutting on each frame of image to obtain a target screenshot of each pig in each frame of image, all pigs can be effectively detected under the actual pig farm environment with complex shielding, then the target screenshot is input into a double-current convolution automatic encoder network to extract the appearance and motion characteristic vectors of each pig in the target screenshot, the fused characteristic vectors are clustered, and the obtained clustering result is used for training the classifier. The classification scores are obtained by classifying the target screenshots through the trained classifier, the abnormal scores of all frames of images are further obtained, and finally the abnormal behavior detection of the pigs is realized according to the abnormal scores, so that the lack of training data of the actual abnormal behaviors is made up, and the abnormal behaviors of the pigs can be accurately identified.
Example 2
A method for detecting abnormal behaviors of pigs comprises the following steps:
s1: acquiring live videos of pigs in real time, and extracting images from the live videos of the pigs frame by frame;
s2: carrying out target detection and cutting on each extracted frame image by adopting an improved Yolov5n network model to respectively obtain a target screenshot of each pig in each frame image;
as shown in fig. 2, the improved Yolov5n network model is: a channel attention module is added after the 4th, 6th and 8th layers of the backbone feature extraction network of the existing Yolov5n network model and is spliced with the upsampling layers at the 18th, 22nd and 26th layers of the neck network, and a C3 layer and a channel attention module are added after the 11th layer of the backbone feature extraction network;
in practical implementation, aiming at the problems that the boundary frame positioning is not accurate enough, so that overlapped objects are difficult to distinguish, the robustness is poor and the like in the existing Yolov5n network model, the embodiment adds an attention module of an SE-Net channel in a Backbone feature extraction network backhaul for improvement, establishes feature mapping in the interaction between convolution network channels, enables the network model to automatically learn global feature information and highlight useful feature information, and inhibits other less important feature information at the same time, so that the network model is more focused on the purpose of training a shielding object.
More specifically, the channel attention module comprises compression, excitation and scaling operations; wherein,
the compression operation is: compressing the dimension H × W × C of the original feature layer to 1 × 1 × C using global average pooling;
the excitation operation is: fusing the feature map information of each feature channel by using two fully connected layers, and then normalizing the weights with a Sigmoid function;
the scaling operation is: mapping the weights output by the excitation operation into a set of per-channel weights, and multiplying them with the features of the original feature map, thereby recalibrating the original features in the channel dimension.
More specifically, the improvement of the Yolov5n network model further comprises adding a 64-fold down-sampling detection layer, so that the scale of the output feature map is 20 × 20.
In the specific implementation process, on the basis of the original 3 detection layers of different scales (40 × 40, 80 × 80 and 160 × 160) of the Yolov5n network model, a detection layer with an ultra-small scale (20 × 20) is added; that is, after the 8-fold, 16-fold and 32-fold down-sampling, a 64-fold down-sampling detection layer is added to obtain a detection-layer feature map at the 20 × 20 scale. This further deepens the network, enables the network model to extract higher-level and richer semantic information, strengthens the multi-scale learning ability of the model in complex scenes, and improves the detection performance of the model.
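For reference, the detection-layer scales quoted above relate to the network input size by simple division; the 1280 × 1280 input resolution used in the snippet below is inferred from those scales and is an assumption, not a value stated in this paragraph.

```python
# Relationship between downsampling factor and detection-layer scale (input size assumed).
input_size = 1280
for stride in (8, 16, 32, 64):
    side = input_size // stride
    print(f"{stride}-fold downsampling -> {side} x {side} feature map")
# 8 -> 160 x 160, 16 -> 80 x 80, 32 -> 40 x 40, 64 -> 20 x 20 (the added ultra-small-scale layer)
```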
S3: constructing an end-to-end trainable double-flow convolution automatic encoder network based on an object as a center, extracting appearance characteristic vectors and motion characteristic vectors of all pigs in a target screenshot, and performing characteristic fusion to form characteristic vectors of corresponding frames;
the double-current convolution automatic encoder network only adopts images of normal behaviors of pigs for training;
s4: clustering the fusion characteristic vectors by adopting a K-means clustering algorithm, and inputting the result into a binary classifier for training to obtain a trained classifier;
s5: in each frame of image, obtaining classification scores of all target screenshots in the current frame of image through a classifier, and combining all the classification scores to form an abnormal prediction image of the current frame of image;
more specifically, in step S5, a target screenshot is selected from the current frame image, the feature vectors of the selected target screenshot are extracted and clustered into k clusters through step S3, then the clustering results are respectively input into k classifiers to obtain k classification scores, the highest classification score is selected as the abnormal score of the selected target screenshot, and the steps are repeated until the abnormal classification scores of all target screenshots in the current frame image are obtained.
More specifically, the classifier is a binary classifier, and the ith binary classifier is defined as follows:
$$g_i(x)=\sum_{j=1}^{m} w_j\,x_j + b$$
wherein $w_j$ represents the weight vector, b represents the bias value, x represents a sample input to the binary classifier, which can be classified as a normal sample or an abnormal sample, $x_j$ represents the j-th element of the sample, and m represents the dimension of x.
More specifically, k binary classifiers are trained by:
a1: selecting images of normal pig behaviors from the live videos of the pigs as training images;
in practical implementation, the image containing a plurality of swineries is subjected to mask processing, a mask layer is added on the original image by taking the check swinery as a boundary, pigs of other columns are covered, a training data set of multiple scenes (different pigsties, different numbers of pigs, different shielding degrees, different illumination, different shapes of pigs and the like) is constructed according to the mask layer, and the pigs on the image are manually marked;
a2: carrying out target detection and cutting on the training image by adopting an improved Yolov5n network model to respectively obtain target screenshots of all pigs in the training image;
in practical implementation, because the default of the preset anchor frame of the existing Yolov5n network model is mainly for the coco data set (microsoft provides a public data set), which is completely different from the length-width ratio of the label frame of the training data set in the embodiment (the maximum length-width ratio of the label frame of the coco data set reaches 1.
A3: converting the target screenshot into a gray image, and subtracting the pixel value of the target screenshot from the pixel value of an adjacent frame image of the training image to obtain a corresponding frame difference image, as shown in fig. 3;
a4: respectively taking the gray frame image and the gray frame difference image obtained in the step A3 as the input of an appearance sub-network and an action sub-network in the convolution automatic encoder network for abnormal behaviors of the pigs taking the object as the center, and extracting appearance characteristic vectors and action characteristic vectors through the network, as shown in a figure 4;
the auto-encoder network comprises an appearance sub-network for extracting appearance feature vectors from the target screenshot and an action sub-network for extracting action feature vectors from the frame difference image;
more specifically, the appearance sub-network and the action sub-network both comprise an attention module and a memory module; wherein,
the calculation formula of the attention module is as follows:
$$c_t=\sum_{t'=1}^{T}\alpha_{t,t'}\,h_{t'}$$
$$\alpha_{t,t'}=\frac{\exp(u_{t,t'})}{\sum_{k=1}^{T}\exp(u_{t,k})}$$
$$u_{t,t'}=a(s_{t-1},h_{t'})$$
wherein $c_t$ represents the context vector at time t, T represents the total time length, $\alpha_{t,t'}$ represents the attention weight of the t'-th neighborhood at time t, $h_{t'}$ represents the hidden unit output at time t', $u_{t,t'}$ represents the output score of the t'-th neighborhood at time t, $u_{t,k}$ represents the output score of the k-th neighborhood at time t, $s_{t-1}$ represents the hidden state at time t-1, and a(·,·) is the scoring function;
in a specific implementation process, the automatic encoder network has a dual-flow structure composed of two sub-networks, namely an appearance sub-network and an action sub-network, wherein the two sub-networks include an encoder, a memory storage module and a decoder, and the encoder and the decoder include a space convolution layer, three convolution LSTM layers (ConvLSTM), three attention modules and two maximum pooling layers (MaxPool), as shown in fig. 5.
By constructing an end-to-end trainable object-centric double-flow convolution automatic encoder network, only the pig objects present in the scene are attended to and image features do not need to be extracted manually; at the same time, the anomaly in each frame can be accurately located and the scale and duration of the abnormal pig behavior can be judged, giving the method the technical advantages of time saving, high efficiency, high accuracy and high robustness.
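A structural sketch of one such sub-network encoder is given below. It shows one spatial convolution, one ConvLSTM layer and one max-pooling stage of the pattern described above (the full encoder stacks three ConvLSTM layers with attention modules and two pooling layers); the minimal ConvLSTM cell, channel counts and kernel sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    """Minimal convolutional LSTM cell (gate layout assumed)."""
    def __init__(self, in_ch: int, hid_ch: int, k: int = 3):
        super().__init__()
        self.hid_ch = hid_ch
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, k, padding=k // 2)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = self.gates(torch.cat([x, h], dim=1)).chunk(4, dim=1)
        i, f, o, g = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o), torch.tanh(g)
        c = f * c + i * g
        h = o * torch.tanh(c)
        return h, c

class SubNetworkEncoder(nn.Module):
    """Encodes a short sequence of grayscale crops (or frame-difference maps)."""
    def __init__(self, hid_ch: int = 32):
        super().__init__()
        self.conv = nn.Conv2d(1, hid_ch, 3, padding=1)   # spatial convolution layer
        self.lstm = ConvLSTMCell(hid_ch, hid_ch)         # one of the three ConvLSTM layers
        self.pool = nn.MaxPool2d(2)                      # one of the two max-pooling layers

    def forward(self, seq: torch.Tensor) -> torch.Tensor:
        # seq: (T, 1, H, W) sequence of single-channel inputs
        T, _, H, W = seq.shape
        h = seq.new_zeros(1, self.lstm.hid_ch, H, W)
        c = torch.zeros_like(h)
        for t in range(T):                               # recurrent pass over the sequence
            x = torch.relu(self.conv(seq[t:t + 1]))
            h, c = self.lstm(x, (h, c))                  # an attention module could reweight h here
        return self.pool(h)                              # pooled feature map of the last step
```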
The memory module comprises M memory items $p_m$, m = 1, …, M, which record various prototype feature patterns of normal pig behavior data.
As shown in fig. 6 (where C denotes computing the cosine similarity of the two inputs, S denotes the softmax function, and W denotes the weighted average), a memory item is read by computing the cosine similarity between each query map $q_t^k$ and all memory items $p_m$, which yields a two-dimensional correlation map of size M × K; a softmax function is then applied along the vertical direction to obtain the reading matching probability $v_t^{k,m}$:
$$v_t^{k,m}=\frac{\exp\!\big(d(p_m,q_t^k)\big)}{\sum_{m'=1}^{M}\exp\!\big(d(p_{m'},q_t^k)\big)}$$
For each query map $q_t^k$, the memory items $p_m$ are then weighted-averaged with the corresponding weights $v_t^{k,m'}$ to read the memory and obtain the feature $\hat{p}_t^k$:
$$\hat{p}_t^k=\sum_{m'=1}^{M}v_t^{k,m'}\,p_{m'}$$
wherein $v_t^{k,m'}$ represents the weight of memory item $p_{m'}$, $p_{m'}$ represents the m'-th memory item, and d(·,·) denotes cosine similarity;
As shown in fig. 7 (where C denotes computing the cosine similarity of the two inputs, S denotes the softmax function, W denotes the weighted average, and n denotes maximum normalization), for the update operation, for each memory item $p_m$ the write matching probability $v'^{\,k,m}_t$ is computed, the query maps nearest to $p_m$ are selected, and the resulting query index set $U_t^m$ is used to update the memory item, where the update formula of the memory items is as follows:
$$p_m\leftarrow f\!\left(p_m+\sum_{k\in U_t^m} v'^{\,k,m}_t\,q_t^k\right)$$
where ← denotes the update operation, f denotes L2 normalization, and $v'^{\,k,m}_t$ represents the re-normalized write matching probability value.
A5: fusing the appearance characteristic vector and the action characteristic vector to obtain a fusion characteristic vector of the training image;
a6: performing k-means clustering on the fusion characteristic vectors to obtain a clustering result cluster i, i =1,2,.., k;
a7: and inputting the clustering result into k binary classifiers to obtain k trained binary classifiers.
In the specific implementation process, a context is constructed through K-means clustering in which one subset of the normal samples acts as pseudo-abnormal samples relative to another subset, thereby solving the problem of lacking real abnormal samples. K-means clustering groups the normal samples into k clusters, each cluster representing some normal pig behavior different from the behaviors represented by the other clusters; that is, from the perspective of a given cluster i, the samples belonging to the other clusters (those in {1, 2, …, k} \ {i}) can be regarded as abnormal samples.
S6: performing Gaussian filtering time sequence smoothing on the abnormal prediction image of the current frame image, and recording the obtained highest classification score as the abnormal score of the current frame image;
s7: judging whether the abnormal score of the current frame image is a positive number;
if yes, the pigs in the current frame image have no abnormal behaviors;
if not, the pigs in the current frame image have abnormal behaviors.
Example 3
A method for detecting abnormal behaviors of pigs comprises the following steps:
s1: acquiring live videos of pigs in real time, and extracting images from the live videos of the pigs frame by frame;
s2: performing target detection and cutting on each extracted frame image by adopting an improved Yolov5n network model to respectively obtain a target screenshot of each pig in each frame image;
the improved Yolov5n network model is as follows: a channel attention module is added after the 4th, 6th and 8th layers of the backbone feature extraction network of the existing Yolov5n network model and is spliced with the upsampling layers at the 18th, 22nd and 26th layers of the neck network, and a C3 layer and a channel attention module are added after the 11th layer of the backbone feature extraction network;
more specifically, the channel attention module comprises compression, excitation and scaling operations; wherein,
the compression operation is: compressing the dimension H × W × C of the original feature layer to 1 × 1 × C using global average pooling;
the excitation operation is: fusing the feature map information of each feature channel by using two fully connected layers, and then normalizing the weights with a Sigmoid function;
the scaling operation is: mapping the weights output by the excitation operation into a set of per-channel weights, and multiplying them with the features of the original feature map, thereby recalibrating the original features in the channel dimension.
More specifically, the improvement of the Yolov5n network model further comprises adding a 64-fold down-sampling detection layer, so that the scale of the output feature map is 20 × 20.
S3: constructing an end-to-end trainable double-flow convolution automatic encoder network based on an object as a center, extracting appearance characteristic vectors and motion characteristic vectors of all pigs in a target screenshot, and performing characteristic fusion to form characteristic vectors of corresponding frames;
the double-current convolution automatic encoder network only adopts images of normal behaviors of pigs for training;
s4: clustering the fusion feature vectors by adopting a K-means clustering algorithm, and inputting the result into a binary classifier for training to obtain a trained classifier;
s5: in each frame of image, obtaining classification scores of all target screenshots in the current frame image through the classifier, and combining all the classification scores to form an abnormal prediction image of the current frame image;
more specifically, in step S5, a target screenshot is selected from the current frame image, the feature vectors of the selected target screenshot are extracted and clustered into k clusters through step S3, then the clustering results are respectively input into k classifiers to obtain k classification scores, the highest classification score is selected as the abnormal score of the selected target screenshot, and the step is repeated until the abnormal classification scores of all target screenshots in the current frame image are obtained.
More specifically, the classifier is a binary classifier, and the ith binary classifier is defined as follows:
$$g_i(x)=\sum_{j=1}^{m} w_j\,x_j + b$$
wherein $w_j$ represents the weight vector, b represents the bias value, x represents a sample input to the binary classifier, which can be classified as a normal sample or an abnormal sample, $x_j$ represents the j-th element of the sample, and m represents the dimension of x.
More specifically, k binary classifiers are trained by:
a1: selecting images of normal pig behaviors from the live videos of the pigs as training images;
a2: carrying out target detection and cutting on the training image by adopting an improved Yolov5n network model to respectively obtain target screenshots of all pigs in the training image;
a3: converting the target screenshot into a gray image, and subtracting the pixel value of the gray image from the adjacent frame image of the training image to obtain a corresponding frame difference image;
a4: respectively taking the gray frame image and the gray frame difference image obtained in the step A3 as the input of the appearance sub-network and the action sub-network in the object-centric convolution automatic encoder network for abnormal pig behaviors, and extracting the appearance feature vector and the action feature vector of each pig in the target screenshot through the network;
the auto-encoder network comprises an appearance sub-network for extracting appearance feature vectors from the target screenshots and an action sub-network for extracting action feature vectors from the frame difference images;
more specifically, the appearance sub-network and the action sub-network both comprise an attention module and a memory module; wherein,
the calculation formula of the attention module is as follows:
$$c_t=\sum_{t'=1}^{T}\alpha_{t,t'}\,h_{t'}$$
$$\alpha_{t,t'}=\frac{\exp(u_{t,t'})}{\sum_{k=1}^{T}\exp(u_{t,k})}$$
$$u_{t,t'}=a(s_{t-1},h_{t'})$$
wherein $c_t$ represents the context vector at time t, T represents the total time length, $\alpha_{t,t'}$ represents the attention weight of the t'-th neighborhood at time t, $h_{t'}$ represents the hidden unit output at time t', $u_{t,t'}$ represents the output score of the t'-th neighborhood at time t, $u_{t,k}$ represents the output score of the k-th neighborhood at time t, $s_{t-1}$ represents the hidden state at time t-1, and a(·,·) is the scoring function;
the memory storage module comprises M memory items p m M =1, \ 8230;, M, various prototype feature patterns for recording normal behavior data of pigs;
mapping for each query
Figure BDA0003783024140000143
By passingFor having corresponding weight
Figure BDA0003783024140000144
Memory term p of m Performing weighted average to read memory item and obtain characteristics
Figure BDA0003783024140000145
Figure BDA0003783024140000146
Figure BDA0003783024140000147
Wherein, the first and the second end of the pipe are connected with each other,
Figure BDA0003783024140000148
representing memory items p m′ Weight of p m′ Represents the m' th memory item;
the update formula of the memory term is as follows:
Figure BDA0003783024140000149
Figure BDA00037830241400001410
Figure BDA00037830241400001411
where ← denotes update operation, f denotes L2 norm, v t ′k,m Representing the probability value of a match
Figure BDA00037830241400001412
The reconstruction of (a) is performed,
Figure BDA00037830241400001413
representing the query index set of the memory storage module.
More specifically, when the memory items are updated, if the weighted score $\varepsilon_t$ of the t-th frame image is larger than a preset threshold, the t-th frame image is regarded as an abnormal frame, and the abnormal frame is not used for updating the memory items;
the weighted score $\varepsilon_t$ is calculated by the following formulas:
$$\varepsilon_t=\sum_{i,j} W_{ij}(\hat{I}_t,I_t)\,\big\|\hat{I}_t^{\,ij}-I_t^{\,ij}\big\|_2$$
$$W_{ij}(\hat{I}_t,I_t)=\frac{1-\exp\!\big(-\|\hat{I}_t^{\,ij}-I_t^{\,ij}\|_2\big)}{\sum_{i,j}\Big(1-\exp\!\big(-\|\hat{I}_t^{\,ij}-I_t^{\,ij}\|_2\big)\Big)}$$
wherein $W_{ij}$ represents the weight value of the feature at spatial location (i, j), $\hat{I}_t$ represents the reconstructed feature in the neighborhood of time t, $I_t$ represents the feature at the t-th time, and i and j represent spatial indexes.
In the specific implementation process, when normal samples and abnormal samples exist simultaneously, in order to prevent the memory from recording the features of abnormal pig samples, the abnormality of each video frame is measured with the weighted score, and the memory items are updated only when the frame is determined to be normal.
More particularly, the loss function $\mathcal{L}$ of the automatic encoder is:
$$\mathcal{L}=\mathcal{L}_{rec}+\lambda_c\,\mathcal{L}_{compact}+\lambda_s\,\mathcal{L}_{separate}$$
wherein $\mathcal{L}_{rec}$ is the reconstruction error, $\mathcal{L}_{compact}$ is the feature compactness loss function, $\mathcal{L}_{separate}$ is the feature separateness loss function, and $\lambda_c$ and $\lambda_s$ are hyper-parameters.
More specifically,
the reconstruction error is:
$$\mathcal{L}_{rec}=\sum_{t=1}^{T}\big\|\hat{I}_t-I_t\big\|_2$$
the feature compactness loss function is:
$$\mathcal{L}_{compact}=\sum_{t=1}^{T}\sum_{k=1}^{K}\big\|q_t^k-p_p\big\|_2,\qquad p=\arg\max_{m\in\{1,\dots,M\}} v_t^{k,m}$$
the feature separateness loss function is:
$$\mathcal{L}_{separate}=\sum_{t=1}^{T}\sum_{k=1}^{K}\Big[\big\|q_t^k-p_p\big\|_2-\big\|q_t^k-p_n\big\|_2+\alpha\Big]_+,\qquad n=\arg\max_{m\in\{1,\dots,M\},\,m\neq p} v_t^{k,m}$$
where T represents the total time, t represents the time index, k represents the index of the query map, K represents the total number of query maps, $\hat{I}_t$ represents the reconstructed feature at time t, $I_t$ represents the feature at time t, $q_t^k$ represents a query map at time t, $p_p$ represents the memory item nearest to the query map $q_t^k$ and p is its index, $v_t^{k,m}$ represents the weight of the m-th memory item, m represents the index of a memory item, M represents the total number of memory items, $p_n$ represents the second-nearest memory item to the query map $q_t^k$, and α is a margin.
In a specific implementation process, the memory module of the automatic encoder is trained through the feature compactness loss function and the feature separateness loss function, so that the diversity and discrimination of the memory items are ensured.
A5: fusing the appearance characteristic vector and the action characteristic vector to obtain a fused characteristic vector of the training image;
a6: performing k-means clustering on the fusion characteristic vectors to obtain clustering results cluster_i, i = 1, 2, …, k;
a7: and inputting the clustering result into k binary classifiers to obtain k trained binary classifiers.
S6: performing Gaussian filtering time sequence smoothing on the abnormal prediction image of the current frame image, and recording the obtained highest classification score as the abnormal score of the current frame image;
s7: judging whether the abnormal score of the current frame image is a positive number or not;
if yes, the pigs in the current frame image have no abnormal behaviors;
if not, the pigs in the current frame image have abnormal behaviors.
It should be understood that the above-described embodiments of the present invention are merely examples for clearly illustrating the present invention and are not intended to limit the embodiments of the present invention. Other variations and modifications will be apparent to persons skilled in the art in light of the above description; it is neither necessary nor possible to exhaustively list all embodiments here. Any modification, equivalent replacement or improvement made within the spirit and principle of the present invention shall be included in the protection scope of the claims of the present invention.

Claims (10)

1. A method for detecting abnormal behaviors of pigs is characterized by comprising the following steps:
s1: acquiring live videos of pigs in real time, and extracting images from the live videos of the pigs frame by frame;
s2: carrying out target detection and cutting on each extracted frame image by adopting an improved Yolov5n network model to respectively obtain a target screenshot of each pig in each frame image;
the improved Yolov5n network model is as follows: a channel attention module is added after the 4th, 6th and 8th layers of the backbone feature extraction network of the existing Yolov5n network model and is spliced with the upsampling layers at the 18th, 22nd and 26th layers of the neck network, and a C3 layer and a channel attention module are added after the 11th layer of the backbone feature extraction network;
s3: constructing an end-to-end trainable double-flow convolution automatic encoder network based on an object as a center, extracting appearance characteristic vectors and motion characteristic vectors of all pigs in a target screenshot, and performing characteristic fusion to form characteristic vectors of corresponding frames;
the double-current convolution automatic encoder network only adopts images of normal behaviors of pigs for training;
s4: clustering the fusion characteristic vectors by adopting a K-means clustering algorithm, and inputting the result into a binary classifier for training to obtain a trained classifier;
s5: in each frame of image, obtaining classification scores of all target screenshots in the current frame image through the classifier, and combining all the classification scores to form an abnormal prediction image of the current frame image;
s6: performing Gaussian filtering time sequence smoothing on the abnormal prediction image of the current frame image, and recording the obtained highest classification score as the abnormal score of the current frame image;
s7: judging whether the abnormal score of the current frame image is a positive number or not;
if yes, the pigs in the current frame image have no abnormal behaviors;
if not, the pigs in the current frame image have abnormal behaviors.
2. The method of claim 1, wherein the channel attention module comprises compression, excitation and scaling operations; wherein,
the compression operation is: compressing the dimension H × W × C of the original feature layer to 1 × 1 × C using global average pooling;
the excitation operation is: fusing the feature map information of each feature channel by using two fully connected layers, and then normalizing the weights with a Sigmoid function;
the scaling operation is: mapping the weights output by the excitation operation into a set of per-channel weights, and multiplying them with the features of the original feature map, thereby recalibrating the original features in the channel dimension.
3. The method of claim 1, wherein the improvement of the Yolov5n network model further comprises adding a 64-fold down-sampling detection layer to make the feature map scale of the output 20 × 20.
4. The method of claim 1, wherein in step S5, a target screenshot is selected from the current frame image, the feature vectors of the selected target screenshot are extracted and clustered into k clusters through step S3, then the clustering results are respectively input into k classifiers to obtain k classification scores, the highest classification score is selected as the abnormal score of the selected target screenshot, and the steps are repeated until the abnormal classification scores of all target screenshots in the current frame image are obtained.
5. The method of claim 4, wherein the classifier is a binary classifier, and the ith binary classifier is defined as follows:
$$g_i(x)=\sum_{j=1}^{m} w_j\,x_j + b$$
wherein $w_j$ represents the weight vector, b represents the bias value, x represents a sample input to the binary classifier, which can be classified as a normal sample or an abnormal sample, $x_j$ represents the j-th element of the sample, and m represents the dimension of x.
6. The method of claim 5, wherein k binary classifiers are trained by:
a1: selecting images of normal pig behaviors from the live videos of the pigs as training images;
a2: carrying out target detection and cutting on the training image by adopting an improved Yolov5n network model to respectively obtain target screenshots of all pigs in the training image;
a3: converting the target screenshot into a gray image, and subtracting the pixel value of the gray image from the adjacent frame image of the training image to obtain a corresponding frame difference image;
a4: b, respectively taking the gray frame image and the gray frame difference image obtained in the step A3 as the input of an appearance sub-network and an action sub-network in the convolution automatic encoder network for abnormal behaviors of the pigs taking the object as the center, and extracting the appearance feature vector and the action feature vector of each pig in the target screenshot through the network;
the auto-encoder network comprises a look sub-network for extracting look feature vectors from the target screenshots and an action sub-network for extracting action feature vectors from the frame difference images;
a5: fusing the appearance characteristic vector and the action characteristic vector to obtain a fused characteristic vector of the training image;
a6: performing k-means clustering on the fusion characteristic vectors to obtain a clustering result cluster i, i =1, 2.. Times, k;
a7: and inputting the clustering result into k binary classifiers to obtain k trained binary classifiers.
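A compressed illustrative sketch of steps A6 and A7, assuming the fused feature vectors of the normal-behavior screenshots have already been extracted (steps A1 to A5) and assuming, since the claim does not spell it out, that each binary classifier is trained one-versus-rest with its own cluster as the positive class; scikit-learn is used here purely for convenience.

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.svm import LinearSVC

    def train_cluster_classifiers(fused_features, k=10):
        """fused_features: (N, D) array of fused appearance+action vectors
        extracted from normal-behaviour screenshots (steps A1-A5).
        Returns the k-means model and k trained binary classifiers (steps A6-A7)."""
        kmeans = KMeans(n_clusters=k, n_init=10, random_state=0).fit(fused_features)
        classifiers = []
        for i in range(k):
            labels = (kmeans.labels_ == i).astype(int)   # cluster i vs. the other clusters
            clf = LinearSVC(C=1.0).fit(fused_features, labels)
            classifiers.append(clf)
        return kmeans, classifiers

    # Example with random stand-in features (the real ones come from the auto-encoder).
    km, clfs = train_cluster_classifiers(np.random.randn(200, 64), k=4)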
7. The method of claim 6, wherein the object-centric dual-stream convolutional auto-encoder network trained on motion and appearance comprises an appearance sub-network and an action sub-network, both sub-networks comprising an attention-based convolutional LSTM module and a memory module; wherein:
the calculation formula of the attention module is as follows:
c_t = Σ_{t'=1}^{T} α_{t,t'} · h_{t'}
α_{t,t'} = exp(u_{t,t'}) / Σ_{k=1}^{T} exp(u_{t,k})
u_{t,t'} = a(s_{t-1}, h_{t'})
wherein c_t represents the context vector at time t, T represents the total time length, α_{t,t'} represents the attention weight of time t' with respect to time t, h_{t'} represents the hidden-unit output at time t', a represents the score function, u_{t,t'} represents the output score of time t' with respect to time t, u_{t,k} represents the output score of time k with respect to time t, and s_{t-1} represents the hidden state at time t-1;
the memory module comprises M memory items p_m, m = 1, ..., M, which record various prototype feature patterns of normal pig behavior data;
for each query map q_t^k, the memory items p_m are read by a weighted average with the corresponding weights, giving the read feature p̂_t^k:
p̂_t^k = Σ_{m'=1}^{M} w_t^{(k,m')} · p_{m'}
w_t^{(k,m)} = exp((p_m)^T · q_t^k) / Σ_{m'=1}^{M} exp((p_{m'})^T · q_t^k)
wherein w_t^{(k,m')} represents the weight of memory item p_{m'}, and p_{m'} represents the m'-th memory item;
the update formula of the memory items is as follows:
p_m ← f( p_m + Σ_{k∈U_t^m} v'_t^{(k,m)} · q_t^k )
wherein ← denotes the update operation, f denotes the L2 norm, v'_t^{(k,m)} represents the matching probability value, a renormalized form of the matching probability v_t^{(k,m)} of query map q_t^k, and U_t^m represents the query index set of the memory module.
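An illustrative NumPy sketch of the memory read and update described above; the dot-product similarity, the softmax renormalization of the matching probabilities over the assigned queries, and the array shapes are simplifying assumptions rather than the exact claimed formulas.

    import numpy as np

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    def read_memory(query, memory):
        """query: (D,) query map q_t^k; memory: (M, D) memory items p_m.
        Weights are a softmax over the query-item similarities; the read feature
        is the weighted average of the memory items."""
        w = softmax(memory @ query)          # one weight per memory item
        return w @ memory, w                 # read feature and its weights

    def update_memory(memory, queries):
        """queries: (K, D) query maps of a normal frame.
        Each memory item is moved towards the queries whose nearest item it is,
        then L2-normalised (the function f in the claim)."""
        nearest = np.array([np.argmax(softmax(memory @ q)) for q in queries])
        for m in range(memory.shape[0]):
            idx = np.where(nearest == m)[0]              # queries assigned to item m
            if idx.size:
                v = softmax(np.array([memory[m] @ queries[i] for i in idx]))
                memory[m] += v @ queries[idx]            # weighted sum of matched queries
            memory[m] /= np.linalg.norm(memory[m])       # L2 normalisation
        return memory

    # Example with made-up memory items and query maps.
    mem = update_memory(np.random.randn(10, 64), np.random.randn(5, 64))
    feat, weights = read_memory(np.random.randn(64), mem)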
8. The method of claim 7, wherein, when updating the memory items, if the weighted score ε_t of the t-th frame is larger than a preset threshold, the t-th frame image is regarded as an abnormal frame, and the abnormal frame is not used to update the memory items;
the weighted score ε_t is calculated by the following formulas:
W_{ij}(Î_t, I_t) = (1 − exp(−||Î_t^{ij} − I_t^{ij}||_2)) / Σ_{i,j} (1 − exp(−||Î_t^{ij} − I_t^{ij}||_2))
ε_t = Σ_{i,j} W_{ij}(Î_t, I_t) · ||Î_t^{ij} − I_t^{ij}||_2
wherein W_{ij} represents the weight value of the image feature at spatial location (i, j), Î_t represents the reconstructed feature at time t, I_t represents the feature at time t, and i and j represent spatial indexes.
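A simplified NumPy sketch of the weighted score ε_t above; the absolute-value error on a 2-D feature map and the small constant added before normalization are simplifying assumptions.

    import numpy as np

    def weighted_score(recon, frame):
        """recon, frame: (H, W) reconstructed and input features at time t.
        Per-location weights grow with the reconstruction error and are normalised
        to sum to one; the weighted sum of the errors is the score epsilon_t."""
        err = np.abs(recon - frame)              # per-location reconstruction error
        w = 1.0 - np.exp(-err)                   # weight of each spatial location (i, j)
        w /= (w.sum() + 1e-8)
        return float((w * err).sum())

    eps_t = weighted_score(np.random.rand(8, 8), np.random.rand(8, 8))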
9. The method of claim 6, wherein the loss function L of the object-centric convolutional auto-encoder for pig abnormal behavior is:
L = L_rec + λ_c · L_compact + λ_s · L_separate
wherein L_rec is the reconstruction error, L_compact is the feature compactness loss function, L_separate is the feature separateness loss function, and λ_c and λ_s are hyper-parameters.
10. The method of claim 9, wherein
the reconstruction error is:
L_rec = Σ_{t=1}^{T} ||Î_t − I_t||_2
the feature compactness loss function is:
L_compact = Σ_{t=1}^{T} Σ_{k=1}^{K} ||q_t^k − p_p||_2, with p = argmax_{m∈[1,M]} w_t^{(k,m)}
the feature separateness loss function is:
L_separate = Σ_{t=1}^{T} Σ_{k=1}^{K} max( ||q_t^k − p_p||_2 − ||q_t^k − p_n||_2 + α, 0 ), with n = argmax_{m∈[1,M], m≠p} w_t^{(k,m)}
wherein T represents the total time, t represents the time index, k represents the index of the query map, K represents the total number of query maps, Î_t represents the reconstructed feature at time t, I_t represents the feature at time t, p_p represents the memory item nearest to the query map q_t^k, p is the index of that nearest item, w_t^{(k,m)} represents the weight of the m-th memory item, m represents the index of the memory item, M represents the total number of memory items, p_n represents the memory item second nearest to the query map q_t^k, and α is a margin.
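An illustrative NumPy sketch of the feature compactness and separateness losses of claim 10; picking the nearest and second-nearest memory items by Euclidean distance (instead of the weight w_t^(k,m)) and the margin value are simplifying assumptions.

    import numpy as np

    def compact_and_separate_losses(queries, memory, margin=1.0):
        """queries: (K, D) query maps q_t^k of one frame; memory: (M, D) items p_m.
        The compactness loss pulls each query towards its nearest memory item;
        the separateness loss pushes it away from the second-nearest item by a margin."""
        compact, separate = 0.0, 0.0
        for q in queries:
            d = np.linalg.norm(memory - q, axis=1)       # distance to every memory item
            p, n = np.argsort(d)[:2]                     # nearest and second-nearest items
            compact += d[p]
            separate += max(d[p] - d[n] + margin, 0.0)   # hinge on the distance gap
        return compact, separate

    lc, ls = compact_and_separate_losses(np.random.randn(5, 64), np.random.randn(10, 64))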
CN202210934696.XA 2022-08-04 2022-08-04 Pig abnormal behavior detection method Pending CN115359511A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210934696.XA CN115359511A (en) 2022-08-04 2022-08-04 Pig abnormal behavior detection method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210934696.XA CN115359511A (en) 2022-08-04 2022-08-04 Pig abnormal behavior detection method

Publications (1)

Publication Number Publication Date
CN115359511A true CN115359511A (en) 2022-11-18

Family

ID=84033479

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210934696.XA Pending CN115359511A (en) 2022-08-04 2022-08-04 Pig abnormal behavior detection method

Country Status (1)

Country Link
CN (1) CN115359511A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102646871B1 (en) * 2023-01-31 2024-03-13 한국축산데이터 주식회사 Apparatus and method for detecting environmental change using barn monitoring camera



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination