GB2616316A - Neural network training technique - Google Patents
Neural network training technique
- Publication number
- GB2616316A (application GB2204314.5A / GB202204314A)
- Authority
- GB
- United Kingdom
- Prior art keywords
- neural network
- dataset
- image
- processor
- output
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/096—Transfer learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/09—Supervised learning
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Molecular Biology (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Image Analysis (AREA)
- Train Traffic Observation, Control, And Security (AREA)
Abstract
Apparatuses, systems, and techniques to train a neural network to infer a condition based on an image. In at least one embodiment, a first portion of a neural network is trained to infer a condition from an image using a first dataset, and a second portion of the neural network is trained using a second dataset.
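The two-portion training scheme of the abstract (and of claims 3, 11, and 19) can be illustrated with a minimal NumPy sketch: two separate branches project image features and report features into one shared latent space, where a contrastive-style objective would pull each image toward its paired report. All shapes, weights, and the cosine-similarity framing here are illustrative assumptions, not details taken from the patent.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative stand-ins for the two portions of the network: an image
# branch and a text branch, each projecting into a shared 16-d latent space.
W_img = rng.standard_normal((64, 16)) / np.sqrt(64)  # image-branch weights
W_txt = rng.standard_normal((32, 16)) / np.sqrt(32)  # text-branch weights

def encode(x, W):
    """Project features and L2-normalize, so dot products are cosine similarities."""
    z = x @ W
    return z / np.linalg.norm(z, axis=-1, keepdims=True)

images = rng.standard_normal((5, 64))   # 5 image feature vectors (first dataset)
reports = rng.standard_normal((5, 32))  # 5 paired report vectors (second dataset)

z_img = encode(images, W_img)   # output of the first portion
z_txt = encode(reports, W_txt)  # output of the second portion

# Training both portions in parallel would maximize the diagonal of this
# similarity matrix, aligning each image with its paired report in the
# shared latent space.
sim = z_img @ z_txt.T  # (5, 5) cosine similarities
```

Because both branches emit unit-norm vectors of the same dimension, features from either dataset are directly comparable, which is the practical meaning of "a shared latent space" in the claims.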
Claims (31)
- CLAIMS WHAT IS CLAIMED IS: 1. A processor, comprising: one or more circuits to train a first portion of a neural network using a first dataset and a second portion of the neural network using a second dataset.
- 2. The processor of claim 1, wherein the first and second portions of the neural network are trained in parallel, and wherein the second portion of the neural network is taught during training to provide a ground truth for training the first portion of the neural network.
- 3. The processor of claim 1, wherein the first and second portions of the neural network are trained in parallel to encode features of the first and second datasets to a shared latent space.
- 4. The processor of claim 1, wherein the first dataset comprises image data and the second dataset comprises textual descriptions of corresponding image data in the first dataset.
- 5. The processor of claim 1, the neural network comprising a cross-attention encoder, wherein a query input to the cross-attention encoder comprises output from the second portion of the neural network, and wherein key and value input to the cross-attention encoder comprises output from the first portion of the neural network.
- 6. The processor of claim 1, the neural network comprising a decoder to generate a saliency map based, at least in part, on output of a cross-attention encoder.
- 7. The processor of claim 1, wherein the first dataset comprises an image and the second dataset comprises a textual document, and wherein output of the neural network comprises a classification of a condition depicted in the image and described in the textual document.
- 8. The processor of claim 1, wherein output of the neural network comprises information identifying a condition depicted in an image.
- 9. A system, comprising: one or more processors to train a first portion of a neural network using a first dataset and a second portion of the neural network using a second dataset.
- 10. The system of claim 9, wherein the first and second portions of the neural network are trained in parallel, and wherein the second portion of the neural network is taught during training to provide information for training the first portion of the neural network.
- 11. The system of claim 9, wherein the first and second portions of the neural network are trained in parallel to encode features of the first and second datasets to a shared latent space.
- 12. The system of claim 9, wherein the first dataset comprises an image and the second dataset comprises a description of the image.
- 13. The system of claim 9, wherein the neural network comprises a cross-attention encoder, wherein a query input to the cross-attention encoder comprises output from the second portion of the neural network, and wherein key and value input to the cross-attention encoder comprises output from the first portion of the neural network.
- 14. The system of claim 9, the neural network comprising a decoder to generate information indicative of a region of an image.
- 15. The system of claim 9, wherein output of the neural network comprises a classification of a condition depicted in an image.
- 16. The system of claim 9, wherein the first dataset comprises a diagnostic image and the second dataset comprises a diagnostic report corresponding to the diagnostic image.
- 17. A processor comprising: one or more circuits to use a neural network to infer information about a first dataset based, at least in part, on a second dataset.
- 18. The processor of claim 17, wherein a first portion of the neural network is trained to encode features of image data in the first dataset and a second portion of the neural network is trained to encode features of textual data in the second dataset.
- 19. The processor of claim 18, wherein the first portion of the neural network, and the second portion of the neural network, encode their respective inputs to a common latent space.
- 20. The processor of claim 17, wherein the neural network is trained based, at least in part, on output of a cross-attention encoder using, as input to the cross-attention encoder, output of an image encoder and output of a language encoder.
- 21. The processor of claim 17, wherein the first dataset comprises diagnostic images and the second dataset comprises diagnostic reports corresponding to the diagnostic images.
- 22. The processor of claim 17, wherein the inferred information comprises information indicative of an area of interest in an image.
- 23. The processor of claim 17, wherein a first portion of the neural network is trained to encode features of image data in the first dataset and a second portion of the neural network is trained to encode features of textual data in the second dataset, and wherein the first portion of the neural network, after training, is capable of inferring the information independently of the second portion.
- 24. A method, comprising: training a neural network to diagnose a condition depicted in a diagnostic image, based at least in part on a first dataset comprising a set of diagnostic images and a second dataset comprising a set of diagnostic reports corresponding to diagnostic images in the set of diagnostic images.
- 25. The method of claim 24, wherein a first portion of the neural network is trained in parallel with a second portion of the neural network, and wherein the second portion of the neural network is trained to encode features of the diagnostic reports.
- 26. The method of claim 25, wherein the first and second portions of the neural network are trained to encode features of the first and second datasets to a shared latent space.
- 27. The method of claim 24, further comprising: providing, as input to a cross-attention encoder, a query input comprising output from a language encoder, and key and value input comprising output from an image encoder.
- 28. The method of claim 24, further comprising: training a language encoder of the neural network to encode features of the diagnostic reports to a latent space shared with output of an image encoder.
- 29. The method of claim 24, further comprising: decoding output of an encoder to generate information summarizing the condition.
- 30. The method of claim 24, wherein the neural network comprises a decoder to generate information indicative of a region in the diagnostic image that depicts the condition.
- 31. The method of claim 24, wherein diagnosis of the condition comprises identifying one or more categories of conditions determined, by the neural network, to be associated with a region of the diagnostic image.
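Claims 5, 13, and 27 describe a cross-attention encoder whose query input comes from the language branch while key and value come from the image branch, and claims 6 and 30 add a decoder that indicates image regions. The wiring can be sketched in NumPy as follows; the projection sizes, token counts, random weights, and the mean-over-tokens saliency reduction are all assumptions for illustration, not the patented implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(text_feats, image_feats, d_k, seed=0):
    """Query from the language branch; key and value from the image branch."""
    rng = np.random.default_rng(seed)
    W_q = rng.standard_normal((text_feats.shape[-1], d_k)) / np.sqrt(text_feats.shape[-1])
    W_k = rng.standard_normal((image_feats.shape[-1], d_k)) / np.sqrt(image_feats.shape[-1])
    W_v = rng.standard_normal((image_feats.shape[-1], d_k)) / np.sqrt(image_feats.shape[-1])
    Q = text_feats @ W_q    # (n_tokens, d_k)
    K = image_feats @ W_k   # (n_patches, d_k)
    V = image_feats @ W_v   # (n_patches, d_k)
    attn = softmax(Q @ K.T / np.sqrt(d_k))  # each report token attends over patches
    return attn @ V, attn

rng = np.random.default_rng(1)
text = rng.standard_normal((4, 32))    # 4 report-token features
image = rng.standard_normal((49, 64))  # 7x7 grid of image-patch features

out, attn = cross_attention(text, image, d_k=16)

# Averaging attention over tokens and reshaping to the patch grid gives a
# crude map of which image regions the report attends to; a learned decoder
# would refine this into the region information the claims describe.
saliency = attn.mean(axis=0).reshape(7, 7)
```

Because the attention weights are computed over image patches, they carry spatial information for free, which is why a decoder on top of this encoder can localize the described condition.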
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/US2022/018217 WO2022187167A1 (en) | 2021-03-01 | 2022-02-28 | Neural network training technique |
Publications (2)
Publication Number | Publication Date |
---|---|
GB202204314D0 GB202204314D0 (en) | 2022-05-11 |
GB2616316A true GB2616316A (en) | 2023-09-06 |
Family
ID=81449445
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
GB2204314.5A Pending GB2616316A (en) | 2022-02-28 | 2022-02-28 | Neural network training technique |
Country Status (1)
Country | Link |
---|---|
GB (1) | GB2616316A (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180350459A1 (en) * | 2017-06-05 | 2018-12-06 | University Of Florida Research Foundation, Inc. | Methods and apparatuses for implementing a semantically and visually interpretable medical diagnosis network |
CN111985369A (en) * | 2020-08-07 | 2020-11-24 | 西北工业大学 | Course field multi-modal document classification method based on cross-modal attention convolution neural network |
History
- 2022-02-28: GB application GB2204314.5A, patent GB2616316A (en), status active, Pending
Non-Patent Citations (3)
Title |
---|
Riddhish Bhalodia et al., "Improving Pneumonia Localization via Cross-Attention on Medical Images and Reports", arXiv.org, Cornell University Library, 201 Olin Library, Cornell University, Ithaca, NY 14853 (2021-10-06); the whole document * |
Wei Xi et al., "Multi-Modality Cross Attention Network for Image and Sentence Matching", 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), IEEE (2020-06-13), pages 10938-10947, doi:10.1109/CVPR42600.2020.01095 [retrieved on 2020-08-03]; page 10938; page 10945, right-hand column, paragraph 1 * |
Xiaosong Wang et al., "TieNet: Text-Image Embedding Network for Common Thorax Disease Classification and Reporting in Chest X-rays", arXiv.org, Cornell University Library, 201 Olin Library, Cornell University, Ithaca, NY 14853 (2018-01-12); the whole document * |
Also Published As
Publication number | Publication date |
---|---|
GB202204314D0 (en) | 2022-05-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
AU2019200270B2 (en) | Concept mask: large-scale segmentation from semantic concepts | |
US10817521B2 (en) | Near-real-time prediction, classification, and notification of events in natural language systems | |
US20190318099A1 (en) | Using Gradients to Detect Backdoors in Neural Networks | |
US11640527B2 (en) | Near-zero-cost differentially private deep learning with teacher ensembles | |
KR102011788B1 (en) | Visual Question Answering Apparatus Using Hierarchical Visual Feature and Method Thereof | |
WO2021051497A1 (en) | Pulmonary tuberculosis determination method and apparatus, computer device, and storage medium | |
CN112400187A (en) | Knockout autoencoder for detecting anomalies in biomedical images | |
US20220230061A1 (en) | Modality adaptive information retrieval | |
US11853706B2 (en) | Generative language model for few-shot aspect-based sentiment analysis | |
US20200321101A1 (en) | Rule out accuracy for detecting findings of interest in images | |
CN116468746B (en) | Bidirectional copy-paste semi-supervised medical image segmentation method | |
US20230281390A1 (en) | Systems and methods for enhanced review comprehension using domain-specific knowledgebases | |
JP2019511797A (en) | INFORMATION PROCESSING SYSTEM, INFORMATION PROCESSING METHOD, AND PROGRAM | |
CN113780365B (en) | Sample generation method and device | |
CN116844731A (en) | Disease classification method, disease classification device, electronic device, and storage medium | |
US11113466B1 (en) | Generating sentiment analysis of content | |
Wang et al. | SERR‐U‐Net: Squeeze‐and‐Excitation Residual and Recurrent Block‐Based U‐Net for Automatic Vessel Segmentation in Retinal Image | |
Patel et al. | PTXNet: An extended UNet model based segmentation of pneumothorax from chest radiography images | |
GB2616316A (en) | Neural network training technique | |
TWI742312B (en) | Machine learning system, machine learning method and non-transitory computer readable medium for operating the same | |
US20220027688A1 (en) | Image identification device, method for performing semantic segmentation, and storage medium | |
Geldenhuys et al. | Deep learning approaches to landmark detection in tsetse wing images | |
CN112785001B (en) | Artificial intelligence educational back-province robot for overcoming discrimination and prejudice | |
Ahn et al. | MMRR: Unsupervised Anomaly Detection through Multi-Level Masking and Restoration with Refinement | |
Ishikawa et al. | Saliency prediction based on object recognition and gaze analysis |