CN117828087A - LLM-based medical instrument data classification method and system - Google Patents

LLM-based medical instrument data classification method and system

Info

Publication number
CN117828087A
Authority
CN
China
Prior art keywords
medical instrument
data
classification
model
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202410022789.4A
Other languages
Chinese (zh)
Other versions
CN117828087B (en)
Inventor
金震
张京日
万俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing SunwayWorld Science and Technology Co Ltd
Original Assignee
Beijing SunwayWorld Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing SunwayWorld Science and Technology Co Ltd
Priority to CN202410022789.4A
Publication of CN117828087A
Application granted
Publication of CN117828087B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H40/00 ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Primary Health Care (AREA)
  • Medical Informatics (AREA)
  • Epidemiology (AREA)
  • General Business, Economics & Management (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Public Health (AREA)
  • Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The invention provides an LLM-based medical instrument data classification method and system. Standard medical instrument data is converted into a training data format to obtain medical instrument training data, which provides an accurate data basis for model training. An LLM model then performs semantic language learning on the medical instrument training data, which ensures the classification accuracy and classification efficiency of the trained medical instrument classification model. The medical instrument classification model is next verified and evaluated with new instrument noun data to obtain an optimal medical instrument classification model, which further ensures the accuracy of model classification. Finally, the medical instrument data to be classified is classified based on the optimal medical instrument classification model, achieving efficient and accurate classification of medical instrument data and providing convenience for medical instrument users.

Description

LLM-based medical instrument data classification method and system
Technical Field
The invention relates to the technical field of medical instruments, and in particular to an LLM-based medical instrument data classification method and system.
Background
Medical devices are the tools, equipment, instruments, materials, and similar items used in medical diagnosis, treatment, monitoring, and disease prevention. They can be divided into many categories according to function and purpose. The product range is broad: there are more than 30 major categories and more than 3,000 varieties.
For an ordinary person, knowing how medical devices are classified helps in identifying and using them correctly; for practitioners in medical-device-related industries, this basic knowledge is even more important. Demand for medical and health technologies, products, and system construction, including medical devices, is also growing. The medical device industry has become the fastest-developing and most promising area of the health industry.
At present, medical instrument classification relies mainly on fuzzy character matching, word-segmentation retrieval, similarity calculation, and similar approaches, which suffer from low classification efficiency and poor classification results.
The technical scheme of the invention mainly addresses this problem: it classifies medical instruments by their noun names with higher speed and higher accuracy than traditional approaches such as rules, search, and machine learning.
Disclosure of Invention
The invention provides an LLM-based medical instrument data classification method and system to solve the problems described in the Background.
An LLM-based medical instrument data classification method, comprising:
S1: sorting medical instrument data based on medical instrument standard definition information to obtain standard medical instrument data, and performing format conversion on the standard medical instrument data based on a training data format to obtain medical instrument training data;
S2: training, based on the LLM model in combination with the medical instrument training data, to obtain a medical instrument classification model;
S3: evaluating the medical instrument classification model based on new instrument noun data to obtain an optimal medical instrument classification model;
S4: classifying the medical instrument data to be classified based on the optimal medical instrument classification model.
Preferably, in step S1, sorting the medical instrument data based on the medical instrument standard definition information to obtain standard medical instrument data includes:
acquiring a primary classification definition and a secondary classification definition of the medical instrument from the medical instrument standard definition information, classifying the medical instrument data based on the primary classification definition and the secondary classification definition, and determining a primary class name and a secondary class name of the medical instrument;
determining product description information and use information of the medical instrument based on the primary classification definition and the secondary classification definition;
marking the matched medical instrument based on the primary class name, the secondary class name, the product description information and the use information to obtain medical instrument marking information;
and assigning serial numbers and directory codes to the medical instrument marking information, and sorting the medical instrument marking information based on the serial numbers and the directory codes to obtain standard medical instrument data.
Preferably, in step S1, performing format conversion on the standard medical instrument data based on the training data format to obtain medical instrument training data includes:
acquiring an initial data format of the standard medical instrument data, and acquiring, from the data conversion rules, a target conversion rule from the initial data format to the training data format;
and configuring a corresponding format conversion template for the standard medical instrument data based on the target conversion rule, and performing format conversion on the standard medical instrument data based on the format conversion template to obtain medical instrument training data.
Preferably, in step S2, training based on the LLM model in combination with the medical instrument training data to obtain a medical instrument classification model includes:
classifying the medical instrument training data based on data category to obtain a plurality of groups of category training data, and classifying the medical instrument training data based on all definition information of each medical instrument to obtain a plurality of groups of definition training data;
determining a first training ending condition for each group of category training data based on the category characteristics of the data category, and training the initial LLM model on each group of category training data under the corresponding first training ending condition to obtain a first LLM model corresponding to each group of category training data;
determining a second training ending condition for each group of definition training data based on the instrument name of the medical instrument, and training the initial LLM model on each group of definition training data under the corresponding second training ending condition to obtain a second LLM model corresponding to each group of definition training data;
obtaining a first language analysis text of the medical instrument training data based on the first LLM model, and obtaining a second language analysis text of the medical instrument training data based on the second LLM model;
performing keyword retrieval on the first language analysis text based on the category characteristics to obtain first keywords, and performing keyword retrieval on the second language analysis text based on the definition characteristics to obtain second keywords;
obtaining a first key text from the first language analysis text based on the first keywords, obtaining a second key text from the second language analysis text based on the second keywords, and obtaining a target training text based on the first key text and the second key text;
obtaining the semantic correspondence between the target training text and the medical instrument training data, and training the initial classification model based on the target training text in combination with the semantic correspondence to obtain the medical instrument classification model.
Preferably, the training the initial classification model based on the target training text and combining the semantic correspondence to obtain the medical instrument classification model includes:
inputting the target training text into an initial classification model to obtain a first classification result;
inputting the training data of the medical instrument into an initial classification model to obtain a second classification result;
based on the first classification result and the semantic correspondence, the second classification result is evaluated until the evaluation result meets the preset classification requirement, and a medical instrument classification model is obtained.
Preferably, in step S3, evaluating the medical instrument classification model based on the new instrument noun data to obtain an optimal medical instrument classification model includes:
performing word segmentation on the new instrument noun data, and setting the recognition complexity of the new instrument noun data based on its number of word segments and word segmentation characteristics;
dividing the new instrument noun data according to recognition complexity to obtain a plurality of new instrument noun data sets, each corresponding to a recognition complexity range;
inputting the new instrument noun data sets into the medical instrument classification model respectively, and determining the classification accuracy of each new instrument noun data set;
determining a change curve of the classification accuracy of the new instrument noun data sets from the smallest recognition complexity range to the largest, and determining, based on the change curve, the recognition accuracy of the feature extraction network of the medical instrument classification model for each recognition complexity range and the generalization capability of the medical instrument classification model;
determining network structure adjustment features and network node weight adjustment features of the feature extraction network based on the accuracy difference between the recognition accuracy of the feature extraction network for each recognition complexity range and the preset accuracy;
determining model complexity adjustment features of the medical instrument classification model based on the generalization difference between the generalization capability of the medical instrument classification model and the preset generalization capability;
adjusting the medical instrument classification model based on the network structure adjustment features, the network node weight adjustment features and the model complexity adjustment features to obtain an optimal medical instrument classification model.
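The first evaluation steps of S3 (scoring recognition complexity, splitting the new instrument nouns into complexity ranges, and tracing accuracy across those ranges) can be sketched as follows. This is only an illustration under stated assumptions: the patent does not define the complexity score, so the token count alone is used here, whitespace splitting stands in for a real word segmenter, and the function names, the range boundaries, and the `classify` callable are all hypothetical.

```python
def recognition_complexity(noun, tokenize=str.split):
    # Assumption: complexity is simply the number of word segments.
    # The patent also mentions "word segmentation characteristics",
    # which it does not specify, so they are omitted here.
    return len(tokenize(noun))


def bucket_by_complexity(samples, ranges, tokenize=str.split):
    """Split (noun, true_label) pairs into inclusive (low, high) ranges."""
    buckets = {r: [] for r in ranges}
    for noun, label in samples:
        score = recognition_complexity(noun, tokenize)
        for low, high in ranges:
            if low <= score <= high:
                buckets[(low, high)].append((noun, label))
                break
    return buckets


def accuracy_curve(buckets, classify):
    """Accuracy per range, ordered from low to high complexity."""
    curve = []
    for r in sorted(buckets):
        items = buckets[r]
        if items:
            correct = sum(classify(noun) == label for noun, label in items)
            curve.append((r, correct / len(items)))
    return curve


def accuracy_gaps(curve, preset_accuracy):
    """Difference between the preset accuracy and each range's accuracy."""
    return {r: preset_accuracy - acc for r, acc in curve}
```

A falling curve toward the higher ranges would suggest the model handles long, compound instrument names poorly; how the resulting gaps translate into network structure or weight adjustments is not specified in the text and is left out of the sketch.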
Preferably, adjusting the medical instrument classification model based on the network structure adjustment features, the network node weight adjustment features and the model complexity adjustment features to obtain an optimal medical instrument classification model includes:
adjusting the medical instrument classification model based on the network structure adjustment features, the network node weight adjustment features and the model complexity adjustment features to obtain an adjusted medical instrument classification model;
and re-evaluating the adjusted medical instrument classification model until the evaluation result meets the classification requirement, thereby obtaining the final optimal medical instrument classification model.
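The adjust-then-re-evaluate cycle can be written as a small loop. The sketch below is a hedged illustration: `evaluate` and `adjust` are hypothetical callables standing in for the evaluation and adjustment procedures, and the `max_rounds` cap is an added safeguard that the patent text does not mention (it only says the cycle repeats until the classification requirement is met).

```python
def tune_until_acceptable(model, evaluate, adjust, required_score, max_rounds=10):
    """Repeat evaluate -> adjust until the score meets the requirement.

    Returns (model, score, rounds_used). `max_rounds` is an assumption
    added so the loop always terminates.
    """
    for round_index in range(max_rounds):
        score = evaluate(model)
        if score >= required_score:
            return model, score, round_index
        # Hypothetical adjustment step: structure, weights, complexity.
        model = adjust(model, score)
    return model, evaluate(model), max_rounds
```

Passing the current score into `adjust` is a design choice made for the sketch, so that an adjustment step can scale its changes to how far the model is from the requirement.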
Preferably, in the step S4, classifying the medical device data to be classified based on the optimal medical device classification model includes:
converting the data format of the medical instrument data to be classified to obtain target medical instrument data to be classified;
inputting the target medical instrument data to be classified into an optimal medical instrument classification model to obtain a classification result of the medical instrument data to be classified.
Preferably, inputting the target medical instrument data to be classified into an optimal medical instrument classification model to obtain a classification result of the medical instrument data to be classified, including:
inputting the target medical instrument data to be classified into an optimal medical instrument classification model, and obtaining an output marking result of the target medical instrument data to be classified from the model output result;
classifying the target medical instrument data to be classified based on the output marking result to obtain a classification result of the target medical instrument data to be classified.
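Step S4 amounts to formatting the input, calling the model, and reading the labels out of its output. The following is a minimal sketch under assumptions: the prompt wording, the `primary: ...; secondary: ...; use: ...` output layout, and the `model` callable are all hypothetical, since the patent does not describe the format of the output marking result.

```python
import re

# Assumed output layout: "primary: <name>; secondary: <name>; use: <text>".
LABEL_PATTERN = re.compile(r"(primary|secondary|use)\s*:\s*([^;]+)")


def parse_marking(model_output):
    """Extract the marked fields from a model output string."""
    return {key: value.strip() for key, value in LABEL_PATTERN.findall(model_output)}


def classify_instrument(name, model):
    """Convert the input to the assumed prompt format, then parse the output."""
    prompt = f"Classify the medical instrument: {name}"
    return parse_marking(model(prompt))
```

Any callable that maps a prompt string to an output string can be passed as `model`, which keeps the parsing logic testable without a trained classifier.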
An LLM-based medical instrument data classification system, comprising:
the data processing module is used for sorting the medical instrument data based on the medical instrument standard definition information to obtain standard medical instrument data, and converting the standard medical instrument data based on the training data format to obtain medical instrument training data;
the model training module is used for combining the medical instrument training data based on the LLM model and training to obtain a medical instrument classification model;
the model evaluation module is used for evaluating the medical instrument classification model based on the new instrument noun data to obtain an optimal medical instrument classification model;
the data classification module is used for classifying the medical instrument data to be classified based on the optimal medical instrument classification model.
Compared with the prior art, the invention has the following beneficial effects:
The medical instrument data is sorted based on medical instrument standard definition information to obtain standard medical instrument data, and the standard medical instrument data is converted into the training data format to obtain medical instrument training data, which provides an accurate data basis for model training. A medical instrument classification model is then trained based on the LLM model in combination with the medical instrument training data; the LLM model performs semantic language learning on the training data, which ensures the classification accuracy and classification efficiency of the trained model. Next, the medical instrument classification model is verified and evaluated with new instrument noun data to obtain an optimal medical instrument classification model, which further ensures the accuracy of model classification. Finally, the medical instrument data to be classified is classified based on the optimal medical instrument classification model, achieving efficient and accurate classification of medical instrument data and providing convenience for medical instrument users.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objects and other advantages of the invention may be realized and obtained by means of the instrumentalities particularly pointed out in the specification.
The technical scheme of the invention is further described in detail through the drawings and the embodiments.
Drawings
The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate the invention and together with the embodiments of the invention, serve to explain the invention. In the drawings:
FIG. 1 is a flow chart of an LLM-based medical instrument data classification method in an embodiment of the present invention;
FIG. 2 is a flow chart of acquiring standard medical instrument data in an embodiment of the present invention;
FIG. 3 is a block diagram of an LLM-based medical instrument data classification system in an embodiment of the present invention.
Detailed Description
The preferred embodiments of the present invention will be described below with reference to the accompanying drawings, it being understood that the preferred embodiments described herein are for illustration and explanation of the present invention only, and are not intended to limit the present invention.
Example 1:
an embodiment of the present invention provides an LLM-based medical instrument data classification method, as shown in FIG. 1, including:
S1: sorting medical instrument data based on medical instrument standard definition information to obtain standard medical instrument data, and performing format conversion on the standard medical instrument data based on a training data format to obtain medical instrument training data;
S2: training, based on the LLM model in combination with the medical instrument training data, to obtain a medical instrument classification model;
S3: evaluating the medical instrument classification model based on new instrument noun data to obtain an optimal medical instrument classification model;
S4: classifying the medical instrument data to be classified based on the optimal medical instrument classification model.
In this embodiment, the medical instrument standard definition information includes a primary classification, a secondary classification, a classification definition, an instrument use, an example, and the like.
In this embodiment, the data format of the medical device training data satisfies the input data requirements of the LLM model.
In this embodiment, the LLM model is a large language model.
In this embodiment, evaluating the medical instrument classification model to obtain the optimal medical instrument classification model means that evaluation continues until the evaluation result meets the classification requirement, and the model finally determined is taken as the optimal medical instrument classification model.
In this embodiment, classifying the medical instrument data to be classified is specifically determining information such as primary classification, secondary classification, instrument use, and the like of the medical instrument.
The beneficial effects of the above design scheme are as follows. The medical instrument data is sorted based on medical instrument standard definition information to obtain standard medical instrument data, and the standard medical instrument data is converted into the training data format to obtain medical instrument training data, which provides an accurate data basis for model training. A medical instrument classification model is then trained based on the LLM model in combination with the medical instrument training data; the LLM model performs semantic language learning on the training data, which ensures the classification accuracy and classification efficiency of the trained model. Next, the medical instrument classification model is verified and evaluated with new instrument noun data to obtain an optimal medical instrument classification model, which further ensures the accuracy of model classification. Finally, the medical instrument data to be classified is classified based on the optimal medical instrument classification model, achieving efficient and accurate classification of medical instrument data and providing convenience for medical instrument users.
Example 2:
based on embodiment 1, an embodiment of the invention provides an LLM-based medical instrument data classification method in which, in S1, sorting the medical instrument data based on the medical instrument standard definition information to obtain standard medical instrument data includes:
acquiring a primary classification definition and a secondary classification definition of the medical instrument from the medical instrument standard definition information, classifying the medical instrument data based on the primary classification definition and the secondary classification definition, and determining a primary class name and a secondary class name of the medical instrument;
determining product description information and use information of the medical instrument based on the primary classification definition and the secondary classification definition;
marking the matched medical instrument based on the primary class name, the secondary class name, the product description information and the use information to obtain medical instrument marking information;
and assigning serial numbers and directory codes to the medical instrument marking information, and sorting the medical instrument marking information based on the serial numbers and the directory codes to obtain standard medical instrument data.
The beneficial effects of the above design scheme are: sorting the medical instrument data based on medical instrument standard definition information yields standard medical instrument data, achieves classification and sorting of the medical instrument data, and provides a basis for subsequently acquiring medical instrument training data.
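One possible shape for the standard medical instrument data of this embodiment is a list of labelled records ordered by directory code and numbered in that order. The sketch below is an assumption-laden illustration: the `StandardRecord` type, its field names, and the input dictionary keys are hypothetical, and the patent does not say whether serial numbers are assigned before or after sorting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StandardRecord:
    serial_number: int
    directory_code: str
    primary_class: str
    secondary_class: str
    description: str
    intended_use: str


def build_standard_data(definitions):
    """Label each definition entry, order by directory code, then number.

    `definitions` is a list of dicts with the hypothetical keys
    'directory_code', 'primary', 'secondary', 'description', 'use'.
    """
    # Assumption: serial numbers follow directory-code order.
    ordered = sorted(definitions, key=lambda d: d["directory_code"])
    return [
        StandardRecord(
            serial_number=index,
            directory_code=d["directory_code"],
            primary_class=d["primary"],
            secondary_class=d["secondary"],
            description=d["description"],
            intended_use=d["use"],
        )
        for index, d in enumerate(ordered, start=1)
    ]
```

Freezing the dataclass keeps each labelled record immutable once built, which suits data that later stages only read.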
Example 3:
based on embodiment 1, an embodiment of the present invention provides an LLM-based medical instrument data classification method, as shown in FIG. 2, in which, in S1, performing format conversion on the standard medical instrument data based on the training data format to obtain medical instrument training data includes:
acquiring an initial data format of the standard medical instrument data, and acquiring, from the data conversion rules, a target conversion rule from the initial data format to the training data format;
and configuring a corresponding format conversion template for the standard medical instrument data based on the target conversion rule, and performing format conversion on the standard medical instrument data based on the format conversion template to obtain medical instrument training data.
In this embodiment, the data conversion rules are preset according to the conversions required between data formats.
The beneficial effects of the above design scheme are: converting the standard medical instrument data into the training data format yields medical instrument training data, which provides an accurate data basis for model training.
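The rule lookup and template-based conversion of this embodiment might look like the following sketch. Everything concrete in it is an assumption made for illustration: the format names `record` and `instruction_jsonl`, the prompt and completion wording, and the record keys `name`, `primary` and `secondary` do not come from the patent, which leaves the training data format unspecified.

```python
import json

# Hypothetical preset conversion rules, keyed by (initial format, training format).
# Each rule is a template mapping output fields to format patterns.
CONVERSION_RULES = {
    ("record", "instruction_jsonl"): {
        "prompt": "Classify the medical instrument: {name}",
        "completion": "{primary} / {secondary}",
    },
}


def get_target_rule(initial_format, training_format):
    """Look up the conversion rule for a given pair of formats."""
    return CONVERSION_RULES[(initial_format, training_format)]


def convert(records, initial_format="record", training_format="instruction_jsonl"):
    """Apply the format conversion template to every record.

    Returns one JSON string per record (JSON Lines style).
    """
    template = get_target_rule(initial_format, training_format)
    lines = []
    for rec in records:
        sample = {field: pattern.format(**rec) for field, pattern in template.items()}
        lines.append(json.dumps(sample, ensure_ascii=False))
    return lines
```

Keeping the rules in a table keyed by format pair means a new source format only needs a new entry, not new conversion code.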
Example 4:
based on embodiment 1, an embodiment of the invention provides an LLM-based medical instrument data classification method in which, in S2, training based on the LLM model in combination with the medical instrument training data to obtain a medical instrument classification model includes:
classifying the medical instrument training data based on data category to obtain a plurality of groups of category training data, and classifying the medical instrument training data based on all definition information of each medical instrument to obtain a plurality of groups of definition training data;
determining a first training ending condition for each group of category training data based on the category characteristics of the data category, and training the initial LLM model on each group of category training data under the corresponding first training ending condition to obtain a first LLM model corresponding to each group of category training data;
determining a second training ending condition for each group of definition training data based on the instrument name of the medical instrument, and training the initial LLM model on each group of definition training data under the corresponding second training ending condition to obtain a second LLM model corresponding to each group of definition training data;
obtaining a first language analysis text of the medical instrument training data based on the first LLM model, and obtaining a second language analysis text of the medical instrument training data based on the second LLM model;
performing keyword retrieval on the first language analysis text based on the category characteristics to obtain first keywords, and performing keyword retrieval on the second language analysis text based on the definition characteristics to obtain second keywords;
obtaining a first key text from the first language analysis text based on the first keywords, obtaining a second key text from the second language analysis text based on the second keywords, and obtaining a target training text based on the first key text and the second key text;
obtaining the semantic correspondence between the target training text and the medical instrument training data, and training the initial classification model based on the target training text in combination with the semantic correspondence to obtain the medical instrument classification model.
In this embodiment, each group of category training data corresponds to a single data category, for example the primary class name, the secondary class name, the product description information, or the use information.
In this embodiment, each group of definition training data contains all the definition information of one medical instrument, that is, its primary class name, secondary class name, product description information, and use information.
In this embodiment, the first LLM model focuses on semantic language learning of the category training data and the second LLM model focuses on semantic language learning of the definition training data.
In this embodiment, the target training text reflects the semantic language learning of both the category training data and the definition training data, so the medical instrument training data can be recognized more effectively.
The beneficial effects of the above design scheme are: the initial LLM model is trained on both category training data and definition training data, which improves the language semantics of the resulting target training text. Compared with training the medical instrument classification model directly on the medical instrument training data, training it on the target training text is more accurate and efficient, and provides a model foundation for classifying medical instrument data.
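The data handling around the two LLM models in this embodiment (the two ways of grouping the training data, and the keyword-based selection of key text) can be sketched without any model at all. The field names, the sentence-level granularity of "key text", and the order-preserving merge are assumptions made for illustration; the LLM training itself is outside the scope of the sketch.

```python
import re
from collections import defaultdict

# Hypothetical field names for one instrument's definition information.
FIELDS = ("primary_class", "secondary_class", "description", "intended_use")


def category_groups(records):
    """One group per data category: all values of a single field."""
    groups = defaultdict(list)
    for rec in records:
        for field in FIELDS:
            groups[field].append(rec[field])
    return dict(groups)


def definition_groups(records):
    """One group per instrument: every definition field of that instrument."""
    return {rec["name"]: [rec[field] for field in FIELDS] for rec in records}


def key_text(analysis_text, keywords):
    """Keep the sentences of an analysis text that mention any keyword."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", analysis_text) if s.strip()]
    lowered = [k.lower() for k in keywords]
    return [s for s in sentences if any(k in s.lower() for k in lowered)]


def target_training_text(first_key_text, second_key_text):
    """Merge both key texts, dropping duplicates while preserving order."""
    return list(dict.fromkeys(first_key_text + second_key_text))
```

Category groups cut the data column-wise and definition groups cut it row-wise, which is one way to read the distinction the embodiment draws between the first and second LLM models.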
Example 5:
Based on Example 4, this embodiment of the invention provides an LLM-based medical instrument data classification method, wherein training the initial classification model based on the target training text, in combination with the semantic correspondence, to obtain the medical instrument classification model comprises:
inputting the target training text into the initial classification model to obtain a first classification result;
inputting the medical instrument training data into the initial classification model to obtain a second classification result;
evaluating the second classification result based on the first classification result and the semantic correspondence until the evaluation result meets the preset classification requirement, thereby obtaining the medical instrument classification model.
The beneficial effects of the above design scheme are: compared with training the medical instrument classification model directly on the medical instrument training data, training on the target training text is more accurate and efficient, and provides a model foundation for classifying medical instrument data.
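One possible reading of this evaluation step is sketched below in Python. It assumes that the semantic correspondence is a mapping from each target training text to the training record it was derived from, and that the evaluation result is the fraction of records whose label agrees with the label of the corresponding text. The patent does not specify the metric, so `agreement_rate`, the `required` threshold, and `max_rounds` are all hypothetical.

```python
def agreement_rate(first_results, second_results, correspondence):
    """Fraction of text/record pairs that received the same label.

    correspondence maps a target-text id to its training-record id
    (an assumed representation of the semantic correspondence).
    """
    matches = sum(
        1 for text_id, data_id in correspondence.items()
        if first_results[text_id] == second_results[data_id]
    )
    return matches / len(correspondence)


def train_until_consistent(train_step, classify, texts, data,
                           correspondence, required=0.9, max_rounds=20):
    """Train on the target texts until the agreement rate meets `required`."""
    rate = 0.0
    for round_no in range(1, max_rounds + 1):
        train_step(texts)  # one training pass on the target training text
        first = {k: classify(v) for k, v in texts.items()}   # first result
        second = {k: classify(v) for k, v in data.items()}   # second result
        rate = agreement_rate(first, second, correspondence)
        if rate >= required:
            return round_no, rate
    return max_rounds, rate
```

Here `train_step` and `classify` stand in for whatever training and inference routines the initial classification model exposes.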
Example 6:
Based on Example 1, this embodiment of the invention provides an LLM-based medical instrument data classification method, wherein, in S3, evaluating the medical instrument classification model based on the new instrument noun data to obtain the optimal medical instrument classification model comprises:
performing word segmentation on the new instrument noun data, and setting the recognition complexity of the new instrument noun data based on the number of segmented words and the word segmentation characteristics of the new instrument noun data;
dividing the new instrument noun data according to recognition complexity to obtain a plurality of new instrument noun data sets, each corresponding to a recognition complexity range;
inputting the new instrument noun data sets into the medical instrument classification model respectively, and determining the classification accuracy of each new instrument noun data set;
determining a change curve of the classification accuracy of the new instrument noun data sets in order of recognition complexity range from small to large, and determining, based on the change curve, the recognition accuracy of the feature extraction network of the medical instrument classification model for each recognition complexity range and the generalization capability of the medical instrument classification model;
determining network structure adjustment features and network node weight adjustment features of the feature extraction network based on the accuracy difference between the recognition accuracy of the feature extraction network for each recognition complexity range and a preset accuracy;
determining model complexity adjustment features of the medical instrument classification model based on the generalization difference between the generalization capability of the medical instrument classification model and a preset generalization capability;
adjusting the medical instrument classification model based on the network structure adjustment features, the network node weight adjustment features and the model complexity adjustment features to obtain the optimal medical instrument classification model.
In this embodiment, the network structure adjustment features, the network node weight adjustment features and the model complexity adjustment features are each determined from a preset relationship between differences and adjustments.
The beneficial effects of the above design scheme are: the new instrument noun data is divided by recognition complexity into a plurality of new instrument noun data sets, each corresponding to a recognition complexity range; the data sets are input into the medical instrument classification model respectively, and the classification accuracy of each is determined; a change curve of classification accuracy is determined in order of recognition complexity range from small to large, and from it the recognition accuracy of the feature extraction network for each recognition complexity range and the generalization capability of the medical instrument classification model are determined. The accuracy of the medical instrument classification model is thus analyzed from the perspective of multiple sets of new instrument noun data and the model parameters are adjusted accordingly, which ensures the quality of the resulting optimal medical instrument classification model and provides a model foundation for accurate data classification.
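The bucketing and accuracy-curve steps of this embodiment can be sketched in Python as follows. This is a minimal sketch under stated assumptions: whitespace splitting stands in for a real word segmenter, the recognition complexity is taken to be the token count plus the number of tokens outside a known vocabulary, and the range boundaries in `upper_bounds` are arbitrary. None of these choices is specified by the patent.

```python
def recognition_complexity(noun, known_vocab):
    """Assumed score: token count plus count of out-of-vocabulary tokens."""
    tokens = noun.split()  # stand-in for a real word segmenter
    unknown = sum(1 for t in tokens if t not in known_vocab)
    return len(tokens) + unknown


def bucket_index(complexity, upper_bounds):
    """Index of the first complexity range whose upper bound is not exceeded."""
    for i, bound in enumerate(upper_bounds):
        if complexity <= bound:
            return i
    return len(upper_bounds)  # open-ended last range


def accuracy_curve(samples, classify, known_vocab, upper_bounds):
    """Classification accuracy per range, ordered from low to high complexity.

    samples is a list of (noun, true_label); empty ranges yield None.
    """
    n_ranges = len(upper_bounds) + 1
    hits, totals = [0] * n_ranges, [0] * n_ranges
    for noun, label in samples:
        i = bucket_index(recognition_complexity(noun, known_vocab), upper_bounds)
        totals[i] += 1
        hits[i] += int(classify(noun) == label)
    return [h / t if t else None for h, t in zip(hits, totals)]


def accuracy_gaps(curve, preset_accuracy):
    """Difference between the preset accuracy and each range's accuracy."""
    return [None if a is None else preset_accuracy - a for a in curve]
```

In this sketch the gaps returned by `accuracy_gaps` would be the input to the preset difference-to-adjustment relationship mentioned above; how the adjustment features are derived from them is left open, as in the source.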
Example 7:
Based on Example 6, this embodiment of the invention provides an LLM-based medical instrument data classification method, wherein adjusting the medical instrument classification model based on the network structure adjustment features, the network node weight adjustment features and the model complexity adjustment features to obtain the optimal medical instrument classification model comprises:
adjusting the medical instrument classification model based on the network structure adjustment features, the network node weight adjustment features and the model complexity adjustment features to obtain an adjusted medical instrument classification model;
re-evaluating the adjusted medical instrument classification model until the evaluation result meets the classification requirement, thereby obtaining the final optimal medical instrument classification model.
In this embodiment, the re-evaluation of the adjusted medical instrument classification model is performed in the same manner as the first evaluation of the medical instrument classification model.
The beneficial effects of the above design scheme are: the final optimal medical instrument classification model is obtained through repeated evaluation and adjustment, which ensures the quality of the resulting model and provides a model foundation for accurate data classification.
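The evaluate-then-adjust cycle of this embodiment amounts to a simple loop, sketched below in Python. The callables `evaluate`, `adjust` and `meets_requirement` are placeholders for the evaluation of Example 6, the feature-based adjustment, and the classification requirement; the `max_rounds` safety cap is an added assumption, since the source only says the cycle repeats until the requirement is met.

```python
def optimize(model, evaluate, adjust, meets_requirement, max_rounds=10):
    """Repeat evaluation and adjustment until the requirement is met."""
    result = evaluate(model)          # first evaluation
    rounds = 0
    while not meets_requirement(result) and rounds < max_rounds:
        model = adjust(model, result)  # apply the adjustment features
        result = evaluate(model)       # re-evaluate in the same manner
        rounds += 1
    return model, result, rounds


# Toy usage: the "model" is a dict holding an accuracy in percent.
best, score, rounds_used = optimize(
    {"accuracy": 70},
    evaluate=lambda m: m["accuracy"],
    adjust=lambda m, r: {"accuracy": m["accuracy"] + 10},
    meets_requirement=lambda r: r >= 90,
)
```

The same `evaluate` callable is used for the first evaluation and every re-evaluation, matching the statement that both are performed in the same manner.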
Example 8:
Based on Example 1, this embodiment of the invention provides an LLM-based medical instrument data classification method, wherein, in S4, classifying the medical instrument data to be classified based on the optimal medical instrument classification model comprises:
converting the data format of the medical instrument data to be classified to obtain target medical instrument data to be classified;
inputting the target medical instrument data to be classified into the optimal medical instrument classification model to obtain a classification result of the medical instrument data to be classified.
The beneficial effects of the above design scheme are: classifying the medical instrument data to be classified based on the optimal medical instrument classification model achieves efficient and accurate classification of medical instrument data and provides convenience for medical instrument users.
Example 9:
Based on Example 8, this embodiment of the invention provides an LLM-based medical instrument data classification method, wherein inputting the target medical instrument data to be classified into the optimal medical instrument classification model to obtain a classification result of the medical instrument data to be classified comprises:
inputting the target medical instrument data to be classified into the optimal medical instrument classification model, and obtaining an output marking result of the target medical instrument data to be classified from the model output result;
classifying the target medical instrument data to be classified based on the output marking result to obtain a classification result of the target medical instrument data to be classified.
The beneficial effects of the above design scheme are: classifying the medical instrument data to be classified based on the optimal medical instrument classification model achieves efficient and accurate classification of medical instrument data and provides convenience for medical instrument users.
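The format conversion of Example 8 and the marking extraction of Example 9 might look like the following Python sketch. It assumes, purely for illustration, that the model accepts a short text prompt and returns a JSON string carrying the marking fields; the prompt layout, the key names and the `model` callable are hypothetical and are not defined by the patent.

```python
import json

# Hypothetical names for the marking fields carried in the model output.
MARKING_KEYS = ("primary_class", "secondary_class", "usage")


def to_model_input(record):
    """Convert a raw record into the (assumed) text format the model expects."""
    name = record["name"].strip()
    description = record.get("description", "").strip()
    return f"Device name: {name}\nDescription: {description}"


def extract_marking(model_output):
    """Pick the marking fields out of a JSON-formatted model output."""
    parsed = json.loads(model_output)
    return {key: parsed.get(key) for key in MARKING_KEYS}


def classify_record(record, model):
    """Format conversion, model call, then classification from the marking."""
    return extract_marking(model(to_model_input(record)))
```

Any callable that maps a prompt string to such a JSON string can be passed as `model`, which keeps the sketch independent of a particular LLM interface.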
Example 10:
This embodiment of the invention provides an LLM-based medical instrument data classification system which, as shown in FIG. 3, comprises:
a data processing module, configured to sort the medical instrument data based on the medical instrument standard definition information to obtain standard medical instrument data, and to perform format conversion on the standard medical instrument data based on the training data format to obtain medical instrument training data;
a model training module, configured to train, based on the LLM model in combination with the medical instrument training data, to obtain a medical instrument classification model;
a model evaluation module, configured to evaluate the medical instrument classification model based on the new instrument noun data to obtain an optimal medical instrument classification model;
a data classification module, configured to classify the medical instrument data to be classified based on the optimal medical instrument classification model.
In this embodiment, the data format of the medical instrument training data satisfies the input data requirements of the LLM model.
In this embodiment, the LLM model is a large language model.
In this embodiment, evaluating the medical instrument classification model to obtain the optimal medical instrument classification model specifically means that evaluation continues until the evaluation result of the medical instrument classification model meets the classification requirement, and the model finally determined is taken as the optimal medical instrument classification model.
In this embodiment, classifying the medical instrument data to be classified specifically means determining information such as the primary classification, the secondary classification, and the intended use of the medical instrument.
The beneficial effects of the above design scheme are: first, the medical instrument data is sorted based on the medical instrument standard definition information to obtain standard medical instrument data, and the standard medical instrument data is format-converted based on the training data format to obtain medical instrument training data, which provides an accurate data basis for model training. Next, a medical instrument classification model is trained based on the LLM model in combination with the medical instrument training data; the LLM model performs semantic language learning on the medical instrument training data, which ensures the classification accuracy and classification efficiency of the trained model. Then, the medical instrument classification model is verified and evaluated using the new instrument noun data to obtain the optimal medical instrument classification model, which further ensures the accuracy of model classification. Finally, the medical instrument data to be classified is classified based on the optimal medical instrument classification model, achieving efficient and accurate classification of medical instrument data and providing convenience for medical instrument users.
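The way the four modules hand data to one another can be sketched in Python as below. This is only an illustration of the data flow: the class name `ClassificationSystem`, the method `run`, and the callable signatures are assumptions, and the internals of each module are left to the embodiments above.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class ClassificationSystem:
    """Wires the four modules together; each module is a plain callable."""
    process_data: Callable[[Any], Any]         # data processing module
    train_model: Callable[[Any], Any]          # model training module
    evaluate_model: Callable[[Any, Any], Any]  # model evaluation module
    classify_data: Callable[[Any, Any], Any]   # data classification module

    def run(self, raw_data, new_nouns, to_classify):
        training_data = self.process_data(raw_data)         # sort + convert
        model = self.train_model(training_data)             # train
        best_model = self.evaluate_model(model, new_nouns)  # evaluate
        return self.classify_data(best_model, to_classify)  # classify
```

Passing the modules in as callables keeps each one replaceable, for example to swap the evaluation strategy without touching the others.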
It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the present application and the equivalent techniques, the present invention is intended to include such modifications and variations.

Claims (10)

1. An LLM-based medical instrument data classification method, comprising:
S1: based on medical instrument standard definition information, sorting medical instrument data to obtain standard medical instrument data, and based on a training data format, performing format conversion on the standard medical instrument data to obtain medical instrument training data;
S2: based on the LLM model, in combination with the medical instrument training data, training to obtain a medical instrument classification model;
S3: based on new instrument noun data, evaluating the medical instrument classification model to obtain an optimal medical instrument classification model;
S4: classifying the medical instrument data to be classified based on the optimal medical instrument classification model.
2. The LLM-based medical instrument data classification method according to claim 1, wherein in S1, sorting the medical instrument data based on the medical instrument standard definition information to obtain standard medical instrument data comprises:
acquiring a primary classification definition and a secondary classification definition of the medical instrument from the medical instrument standard definition information, classifying the medical instrument data based on the primary classification definition and the secondary classification definition, and determining a primary class name and a secondary class name of the medical instrument;
determining product description information and use information of the medical instrument based on the primary classification definition and the secondary classification definition;
marking the matched medical instrument based on the primary class name, the secondary class name, the product description information and the use information to obtain medical instrument marking information;
assigning serial numbers and directory codes to the medical instrument marking information, and sorting the medical instrument marking information based on the serial numbers and the directory codes to obtain standard medical instrument data.
3. The LLM-based medical instrument data classification method of claim 1, wherein in S1, performing format conversion on the standard medical instrument data based on the training data format to obtain medical instrument training data comprises:
acquiring an initial data format of the standard medical instrument data, and acquiring, from data conversion rules, a target conversion rule from the initial data format to the training data format;
configuring a corresponding format conversion template for the standard medical instrument data based on the target conversion rule, and performing format conversion on the standard medical instrument data based on the format conversion template to obtain medical instrument training data.
4. The LLM-based medical instrument data classification method of claim 1, wherein in S2, training to obtain a medical instrument classification model based on the LLM model in combination with the medical instrument training data comprises:
classifying the medical instrument training data based on data category to obtain a plurality of groups of category training data, and classifying the medical instrument training data based on all definition information of each medical instrument to obtain a plurality of groups of definition training data;
determining a first training ending condition for each group of category training data based on the category characteristics of the data category, and training the initial LLM model on each group of category training data in combination with the corresponding first training ending condition to obtain a first LLM model corresponding to each group of category training data;
determining a second training ending condition for each group of definition training data based on the instrument name of the medical instrument, and training the initial LLM model on each group of definition training data in combination with the corresponding second training ending condition to obtain a second LLM model corresponding to each group of definition training data;
obtaining a first language analysis text of the medical instrument training data based on the first LLM model, and obtaining a second language analysis text of the medical instrument training data based on the second LLM model;
performing keyword retrieval on the first language analysis text based on the category characteristics to obtain first keywords, and performing keyword retrieval on the second language analysis text based on the definition characteristics to obtain second keywords;
obtaining a first key text from the first language analysis text based on the first keywords, obtaining a second key text from the second language analysis text based on the second keywords, and obtaining a target training text based on the first key text and the second key text;
obtaining the semantic correspondence between the target training text and the medical instrument training data, and training the initial classification model based on the target training text, in combination with the semantic correspondence, to obtain the medical instrument classification model.
5. The LLM-based medical instrument data classification method of claim 4, wherein training the initial classification model based on the target training text, in combination with the semantic correspondence, to obtain the medical instrument classification model comprises:
inputting the target training text into the initial classification model to obtain a first classification result;
inputting the medical instrument training data into the initial classification model to obtain a second classification result;
evaluating the second classification result based on the first classification result and the semantic correspondence until the evaluation result meets the preset classification requirement, thereby obtaining the medical instrument classification model.
6. The LLM-based medical instrument data classification method of claim 1, wherein in S3, evaluating the medical instrument classification model based on the new instrument noun data to obtain an optimal medical instrument classification model comprises:
performing word segmentation on the new instrument noun data, and setting the recognition complexity of the new instrument noun data based on the number of segmented words and the word segmentation characteristics of the new instrument noun data;
dividing the new instrument noun data according to recognition complexity to obtain a plurality of new instrument noun data sets, each corresponding to a recognition complexity range;
inputting the new instrument noun data sets into the medical instrument classification model respectively, and determining the classification accuracy of each new instrument noun data set;
determining a change curve of the classification accuracy of the new instrument noun data sets in order of recognition complexity range from small to large, and determining, based on the change curve, the recognition accuracy of the feature extraction network of the medical instrument classification model for each recognition complexity range and the generalization capability of the medical instrument classification model;
determining network structure adjustment features and network node weight adjustment features of the feature extraction network based on the accuracy difference between the recognition accuracy of the feature extraction network for each recognition complexity range and a preset accuracy;
determining model complexity adjustment features of the medical instrument classification model based on the generalization difference between the generalization capability of the medical instrument classification model and a preset generalization capability;
adjusting the medical instrument classification model based on the network structure adjustment features, the network node weight adjustment features and the model complexity adjustment features to obtain an optimal medical instrument classification model.
7. The LLM-based medical instrument data classification method of claim 6, wherein adjusting the medical instrument classification model based on the network structure adjustment features, the network node weight adjustment features and the model complexity adjustment features to obtain an optimal medical instrument classification model comprises:
adjusting the medical instrument classification model based on the network structure adjustment features, the network node weight adjustment features and the model complexity adjustment features to obtain an adjusted medical instrument classification model;
re-evaluating the adjusted medical instrument classification model until the evaluation result meets the classification requirement, thereby obtaining the final optimal medical instrument classification model.
8. The LLM-based medical instrument data classification method of claim 1, wherein in S4, classifying the medical instrument data to be classified based on the optimal medical instrument classification model comprises:
converting the data format of the medical instrument data to be classified to obtain target medical instrument data to be classified;
inputting the target medical instrument data to be classified into an optimal medical instrument classification model to obtain a classification result of the medical instrument data to be classified.
9. The LLM-based medical instrument data classification method of claim 8, wherein inputting the target medical instrument data to be classified into the optimal medical instrument classification model to obtain a classification result of the medical instrument data to be classified comprises:
inputting the target medical instrument data to be classified into an optimal medical instrument classification model, and obtaining an output marking result of the target medical instrument data to be classified from the model output result;
classifying the target medical instrument data to be classified based on the output marking result to obtain a classification result of the target medical instrument data to be classified.
10. An LLM-based medical instrument data classification system, comprising:
a data processing module, configured to sort the medical instrument data based on the medical instrument standard definition information to obtain standard medical instrument data, and to perform format conversion on the standard medical instrument data based on the training data format to obtain medical instrument training data;
a model training module, configured to train, based on the LLM model in combination with the medical instrument training data, to obtain a medical instrument classification model;
a model evaluation module, configured to evaluate the medical instrument classification model based on the new instrument noun data to obtain an optimal medical instrument classification model;
a data classification module, configured to classify the medical instrument data to be classified based on the optimal medical instrument classification model.
CN202410022789.4A 2024-01-08 2024-01-08 LLM-based medical instrument data classification method and system Active CN117828087B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410022789.4A CN117828087B (en) 2024-01-08 2024-01-08 LLM-based medical instrument data classification method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410022789.4A CN117828087B (en) 2024-01-08 2024-01-08 LLM-based medical instrument data classification method and system

Publications (2)

Publication Number Publication Date
CN117828087A true CN117828087A (en) 2024-04-05
CN117828087B CN117828087B (en) 2024-07-09

Family

ID=90505753

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410022789.4A Active CN117828087B (en) 2024-01-08 2024-01-08 LLM-based medical instrument data classification method and system

Country Status (1)

Country Link
CN (1) CN117828087B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140278353A1 (en) * 2013-03-13 2014-09-18 Crimson Hexagon, Inc. Systems and Methods for Language Classification
CN114357108A (en) * 2021-11-25 2022-04-15 达而观数据(成都)有限公司 Medical text classification method based on semantic template and language model
CN116450829A (en) * 2023-05-06 2023-07-18 平安科技(深圳)有限公司 Medical text classification method, device, equipment and medium
CN116796857A (en) * 2023-06-30 2023-09-22 平安科技(深圳)有限公司 LLM model training method, device, equipment and storage medium thereof
CN117151088A (en) * 2023-09-21 2023-12-01 腾讯科技(深圳)有限公司 Text processing method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN117828087B (en) 2024-07-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant