CN117574962B

CN117574962B - Semiconductor chip detection method and device based on transfer learning and storage medium

Info

Publication number: CN117574962B
Application number: CN202311314162.8A
Authority: CN
Inventors: 张贵阳; 陈涛; 曹葵康
Original assignee: Tztek Technology Co Ltd
Current assignee: Tztek Technology Co Ltd
Priority date: 2023-10-11
Filing date: 2023-10-11
Publication date: 2024-06-25
Anticipated expiration: 2043-10-11
Also published as: CN117574962A

Abstract

The invention provides a semiconductor chip detection method based on transfer learning. The defect detection method provided by the invention fuses the advantages of the first network model and the second network model based on transfer learning; in particular, the advantages of convolutional neural networks and attention mechanisms are combined; and based on transfer learning, the weight parameters of the first network model and the second network model are learned first, and the weight parameters of the network are fine-tuned after the transfer learning is carried out on the models, so that the identification and extraction functions of the chip defects can be effectively completed under the condition of a small sample.

Description

Semiconductor chip detection method and device based on transfer learning and storage medium

Technical Field

The present invention relates to the field of computer processing technologies, and in particular, to a method and apparatus for detecting a semiconductor chip based on transfer learning, and a storage medium.

Background

Visual inspection is non-contact and highly precise, and is widely used for product defect detection. Machine vision inspection plays an increasingly important role in the manufacture of products such as automobiles, household appliances, electronic components and semiconductors, as well as in equipment installation.

Semiconductor chips are now used in many fields; they have become a lifeline of economic development and national information security and are of great significance to national development. However, in the semiconductor chip packaging process, problems such as contamination, abnormal dispensing and packaging defects are unavoidable and directly affect the function and efficiency of the integrated circuit. A semiconductor chip production line must therefore identify and remove defective chips in time in order to improve product quality and production efficiency. Quality inspection thus plays a significant role in semiconductor chip package manufacturing.

Traditional machine-vision-based semiconductor chip defect detection generally relies on digital image processing and hand-crafted feature extraction operators. It can detect and locate defects to a certain extent, but suffers from a low detection rate and unstable detection results. Traditional machine vision algorithms have difficulty modeling and transferring defect features; feature extraction operators must be designed and verified manually, which involves a heavy workload and high time cost and makes rapid, accurate product defect detection difficult. Meanwhile, because chip production and processing flows are complex and demand extremely high precision, rapid and accurate detection of chip product defects has become one of the key factors limiting improvements in chip yield and quality for the enterprises concerned. Because defects are varied, and the types and number of packaged chip defect samples available in a real industrial environment are limited, the chip image data sets of most industrial products suffer from serious class imbalance, which restricts the application of machine vision to chip defect detection.

Disclosure of Invention

The embodiment of the invention provides a semiconductor chip detection method, a device and a storage medium based on transfer learning. Based on the idea of transfer learning, it improves and fuses neural network models that currently perform well, alleviates the "small sample" problem in IC chip defect detection, extracts multidimensional complex feature information, and thereby improves the performance of the semiconductor chip defect detection algorithm. The technical scheme is as follows:

In one aspect, a method for detecting a semiconductor chip based on transfer learning is provided, the method includes constructing a packaged chip defect detection neural network model and training the packaged chip defect detection neural network model;

the constructing the packaged chip defect detection neural network model comprises the following steps:

an initial first network model and an initial second network model are constructed,

Migrating the initial first network model based on the large-scale data set to obtain a migrated first network model; migrating the initial second network model based on the large-scale data set to obtain a migrated second network model;

Constructing a packaged chip defect detection neural network model by using the migration first network model and the migration second network model;

the training of the packaged chip defect detection neural network model comprises the following steps:

Dividing the chip defect data set into a training set and a validation set according to a certain proportion, and training the defect detection neural network model to obtain the trained packaged chip defect detection neural network model.

Further, the method further comprises the following steps:

Training an initial first network model and an initial second network model by using a large-scale data set to acquire a pre-trained weight parameter;

and adjusting the weight parameters of the pre-training when training the defect detection neural network model.

Further, the migrating the initial first network model based on the large-scale data set to obtain a migrated first network model includes:

Training the initial first network model based on a large-scale data set, and retaining the first n structural layers of the first network model; migrating the first n structural layers to form a migrated first network model;

when training the packaged chip defect detection neural network model, the last m-n layers following the migrated layers of the first network model need to be retrained.

Further, the migrating the initial first network model based on the large-scale data set to obtain a migrated first network model, and further includes:

Training the initial first network model based on a large-scale data set, and retaining the first n structural layers of the first network model; the first n structural layers are divided into a plurality of network blocks, residual structures are added in the network blocks, and a convolution layer for changing the dimension of the network output data is added as the last layer to form the migrated first network model.

Further, when the migrating first network model and migrating second network model are used for constructing the packaged chip defect detection neural network model, the method further comprises:

The input of the migration first network model is a three-dimensional image, and the output is a feature map vector; the feature map vector comprises low-dimensional features of a chip defect picture;

The input of the migration second network model comprises the output of the migration first network model and the picture to be detected, and the output is the classification or grade of the chip defect picture.

Further, the first network model is a Darknet-19 convolutional neural network; the second network model is a ViT neural network.

Further, the method further comprises the following steps: and detecting the defects of the target packaged chip by using the packaged chip defect detection neural network model.

Further, when the initial first network model and the initial second network model are migrated based on the large-scale data set to obtain the migrated first network model and the migrated second network model,

let Ω_s and Ω_t be the pre-training data set and the new data set, i.e. the source domain and the target domain, respectively. Each data set is composed of picture data and class labels, expressed as:

Ω_s = {D_s, L_s}, Ω_t = {D_t, L_t}, Ψ_s ≠ Ψ_t

where D is the picture data, L is the class label, Ψ represents the data probability distribution model, the subscripts s and t denote the source domain and the target domain respectively, and the probability distributions of the source domain and the target domain are different;

the learning process of the source domain and the target domain by using the pre-training network is as follows:

O_s = A(D_s; W_s), O_t = A(D_t; W_t)

where A is the pre-training network model, O is the network model output, and W is the weight parameter. The pre-training process minimizes a loss function Γ between the network model output and the true classification result of the data set;

In the process of transfer learning, a part of the structure of the pre-trained network model is transferred into a new network structure, which is expressed as:

W_t{l₁~l₂} = W_s{l₁~l₂}

where l₁ and l₂ are the start layer and the end layer of the migrated portion, respectively.
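The layer migration W_t{l₁~l₂} = W_s{l₁~l₂} can be illustrated with a minimal sketch. It assumes, purely for illustration, that weights are stored as a dictionary keyed by layer index; the function name `migrate_layers` and the toy weight values are hypothetical and do not come from the patent.

```python
import copy


def migrate_layers(source_weights, target_weights, l1, l2):
    """Return a copy of target_weights whose layers l1..l2 (inclusive)
    are replaced by the corresponding layers of source_weights."""
    migrated = copy.deepcopy(target_weights)
    for layer in range(l1, l2 + 1):
        # W_t{layer} = W_s{layer} for every layer in the migrated portion
        migrated[layer] = copy.deepcopy(source_weights[layer])
    return migrated


# Hypothetical toy weights: layer index -> list of parameters.
W_s = {1: [0.1, 0.2], 2: [0.3], 3: [0.4, 0.5], 4: [0.6]}
W_t = {1: [0.0, 0.0], 2: [0.0], 3: [0.0, 0.0], 4: [0.0], 5: [0.0]}

# Migrate layers 1..3 from the source network into the target network.
W_t_migrated = migrate_layers(W_s, W_t, l1=1, l2=3)
```

Layers outside l₁~l₂ keep their target-network values and are the ones that would be redesigned and trained on the chip defect data.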

On the other hand, the invention also provides a semiconductor chip detection device based on transfer learning, which comprises:

the construction module is used for constructing a neural network model for detecting the defects of the packaged chip;

the training module is used for training the neural network model for detecting the defects of the packaged chip.

In another aspect, the present invention also provides a storage medium having stored thereon a computer program which, when read and executed by a processor, implements the transfer learning-based semiconductor chip detection method as set forth in any one of the above.

The invention has the beneficial effects that:

The defect detection method for semiconductor IC packaged chips provided by the invention fuses, on the basis of transfer learning, the advantages of the first network model and the second network model; in particular, it combines the advantages of a convolutional neural network and an attention mechanism. First, the Darknet convolutional neural network is used to extract low-dimensional data features, effectively identifying features such as contours and corner points in the packaged chip picture; then the ViT neural network, which is better at capturing global and high-dimensional features in the picture, is applied. Combining the two greatly improves the detection and identification of packaged chip defects.

According to the semiconductor chip detection method based on the transfer learning, the weight parameters of the first network model and the second network model are learned firstly based on the transfer learning, and the weight parameters of the network are retrained after the transfer learning is carried out on the models, so that the identification and extraction functions of the chip defects can be effectively completed under the condition of a small sample.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, a brief description will be given below of the drawings required for the embodiments or the prior art descriptions, and it is obvious that the drawings in the following description are some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a schematic diagram of a model architecture of a semiconductor chip inspection method based on transfer learning according to an embodiment of the present invention;

fig. 2 is a schematic diagram of a first network model in a semiconductor chip detection method based on transfer learning according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of a second network model in a method for detecting a semiconductor chip based on transfer learning according to an embodiment of the present invention;

FIG. 4 is a flow chart of a neural network model for detecting defects of a packaged chip in a semiconductor chip detection method based on transfer learning according to an embodiment of the present invention;

Fig. 5 is a schematic flow chart of a semiconductor chip detection method based on transfer learning according to an embodiment of the invention;

FIG. 6 is a schematic diagram of a semiconductor chip inspection apparatus based on transfer learning according to an embodiment of the present invention;

FIG. 7 is a comparison of the defect detection experiment of the semiconductor chip of the present invention;

FIG. 8 is a schematic diagram of the operation of an electronic device of the present invention;

fig. 9 shows a schematic diagram of a storage medium according to an embodiment of the present disclosure.

Detailed Description

For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

The drawings are merely schematic illustrations of the present invention and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus a repetitive description thereof will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in software or in one or more hardware forwarding modules or integrated circuits or in different networks and/or processor devices and/or microcontroller devices.

Furthermore, the flow shown in the drawings is merely illustrative and not necessarily all steps are included. For example, some steps may be decomposed, some steps may be combined or partially combined, and the order of actual execution may be changed according to actual situations. The use of the terms "first," "second," and the like in the description herein do not denote any order, quantity, or importance, but rather are used to distinguish one element from another. It should be noted that, without conflict, the embodiments of the present invention and features in different embodiments may be combined with each other.

With the continuous development of deep learning technology in recent years, transfer learning algorithms can effectively extract deep feature information from small-sample data by reusing models and weight parameters that perform well on large-scale data sets.

Based on the idea of transfer learning, the invention improves and fuses neural network models that currently perform well, alleviates the "small sample" problem in IC chip defect detection, extracts multidimensional complex feature information, and thereby improves the performance of the semiconductor chip defect detection algorithm.

In order to solve the above-mentioned problems, referring to fig. 1 to 5, in one aspect, there is provided a semiconductor chip detection method based on transfer learning, the method comprising:

S1, constructing a neural network model for detecting defects of a packaged chip; and S2, training a neural network model for detecting the defects of the packaged chip.

In addition, the method can further comprise S3, and the semiconductor chip is detected by using a packaged chip defect detection neural network model.

Specifically, the S1 construction of the packaged chip defect detection neural network model includes:

S11, constructing an initial first network model and an initial second network model.

S12, migrating the initial first network model based on the large-scale data set to obtain a migrated first network model; and migrating the initial second network model based on the large-scale data set to obtain a migrated second network model.

S13, constructing the encapsulated chip defect detection neural network model by using the migration first network model and the migration second network model.

Specifically, S2, training the packaged chip defect detection neural network model, includes:

S21, dividing the chip defect data set into a training set and a validation set according to a certain proportion, and training the defect detection neural network model to obtain a trained packaged chip defect detection neural network model.

It is understood that the initial first network model and the initial second network model may be neural network models that have been pre-trained on large data sets and perform well. For example, the first network model is a Darknet-19 convolutional neural network and the second network model is a ViT neural network. Both are network models pre-trained on the ImageNet data set.
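The split in S21 can be sketched as follows. This is a minimal illustration that assumes a simple random shuffle, since the text specifies only "a certain proportion"; the 1500-sample size and the 8:2 ratio are taken from the experiment described later in this document, while the function name `split_dataset`, the seed, and the file names are hypothetical.

```python
import random


def split_dataset(samples, train_ratio=0.8, seed=0):
    """Shuffle the samples and split them into (training, validation) lists."""
    shuffled = list(samples)
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_train = int(round(len(shuffled) * train_ratio))
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical file names standing in for the labelled chip defect pictures.
samples = [f"chip_{i:04d}.png" for i in range(1500)]
train_set, val_set = split_dataset(samples, train_ratio=0.8)
print(len(train_set), len(val_set))  # 1200 300
```

With imbalanced defect classes, a per-class (stratified) split may be preferable; that variant is not described in the source and is mentioned here only as a design consideration.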

Darknet-19 has a small structure and few parameters, and may have certain shortcomings relative to large-scale neural network models. Defects of semiconductor packaged chips include scratches, stains, breakage and the like. On the one hand, these mainly consist of low-dimensional geometric features such as contours and corner points; on the other hand, because some defects are geometrically small and highly similar to the background, they are difficult to separate from the background directly, and their global and high-dimensional features need to be mined. The Darknet-19 convolutional neural network is good at extracting low-dimensional data features and effectively identifies features such as contours and corner points in the packaged chip picture. In addition, networks of the same type, such as Darknet-53 or YOLO-series networks (for example v4 and v5), can be used according to the actual situation.

Given these strengths and weaknesses of Darknet-19, the method relies on the ViT neural network, which is better at capturing global and high-dimensional features in the picture. Combining the low-dimensional features with the global and high-dimensional features greatly improves the efficiency of detecting and identifying packaged chip defects.

Step S12 includes the following.

A part of a well-performing neural network that has been pre-trained on a large data set is used as the feature extractor for the new task; on this basis, the structure is designed and trained on the small-sample data set so that small-sample data features are effectively identified. Assume the first n layers of a well-performing neural network are migrated to a new neural network together with their pre-trained weight parameters. If the new neural network is denoted G and has m layers in total, the last m-n layers of the whole network need to be redesigned and trained. The network structure model for transfer learning therefore consists of the n migrated layers followed by G',

where G' is the newly designed network model.

Preferably, the initial first network model is trained based on a large-scale data set, and the first n structural layers of the first network model are retained; the first n structural layers are divided into a plurality of network blocks, residual structures are added in the network blocks, and a convolution layer for changing the dimension of the network output data is added as the last layer to form the migrated first network model.

Specifically, the original Darknet-19 network mainly comprises 19 convolution layers, 5 max-pooling layers and other structures. It has strong feature extraction capability and the advantages of a simple structure, few parameters and high operating efficiency. For chip defect detection, the invention retains the first 13 convolution layers and the 5 max-pooling layers of the Darknet-19 network, divides them into 5 network blocks, and adds residual structures in the last 4 network blocks. The residual structure enhances gradient propagation, simplifies the learning process and improves the generalization capability of the network. A 1 × 1 convolution layer is added at the end of the network to change the dimension of the network output data, so that the network can be fused more easily with the ViT neural network, as shown in Fig. 2.
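How a 1 × 1 convolution changes the output dimension can be shown with a small sketch. It treats the 1 × 1 convolution as a per-position linear map across channels and omits the bias term; the function name `conv1x1`, the channel-first list layout, and the toy numbers are assumptions made for illustration and are not specified in the patent.

```python
def conv1x1(feature_map, weights):
    """Apply a 1x1 convolution without bias.

    feature_map: nested lists shaped [C_in][H][W]
    weights:     nested lists shaped [C_out][C_in]
    returns:     nested lists shaped [C_out][H][W]
    """
    c_in = len(feature_map)
    height = len(feature_map[0])
    width = len(feature_map[0][0])
    output = []
    for kernel in weights:
        # Each output channel is a weighted sum of the input channels,
        # computed independently at every spatial position.
        channel = [
            [sum(kernel[c] * feature_map[c][y][x] for c in range(c_in))
             for x in range(width)]
            for y in range(height)
        ]
        output.append(channel)
    return output


# Toy example: project 3 input channels of size 2x2 down to 2 channels.
fmap = [[[1, 2], [3, 4]],
        [[10, 20], [30, 40]],
        [[100, 200], [300, 400]]]
weights = [[1, 1, 1], [1, 0, -1]]
projected = conv1x1(fmap, weights)
```

The spatial size is unchanged; only the number of channels changes, which is why such a layer is convenient for matching the tensor shape expected by a downstream network.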

The residual structure comprises several network layers whose input and output are linked by a skip connection, so that the neural network learns more effectively and network performance improves. Assuming that the input of a network layer is ξ and its output is F(ξ), the output R(ξ) of the residual structure is:

R(ξ) = F(ξ) + ξ

After the residual structure is added, the output of the network layer is F(ξ) = R(ξ) - ξ; that is, instead of learning the original mapping of the input signal, the layer learns the difference (residual) between the output and the input.

The input of the improved Darknet-19 network is a three-dimensional image of size 224 × 224, and the output is a feature map tensor of size 16 × 196, which mainly contains low-dimensional features such as the contours and corner points of the packaged chip defect picture.
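The relation R(ξ) = F(ξ) + ξ can be checked numerically with a short sketch. The layer function `toy_layer` is a hypothetical stand-in for F and simply halves its input; a real residual block would wrap convolution layers instead.

```python
def residual_block(layer_fn, xi):
    """Compute R(xi) = F(xi) + xi element-wise via a skip connection."""
    f_out = layer_fn(xi)
    if len(f_out) != len(xi):
        raise ValueError("F(xi) must have the same shape as xi for the skip connection")
    return [f + x for f, x in zip(f_out, xi)]


def toy_layer(xi):
    # Hypothetical F: scales every element by 0.5.
    return [0.5 * value for value in xi]


xi = [2.0, -4.0, 6.0]
r = residual_block(toy_layer, xi)                 # R(xi) = [3.0, -6.0, 9.0]
recovered_f = [ri - x for ri, x in zip(r, xi)]    # F(xi) = R(xi) - xi = [1.0, -2.0, 3.0]
```

Because the input is added back unchanged, the wrapped layers only need to model the difference between the desired output and the input.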

When the packaged chip defect detection neural network model is constructed in step S13,

the input of the migrated first network model is a three-dimensional image, and its output is a feature map vector; the feature map vector comprises low-dimensional features of the chip defect picture.

The ViT neural network structure of the invention is shown in Fig. 3 and comprises 4 encoders and 4 decoders. The input of the encoders consists of two parts. The first is the input picture to be detected, for which the ViT neural network's strong ability to learn global and high-dimensional features is used to obtain the global and high-dimensional features of the packaged chip defects. The second is the image features output by the migrated Darknet-19 network, which mainly comprise low-dimensional features such as the contours and corner points of the packaged chip defect image samples. Learning from both parts of the data together improves defect detection precision and efficiency. Both parts are tensors of size 16 × 196, and splicing them forms input data of size 16 × 392. The data stitching process concatenates DN and DP,

where DN and DP are the output of the modified Darknet-19 network and the picture slice data, respectively. In the ViT network the input data is processed by embedding, a multi-head attention mechanism, fully connected layers and the like, and the detection result is finally obtained.
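The stitching of DN and DP can be sketched as a concatenation along the second axis, which is the only axis consistent with two 16 × 196 tensors producing a 16 × 392 result. The ordering (DN first, then DP), the function name `concat_features`, and the placeholder values are assumptions for illustration.

```python
def concat_features(dn, dp):
    """Concatenate two 2-D feature tensors row by row along the second axis."""
    if len(dn) != len(dp):
        raise ValueError("both tensors must have the same number of rows")
    return [row_n + row_p for row_n, row_p in zip(dn, dp)]


# Placeholder tensors with the sizes given in the text (16 x 196 each).
DN = [[0.0] * 196 for _ in range(16)]   # stand-in for the Darknet-19 output
DP = [[1.0] * 196 for _ in range(16)]   # stand-in for the picture slice data

fused = concat_features(DN, DP)
print(len(fused), len(fused[0]))  # 16 392
```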

The ViT neural network architecture brings the attention mechanism into the field of machine vision. It divides an input picture into slices of size 16 × 16 and uses the attention mechanism to capture relationships between slices over a wider range, thereby enhancing feature extraction capability.

Further, the method includes: in the construction step S1, training the initial first network model and the initial second network model with a large-scale data set to obtain pre-trained weight parameters;

and in the training step S2, that is, when training the defect detection neural network model, adjusting the pre-trained weight parameters.
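The slicing step can be illustrated with a minimal sketch that cuts a 224 × 224 picture into non-overlapping 16 × 16 slices, giving 14 × 14 = 196 slices. For brevity the sketch assumes a single-channel image stored as nested lists; the function name `split_into_patches` is hypothetical.

```python
def split_into_patches(image, patch=16):
    """Split a 2-D image (list of rows) into non-overlapping patch x patch slices,
    ordered left to right, then top to bottom."""
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image size must be a multiple of the patch size")
    patches = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            patches.append([row[left:left + patch]
                            for row in image[top:top + patch]])
    return patches


# Synthetic 224 x 224 image whose pixel value encodes its position.
image = [[y * 224 + x for x in range(224)] for y in range(224)]
patches = split_into_patches(image)
print(len(patches))  # 196
```

Each slice is then embedded and passed to the attention layers, which relate slices to one another regardless of their distance in the picture.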

Specifically, migrating the initial first network model based on the large-scale data set to obtain the migrated first network model, and migrating the initial second network model based on the large-scale data set to obtain the migrated second network model, proceeds as follows. The source domain and target domain data sets are:

Ω_s = {D_s, L_s}, Ω_t = {D_t, L_t}, Ψ_s ≠ Ψ_t

The learning process of the pre-training network on the source domain and the target domain is:

O_s = A(D_s; W_s), O_t = A(D_t; W_t)

The part of the pre-trained network structure that is migrated into the new network structure satisfies:

W_t{l₁~l₂} = W_s{l₁~l₂}

The above is the model transfer learning process, that is, the cross-scene network model migration process.

Step S3 specifically comprises: detecting the defects of the target packaged chip by using the packaged chip defect detection neural network model. The detection result output by the packaged chip defect detection neural network model is the classification or grade of the chip defect picture.

To verify the effectiveness of the method, an industrial microscope camera was used to collect and label 1500 defect pictures of semiconductor IC packaged chips, forming a packaged chip defect data set with which the method and its transfer learning effect were tested. In the experiment, the chip defect data set was divided into a training set and a validation set at a ratio of 8:2. The results obtained are shown in Fig. 9, and metrics such as accuracy and recall are given in Table 1. All metrics are better than those obtained using the Darknet-19 or ViT neural network alone, indicating that the method identifies and extracts chip defects more effectively under small-sample conditions. In addition, see Fig. 7 for a comparison of accuracy in the chip defect detection experiment.

Table 1. Comparison of chip defect detection algorithm performance

Therefore, the defect detection method for semiconductor IC packaged chips provided by the invention fuses, on the basis of transfer learning, the advantages of the first network model and the second network model; in particular, it combines the advantages of a convolutional neural network and an attention mechanism. First, the Darknet convolutional neural network is used to extract low-dimensional data features, effectively identifying features such as contours and corner points in the packaged chip picture; then the ViT neural network, which is better at capturing global and high-dimensional features in the picture, is applied. Combining the two greatly improves the detection and identification of packaged chip defects.

According to the semiconductor chip detection method based on the transfer learning, the weight parameters of the first network model and the second network model are learned firstly based on the transfer learning, and the weight parameters of the network are retrained after the transfer learning is carried out on the models, so that the identification and extraction functions of the chip defects can be effectively completed under the condition of a small sample, and the scene requirements of the chip defect detection can be met.

In another aspect, referring to Fig. 6, the invention also provides a semiconductor chip detection device based on transfer learning, including a construction module for constructing the packaged chip defect detection neural network model and a training module for training the packaged chip defect detection neural network model.

The semiconductor chip detection device based on transfer learning corresponds to the semiconductor chip detection method based on transfer learning described above, and is not described again here.

The embodiment of the invention also provides an electronic device comprising a processor and a memory in which executable instructions of the processor are stored. The processor is configured to perform the steps of the semiconductor chip detection method based on transfer learning by executing the executable instructions.

As shown above, the electronic device according to the embodiment of the present disclosure can achieve the effects of the semiconductor chip detection method based on transfer learning described above. Those skilled in the art will appreciate that the various aspects of the invention may be implemented as a system, method, or program product. Accordingly, aspects of the invention may be embodied in the following forms: an entirely hardware embodiment, an entirely software embodiment (including firmware, micro-code, etc.), or an embodiment combining hardware and software aspects, which may be referred to herein as a "circuit," "module," or "platform."

Fig. 8 is a schematic structural view of the electronic device of the present invention. An electronic device 400 according to this embodiment of the invention is described below with reference to fig. 8. The electronic device 400 shown in fig. 8 is merely an example and should not be construed as limiting the functionality and scope of use of embodiments of the present invention.

As shown in fig. 8, the electronic device 400 is embodied in the form of a general purpose computing device. The components of electronic device 400 may include, but are not limited to: at least one processing unit 410, at least one memory unit 420, a bus 430 connecting the different platform components (including memory unit 420 and processing unit 410), a display unit 440, and the like.

Wherein the storage unit stores program code that is executable by the processing unit 410 such that the processing unit 410 performs the steps according to various exemplary embodiments of the invention described in the method section of this specification. For example, the processing unit 410 may perform the steps shown in fig. 1.

The storage unit 420 may include readable media in the form of volatile storage units, such as Random Access Memory (RAM) 421 and/or cache memory 422, and may further include Read Only Memory (ROM) 423.

The storage unit 420 may also include a program/utility 424 having a set (at least one) of program modules 425, such program modules 425 including, but not limited to: processing systems, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment.

Bus 430 may be a local bus representing one or more of several types of bus structures including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or using any of a variety of bus architectures.

Electronic device 400 may also communicate with one or more external devices 40 (e.g., keyboard, pointing device, bluetooth device, etc.), one or more devices that enable a user to interact with electronic device 400, and/or any device (e.g., router, modem, etc.) that enables electronic device 400 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 450.

Also, electronic device 400 may communicate with one or more networks such as a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the Internet, through network adapter 460. The network adapter 460 may communicate with other modules of the electronic device 400 via the bus 430. It should be appreciated that although not shown, other hardware and/or software modules may be used in connection with electronic device 400, including, but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage platforms, and the like.

The embodiment of the invention also provides a computer readable storage medium for storing a program which, when executed, implements the steps of the semiconductor chip detection method based on transfer learning. In some possible embodiments, the various aspects of the invention may also be implemented in the form of a program product comprising program code which, when the program product is run on a terminal device, causes the terminal device to carry out the steps according to the various exemplary embodiments of the invention described in the method section of this specification.

Referring to fig. 9, a program product 500 for implementing the above-described method according to an embodiment of the present disclosure is described.

A program product for implementing the above-described method according to an embodiment of the present invention may employ a portable compact disc read-only memory (CD-ROM) and include program code, and may be run on a terminal device, such as a personal computer. However, the program product of the present invention is not limited thereto, and in this document, a readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium can be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include the following: an electrical connection having one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

The computer readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electromagnetic, optical, or any suitable combination of the foregoing. A readable signal medium may also be any readable medium, other than a readable storage medium, that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.

Program code for carrying out processes of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's computing device as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user's computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., via the Internet using an Internet service provider).

In summary, the invention aims to provide a description question-answering method, system, device and storage medium based on a neural network, which use two different neural network models to further re-rank search engine results, thereby improving the accuracy of instruction manual question answering; the respective strengths and weaknesses of the two neural networks are fully exploited, and the number of neural networks is determined according to target requirements, so as to balance performance and effect. Whereas the traditional BERT model considers only text and position information, the invention adds additional information such as position information, entity information, co-occurrence information and importance information to the distributed representation of words, and, through model fine-tuning, improves training speed and training accuracy relative to the traditional BERT model. The whole method does not require a complete question-answer library to be compiled manually from the instruction manual; when the model is migrated, only the second ranking model needs to be fine-tuned or trained, without a large amount of data labeling, which reduces the manual sorting and labeling cost of conventional instruction manual question-answering systems.

The foregoing is a further detailed description of the invention in connection with the preferred embodiments, and it is not intended that the invention be limited to the specific embodiments described. It will be apparent to those skilled in the art that several simple deductions or substitutions may be made without departing from the spirit of the invention, and these should be considered to be within the scope of the invention.

Claims

1. A semiconductor chip detection method based on transfer learning, characterized by comprising: constructing a packaged chip defect detection neural network model and training the packaged chip defect detection neural network model;

constructing an initial first network model and an initial second network model, wherein the first network model is a Darknet-19 convolutional neural network, the second network model is a ViT neural network, and the Darknet-19 comprises 19 convolutional layers and 5 max pooling layers;

migrating the initial first network model based on the large-scale data set to obtain a migrated first network model, wherein the migrating comprises the following steps:

training the initial first network model based on a large-scale data set, and retaining the first n structural layers of the first network model; the first n structural layers are divided into a plurality of network blocks, a residual structure is added in the network blocks, and a convolutional layer for changing the dimension of the network output data is added as the last layer to form the migrated first network model; specifically, the first 13 convolutional layers and 5 max pooling layers of the Darknet-19 network are retained and divided into 5 network blocks, and a residual structure is added in the last 4 network blocks; when training the packaged chip defect detection neural network model, the m-n layers of the migrated first network model are retrained;

constructing the packaged chip defect detection neural network model by using the migrated first network model and the migrated second network model, comprising: the input of the migrated first network model is a three-dimensional image, and the output is a feature map vector; the feature map vector comprises low-dimensional features of a chip defect picture;

the input of the migrated second network model comprises the output of the migrated first network model and the picture to be detected, and the output is the classification or grade of the chip defect picture; the output of the migrated first network model and the picture to be detected are spliced to form input data with a size of 16 × 392.

2. The method for detecting a semiconductor chip based on transfer learning according to claim 1, further comprising:

3. The method for detecting a semiconductor chip based on transfer learning according to claim 1, further comprising: detecting the defects of a target packaged chip by using the packaged chip defect detection neural network model.

4. The method for detecting a semiconductor chip based on transfer learning according to claim 1, wherein the initial first network model is transferred based on a large-scale data set to obtain a transferred first network model; migrating the initial second network model based on the large-scale data set to obtain a migrated second network model,

Ω_s = {D_s, L_s}, Ω_t = {D_t, L_t}, Ψ_s ≠ Ψ_t

O_s = A(D_s; W_s), O_t = A(D_t; W_t)

W_t{l₁~l₂} = W_s{l₁~l₂}

5. A semiconductor chip detection device based on transfer learning, characterized in that,

the initial first network model is trained based on a large-scale data set, and the first n structural layers of the first network model are retained; the first n structural layers are divided into a plurality of network blocks, a residual structure is added in the network blocks, and a convolutional layer for changing the dimension of the network output data is added as the last layer to form the migrated first network model; specifically, the first 13 convolutional layers and 5 max pooling layers of the Darknet-19 network are retained and divided into 5 network blocks, and a residual structure is added in the last 4 network blocks;

when training the packaged chip defect detection neural network model, the m-n layers of the migrated first network model are retrained;

6. A storage medium having stored thereon a computer program which, when read and executed by a processor, implements the semiconductor chip inspection method based on transfer learning as claimed in any one of claims 1 to 4.
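
By way of illustration only, and not forming part of the claims, the layer-wise weight transfer recited in claim 4, W_t{l₁~l₂} = W_s{l₁~l₂}, together with the retaining of the first n layers and retraining of the remaining layers recited in claim 1, may be sketched as follows. This is a minimal sketch under stated assumptions: the function names `transfer_layers` and `trainable_mask` are hypothetical, weights are represented as plain Python lists rather than framework tensors, m is assumed to denote the total number of layers, and "the m-n layers" is assumed to mean the layers following the n retained ones.

```python
from copy import deepcopy


def transfer_layers(source_weights, target_weights, l1, l2):
    """Return a copy of target_weights whose layers l1..l2 (1-indexed,
    inclusive) are taken from source_weights: W_t{l1~l2} = W_s{l1~l2}.

    Hypothetical helper: each list element stands for one layer's weights.
    """
    limit = min(len(source_weights), len(target_weights))
    if not (1 <= l1 <= l2 <= limit):
        raise ValueError("layer range l1..l2 is out of bounds")
    migrated = deepcopy(target_weights)
    for i in range(l1 - 1, l2):
        migrated[i] = deepcopy(source_weights[i])
    return migrated


def trainable_mask(m, n):
    """Mark which of m layers are retrained on the target task.

    Assumption: the first n layers keep their transferred weights (False)
    and the remaining m - n layers are retrained (True).
    """
    if not (0 <= n <= m):
        raise ValueError("n must satisfy 0 <= n <= m")
    return [index >= n for index in range(m)]


# Toy example: a 4-layer model in which layers 1..3 are transferred.
source = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]  # W_s, pretrained
target = [[0.0, 0.0] for _ in range(4)]                     # W_t, initial

migrated = transfer_layers(source, target, l1=1, l2=3)
mask = trainable_mask(m=4, n=3)

for layer, (weights, retrain) in enumerate(zip(migrated, mask), start=1):
    state = "retrained" if retrain else "kept from source"
    print(f"layer {layer}: {weights} ({state})")
```

In a deep learning framework the same idea would typically be expressed by copying the matching entries of the pretrained parameter set into the new model and disabling gradient updates for the retained layers; the list-based form above is used only so that the sketch has no external dependencies.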