CN108268539A - Video matching system based on text analyzing - Google Patents
Video matching system based on text analyzing
- Publication number
- CN108268539A (application CN201611266235.0A)
- Authority
- CN
- China
- Prior art keywords
- video
- subtitle
- index
- keyword
- module
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/71—Indexing; Data structures therefor; Storage structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/783—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/7844—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using original textual content or text extracted from visual content or transcript of audio data
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Multimedia (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Library & Information Science (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
Abstract
A video matching system based on text analysis comprises a caption analysis module, an index module and a search module. The caption analysis module extracts the text content of a subtitle file and the times at which that text appears in the video, segments the text with the jieba Chinese word segmenter, and applies the TF-IDF algorithm to the segmented text to obtain the subtitle keywords and the start and end times at which each keyword appears in the video. The index module uses a hash indexing method to establish or update a video index from the subtitle keywords and their start and end times. The search module compares the search keyword entered by the user with the subtitle keywords in the video index and returns the list of videos with the greatest similarity. The invention automates the process of building an index from subtitles, ensures the accuracy of search results, and helps the user quickly locate the time interval in a video that corresponds to a search keyword.
Description
Technical field
The present invention relates to a technology in the field of video retrieval, specifically a video matching system based on text analysis.
Background
Online courses are widely used as an educational medium in today's internet environment, and more and more people acquire knowledge through online education. Existing instructional videos tend to be short but numerous. Existing video retrieval techniques rely on course descriptions or video annotations, but a course description cannot fully reflect the knowledge points that appear in a course video, so the description and the content may not match. Automated video annotation requires extracting key frames and recognizing their content, but recognition performs poorly on instructional videos and its accuracy is low. Manual annotation depends on the annotator's familiarity with the course and ability to summarize it, and the resulting annotations likewise cannot cover all the knowledge content in a course video.
Summary of the invention
In view of the above deficiencies of the prior art, the present invention proposes a video matching system based on text analysis that can effectively ensure the accuracy of search results.
The present invention is achieved by the following technical solutions:
The present invention comprises a caption analysis module, an index module and a search module. The caption analysis module extracts the text content of a subtitle file and the times at which that text appears in the video, segments the text with the jieba word segmenter, and applies the TF-IDF algorithm to the segmented text to obtain the subtitle keywords and the start and end times at which each keyword appears in the video. The index module uses a hash indexing method to establish or update a video index from the subtitle keywords and their start and end times. The search module compares the search keyword entered by the user with the subtitle keywords in the video index and returns the list of videos with the greatest similarity.
jieba is a Chinese word segmentation component that offers three segmentation modes: precise mode, full mode and search-engine mode. The present invention uses precise mode for text analysis. jieba performs efficient word-graph scanning based on a prefix dictionary, generating a directed acyclic graph (DAG) of all the possible ways the Chinese characters in a sentence can form words. It then uses dynamic programming to find the maximum-probability path, which gives the most likely segmentation based on word frequency. For out-of-vocabulary words it uses an HMM based on the word-forming capability of Chinese characters, decoded with the Viterbi algorithm.
The TF-IDF algorithm, i.e. the term frequency-inverse document frequency algorithm, computes the TF-IDF value of a keyword from the number of times the keyword occurs in a document and its inverse document frequency, giving the weight of the keyword in the document. The more important a keyword is to a document, the larger its TF-IDF value. The present invention uses the TF-IDF algorithm to obtain the keywords in a subtitle file and their keyword values, and determines the start and end times of each keyword in the corresponding video according to its order.
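The TF-IDF weighting described above can be sketched as follows. This is a minimal illustration that assumes the common formulation tf × log(N / df); the source does not state which variant or smoothing is used, and the function names `tf_idf` and `top_keywords` are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """docs: list of token lists (one per subtitle file). Returns one {term: score} dict per document."""
    n_docs = len(docs)
    df = Counter()  # document frequency: number of documents containing each term
    for tokens in docs:
        df.update(set(tokens))
    scores = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


def top_keywords(score, k=3):
    """Return the k terms with the highest TF-IDF score."""
    return sorted(score, key=score.get, reverse=True)[:k]
```

A term that appears in every document gets a score of zero under this formulation, so words common to all subtitle files drop out of the keyword list.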
The hash indexing method uses the subtitle keyword as the index key, applies a hash operation to it, and stores the hash result together with the corresponding row pointer information in a hash table. A hash index lookup can locate an entry in a single step, which avoids multiple I/O accesses and improves search efficiency.
The present invention also relates to a matching method based on the above system, comprising the following steps:
Step 1) The caption analysis module reads the subtitle file of a video, analyzes the text content of the subtitles of the current video, and extracts the subtitle keywords.
Step 2) The set of subtitle keywords obtained is sent to the index module, which establishes a video index or updates the existing video index by training on the subtitle keywords, and stores the new index in a database.
Step 3) When the user enters a search keyword into the system, the index file is obtained and the similarity between the user's search keyword and the subtitle keywords in the video index is computed to find the set of keywords with the highest similarity. The corresponding list of videos and the times at which the search keyword appears in the videos are returned to the user.
Technique effect
Compared with the prior art, the present invention automatically builds an index of the subtitle keywords of educational videos from their subtitles. It builds a set of word vectors by training word2vec and matches search keywords against subtitle keywords by computing the cosine similarity between words, which effectively ensures the accuracy of search results. When subtitle keywords are extracted, the time interval in which each keyword appears in the video is also obtained, which helps the user quickly locate the time interval in the video that corresponds to a search keyword.
Description of the drawings
Fig. 1 is a structural diagram of the system of the present invention.
Detailed description
The experiment for the present invention was deployed on an Alibaba Cloud host with 18 cores and 16 GB of memory. After logging in to the host remotely, the binary version of word2vec was downloaded and then trained on a Chinese and English Wikipedia corpus of 3.7 million articles in total. Training took three hours and produced an 8 GB output file containing all the word vectors. Each video subtitle file was first split into several small files, and 8 processes were started simultaneously on the same machine to segment these small files in parallel and write the results to files. This step increased processing speed by 65% compared with single-process processing. Finally, the completed index was stored in the in-memory database Redis, which reduces disk I/O and improved query speed by about 20%.
As shown in Fig. 1, this embodiment comprises a caption analysis module, an index module and a search module. The caption analysis module extracts the text content of a subtitle file and the times at which that text appears in the video, segments the text with the jieba word segmenter, and applies the TF-IDF algorithm to the segmented text to obtain the subtitle keywords and the start and end times at which each keyword appears in the video. The index module uses a hash indexing method to establish or update a video index from the subtitle keywords and their start and end times. The search module compares the search keyword entered by the user with the subtitle keywords in the video index and returns the list of videos with the greatest similarity.
The caption analysis module is the foundation of the index module. A course video file comprises a video file and a subtitle file. The caption analysis module receives the subtitle file of a course video, analyzes its text content, and extracts the subtitle keywords in the text together with the time points at which they appear in the course video.
The caption analysis module uses a script to obtain the text content of the subtitle file and the corresponding times at which that text appears in the video. It then segments the text with the jieba word segmenter and applies the TF-IDF algorithm to the segmented text to obtain the subtitle keywords and the start and end times at which they appear in the video.
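The source does not name the subtitle format or show the extraction script. As a sketch only, assuming subtitles in the common SRT format, the text and its time intervals could be extracted as follows; the function names `to_seconds` and `parse_srt` are hypothetical.

```python
import re

# One SRT timestamp, e.g. 00:01:02,500 (a dot is also accepted as the millisecond separator).
TIME = r"(\d{2}):(\d{2}):(\d{2})[,.](\d{3})"
CUE = re.compile(TIME + r"\s*-->\s*" + TIME)


def to_seconds(h, m, s, ms):
    """Convert the four timestamp fields to seconds."""
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_srt(text):
    """Return a list of (start_seconds, end_seconds, caption_text) tuples."""
    cues = []
    for block in re.split(r"\n\s*\n", text.strip()):  # cues are separated by blank lines
        lines = block.splitlines()
        for idx, line in enumerate(lines):
            match = CUE.search(line)
            if match:
                g = match.groups()
                # Everything after the timing line is caption text; join wrapped lines.
                caption = " ".join(l.strip() for l in lines[idx + 1:] if l.strip())
                cues.append((to_seconds(*g[:4]), to_seconds(*g[4:]), caption))
                break
    return cues
```

Each tuple keeps the caption text next to its start and end time, so a keyword found in a caption can inherit that caption's time interval.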
The apparatus preferably further comprises a storage module, which stores the video index and the video files.
The index module is mainly used to build the video index. If no video index has yet been established in the current system, it establishes one from the subtitle keywords obtained by the caption analysis module; otherwise it updates the existing video index. The new video index is then stored in the database of the storage module to support video queries.
The index module uses a hash index. It first reads the subtitle keywords of a video file and their set of time intervals, and builds the reverse mapping from subtitle keyword to video and time interval, that is, content entries made up of a subtitle keyword, a video file and a time interval. It then computes a hash value for the subtitle keyword and writes the corresponding entry into the hash bucket. Hash collisions are resolved by combining the hash table with linked lists (chaining). Using a hash index speeds up both building and updating the index. The index is updated in place, that is, modifications are made directly to the existing index structure. When a new course video is added to the video library and its subtitle keywords have been extracted, the corresponding subtitle keyword, video file and time interval entries are added directly to the existing index structure. An in-place update can use the hash value to determine directly whether a subtitle keyword already exists in the original video index, and so decide whether to append to an existing entry or to create a new one.
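The hash index with chaining and in-place updates described above might look like the following minimal sketch. The class name `SubtitleIndex`, its methods, and the bucket count are hypothetical, and a Python list stands in for the linked list in each bucket; the source stores its index in Redis, which is not modelled here.

```python
class SubtitleIndex:
    """Hash table keyed by subtitle keyword; collisions are resolved by chaining."""

    def __init__(self, n_buckets=64):
        # Each bucket is a chain of (keyword, postings) pairs.
        self.buckets = [[] for _ in range(n_buckets)]

    def _bucket(self, keyword):
        return self.buckets[hash(keyword) % len(self.buckets)]

    def add(self, keyword, video, start, end):
        """In-place update: append to an existing entry or create a new one."""
        bucket = self._bucket(keyword)
        for key, postings in bucket:
            if key == keyword:  # keyword already indexed: append the new interval
                postings.append((video, start, end))
                return
        bucket.append((keyword, [(video, start, end)]))  # new keyword: new entry

    def lookup(self, keyword):
        """Return the list of (video, start, end) entries for a keyword."""
        for key, postings in self._bucket(keyword):
            if key == keyword:
                return postings
        return []
```

Adding a new course video only touches the buckets of its own keywords, which is why an in-place update avoids rebuilding the whole index.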
The search module generates query results for the search keyword entered by the user. Specifically, the search keyword provided by the user is matched against the subtitle keywords in the video index: the similarity between each subtitle keyword in the video index and the search keyword is computed, the list of videos with the greatest similarity is returned, and the relevant course videos are read from the database and returned to the user. The matching itself builds a set of word vectors by training word2vec and computes the cosine similarity between the search keyword and the index keywords to obtain the best match between subtitle keywords and the search keyword. word2vec constructs a two-layer neural network. The input layer reads the words in a sliding window and adds their vectors together to form the nodes of the hidden layer. The output layer is a binary tree built with the Huffman tree algorithm, and each node of the hidden layer is connected to the nodes of the binary tree by weighted edges. Given a context and a word w to be predicted, the model maximizes the probability of the binary code of w, and the parameters are solved by gradient descent. The word vector model built by this network performs very well in linguistic evaluations, and the relationship between two words is reflected directly in the difference between their vectors, for example C(king) - C(queen) ≈ C(man) - C(woman).
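The cosine-similarity matching step described above can be sketched as follows. This is an illustration under stated assumptions: the word vectors are passed in as a plain dictionary (in the source they come from a trained word2vec model), and the function names `cosine` and `best_match` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def best_match(query, index_keywords, vectors):
    """Return (keyword, similarity) for the indexed keyword closest to the query, or None."""
    if query not in vectors:
        return None  # the query word has no vector, so nothing can be compared
    scored = [
        (kw, cosine(vectors[query], vectors[kw]))
        for kw in index_keywords
        if kw in vectors
    ]
    return max(scored, key=lambda pair: pair[1]) if scored else None
```

The keyword returned by `best_match` can then be looked up in the video index to retrieve the matching videos and time intervals.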
The storage module comprises a file system and a database. It stores the video index and all course videos, so that the video index can be retrieved for later queries and the video query results can be provided during subsequent keyword matching.
When the system is running, the caption analysis module reads the subtitle file of a video, analyzes the text content of the subtitles of the current video, extracts the subtitle keywords, and sends the resulting set of subtitle keywords to the index module. The index module establishes a video index or updates the existing video index by training on the subtitle keywords, and stores the new index in the database. When the user enters a search keyword into the system, the index file is obtained and the similarity between the user's search keyword and the subtitle keywords in the video index is computed to find the set of keywords with the highest similarity. The corresponding list of videos and the times at which the search keyword appears in the videos are returned to the user.
Compared with the prior art, the present invention automatically builds an index of the subtitle keywords of educational videos from their subtitles. It builds a set of word vectors by training word2vec and matches search keywords against subtitle keywords by computing the cosine similarity between words, which effectively ensures the accuracy of search results. When subtitle keywords are extracted, the time interval in which each keyword appears in the video is also obtained, which helps the user quickly locate the time interval in the video that corresponds to a search keyword. Compared with existing methods, search time is reduced by about 8% and search accuracy is improved by about 6%.
Those skilled in the art may make local adjustments to the above specific implementation in different ways without departing from the principle and purpose of the invention. The scope of protection of the invention is defined by the claims and is not limited by the above specific implementation; every implementation within that scope is bound by the invention.
Claims (5)
1. A video matching system based on text analysis, characterized in that it comprises a caption analysis module, an index module and a search module, wherein: the caption analysis module extracts the text content of a subtitle file and the times at which that text appears in the video, segments the text with the jieba word segmenter, and applies the TF-IDF algorithm to the segmented text to obtain the subtitle keywords and the start and end times at which each keyword appears in the video; the index module uses a hash indexing method to establish or update a video index from the subtitle keywords and their start and end times; and the search module compares the search keyword entered by the user with the subtitle keywords in the video index and returns the list of videos with the greatest similarity.
2. The video matching system based on text analysis according to claim 1, characterized in that the jieba word segmenter comprises three segmentation modes, namely precise mode, full mode and search-engine mode, wherein precise mode performs efficient word-graph scanning based on a prefix dictionary, generates a directed acyclic graph of all the possible ways the Chinese characters in a sentence can form words, and uses dynamic programming to find the maximum-probability path, giving the most likely segmentation based on word frequency.
3. The video matching system based on text analysis according to claim 2, characterized in that the TF-IDF algorithm, i.e. the term frequency-inverse document frequency algorithm, obtains the TF-IDF value of a keyword from the number of times the keyword occurs in a document and its inverse document frequency, i.e. the weight of the keyword in the document; the TF-IDF algorithm is used to obtain the keywords in a subtitle file and their keyword values, and the start and end times of each keyword in the corresponding video are determined according to its order.
4. The video matching system based on text analysis according to claim 1, characterized in that the hash indexing method computes a hash value for each extracted subtitle keyword and stores the hash result together with the corresponding row pointer information in a hash table, thereby establishing the video subtitle keyword index.
5. A matching method for the system according to any one of the preceding claims, characterized in that it comprises the following steps:
Step 1) the caption analysis module reads the subtitle file of a video, analyzes the text content of the subtitles of the current video, and extracts the subtitle keywords;
Step 2) the set of subtitle keywords obtained is sent to the index module, which establishes a video index or updates the existing video index by training on the subtitle keywords, and stores the new index in a database;
Step 3) when the user enters a search keyword into the system, the index file is obtained, the similarity between the user's search keyword and the subtitle keywords in the video index is computed to find the set of keywords with the highest similarity, and the corresponding list of videos and the times at which the search keyword appears in the videos are returned to the user.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611266235.0A CN108268539A (en) | 2016-12-31 | 2016-12-31 | Video matching system based on text analyzing |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611266235.0A CN108268539A (en) | 2016-12-31 | 2016-12-31 | Video matching system based on text analyzing |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108268539A true CN108268539A (en) | 2018-07-10 |
Family ID: 62770149
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611266235.0A (Pending; published as CN108268539A) | 2016-12-31 | 2016-12-31 | Video matching system based on text analyzing |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108268539A (en) |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101137030A (en) * | 2006-09-01 | 2008-03-05 | 索尼株式会社 | Apparatus, method and program for searching for content using keywords from subtitles |
CN101021857A (en) * | 2006-10-20 | 2007-08-22 | 鲍东山 | Video searching system based on content analysis |
CN101630524A (en) * | 2008-07-18 | 2010-01-20 | 广明光电股份有限公司 | Method for searching multimedia contents |
US20140140681A1 (en) * | 2012-11-21 | 2014-05-22 | Hon Hai Precision Industry Co., Ltd. | Video content search method, system, and device |
CN103686200A (en) * | 2013-12-27 | 2014-03-26 | 乐视致新电子科技(天津)有限公司 | Intelligent television video resource searching method and system |
Cited By (24)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020155750A1 (en) * | 2019-01-28 | 2020-08-06 | 平安科技(深圳)有限公司 | Artificial intelligence-based corpus collecting method, apparatus, device, and storage medium |
WO2020251233A1 (en) * | 2019-06-10 | 2020-12-17 | (주)사맛디 | Method, apparatus, and program for obtaining abstract characteristics of image data |
KR102119246B1 (en) * | 2019-06-10 | 2020-06-04 | (주)사맛디 | System, method and program for searching image data by using deep-learning algorithm |
WO2020251236A1 (en) * | 2019-06-10 | 2020-12-17 | (주)사맛디 | Image data retrieval method, device, and program using deep learning algorithm |
CN110502661A (en) * | 2019-07-08 | 2019-11-26 | 天脉聚源(杭州)传媒科技有限公司 | A kind of video searching method, system and storage medium |
CN111008300A (en) * | 2019-11-20 | 2020-04-14 | 四川互慧软件有限公司 | Keyword-based timestamp positioning search method in audio and video |
CN112910674B (en) * | 2019-12-04 | 2023-04-18 | ***通信集团设计院有限公司 | Physical site screening method and device, electronic equipment and storage medium |
CN112910674A (en) * | 2019-12-04 | 2021-06-04 | ***通信集团设计院有限公司 | Physical site screening method and device, electronic equipment and storage medium |
CN113051966A (en) * | 2019-12-26 | 2021-06-29 | ***通信集团重庆有限公司 | Video keyword processing method and device |
CN111324768A (en) * | 2020-02-12 | 2020-06-23 | 新华智云科技有限公司 | Video searching system and method |
CN111324768B (en) * | 2020-02-12 | 2023-07-28 | 新华智云科技有限公司 | Video searching system and method |
CN111523302A (en) * | 2020-07-06 | 2020-08-11 | 成都晓多科技有限公司 | Syntax analysis method and device, storage medium and electronic equipment |
CN111949827A (en) * | 2020-07-29 | 2020-11-17 | 深圳神目信息技术有限公司 | Video plagiarism detection method, device, equipment and medium |
CN111949827B (en) * | 2020-07-29 | 2023-10-24 | 深圳神目信息技术有限公司 | Video plagiarism detection method, device, equipment and medium |
CN112817916A (en) * | 2021-02-07 | 2021-05-18 | 中国科学院新疆理化技术研究所 | Data acquisition method and system based on IPFS |
CN112836008B (en) * | 2021-02-07 | 2023-03-21 | 中国科学院新疆理化技术研究所 | Index establishing method based on decentralized storage data |
CN112817916B (en) * | 2021-02-07 | 2023-03-31 | 中国科学院新疆理化技术研究所 | Data acquisition method and system based on IPFS |
CN112836008A (en) * | 2021-02-07 | 2021-05-25 | 中国科学院新疆理化技术研究所 | Index establishing method based on decentralized storage data |
CN113743352A (en) * | 2021-09-15 | 2021-12-03 | 央视国际网络无锡有限公司 | Method and device for comparing similarity of video contents |
CN114143613A (en) * | 2021-12-03 | 2022-03-04 | 北京影谱科技股份有限公司 | Video subtitle time alignment method, system and storage medium |
CN114143613B (en) * | 2021-12-03 | 2023-07-21 | 北京影谱科技股份有限公司 | Video subtitle time alignment method, system and storage medium |
CN114780789A (en) * | 2022-06-22 | 2022-07-22 | 山东建筑大学 | Assembly type component construction monitoring video positioning method based on natural language query |
CN117033673A (en) * | 2023-05-16 | 2023-11-10 | 广州比地数据科技有限公司 | Multimedia content extraction system based on artificial intelligence |
CN117033673B (en) * | 2023-05-16 | 2024-04-05 | 广州比地数据科技有限公司 | Multimedia content extraction system based on artificial intelligence |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20180710 |