CN106131582A - Wrong-source investigation method based on video text information - Google Patents

Wrong-source investigation method based on video text information

Info

Publication number
CN106131582A
Authority
CN
China
Prior art keywords
video
source
text message
program
similarity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610554564.9A
Other languages
Chinese (zh)
Other versions
CN106131582B (en)
Inventor
刘强
王长福
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Casicloud Co ltd
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CN201610554564.9A
Publication of CN106131582A
Application granted
Publication of CN106131582B
Legal status: Active
Anticipated expiration

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20: Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21: Server components or server architectures
    • H04N21/218: Source of audio or video content, e.g. local disk arrays
    • H04N21/23: Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/232: Content retrieval operation locally within server, e.g. reading video streams from disk arrays
    • H04N21/25: Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/262: Content or additional data distribution scheduling, e.g. sending additional data at off-peak times, updating software modules, calculating the carousel transmission frequency, delaying a video stream transmission, generating play-lists
    • H04N21/26291: Content or additional data distribution scheduling, e.g. sending additional data at off-peak times, updating software modules, calculating the carousel transmission frequency, delaying a video stream transmission, generating play-lists for providing content or additional data updates, e.g. updating software modules, stored at the client
    • H04N21/266: Channel or content management, e.g. generation and management of keys and entitlement messages in a conditional access system, merging a VOD unicast channel into a multicast channel
    • H04N21/2665: Gathering content from different sources, e.g. Internet and satellite
    • H04N21/40: Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/45: Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
    • H04N21/454: Content or additional data filtering, e.g. blocking advertisements
    • H04N21/80: Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85: Assembly of content; Generation of multimedia applications
    • H04N21/858: Linking data to content, e.g. by linking an URL to a video object, by creating a hotspot
    • H04N21/8586: Linking data to content, e.g. by linking an URL to a video object, by creating a hotspot by using a URL

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Astronomy & Astrophysics (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to a wrong-source investigation method based on video text information. Each video website is searched by program name to determine whether it has a video source corresponding to that program name. Video playback links are crawled periodically from the video websites corresponding to the video sources. The crawl results are stored to form a crawl history. The text information in the crawl history is analyzed to find wrong sources, and the crawl results corresponding to the wrong sources are deleted from the crawl history. Based on the resulting crawl history, which is free of wrong sources, the content is aggregated on the television side via a protocol and presented and played as programs. When crawling video playback links, the invention eliminates crawl errors based on text information, solving the "same name, different source" problem that arises in TV internet video aggregation applications.

Description

A wrong-source investigation method based on video text information
Technical field
The present invention relates to the field of error correction for TV internet video aggregation, and in particular to a wrong-source investigation method based on video text information.
Background technology
Traditional TV programs are carefully edited by television stations and then presented and played as television channels.
With the gradual advance of triple-network convergence, TV internet video aggregation application (APP) technology has entered people's lives and developed rapidly. This technology presents multimedia content from the Internet (especially video content) on the television side as programs and plays it.
Each program presented on the television side generally corresponds to more than one video source. The technology therefore first crawls video playback links from the video websites corresponding to the video sources, then aggregates them on the television side via a protocol (a streaming media protocol such as HLS (HTTP Live Streaming, Apple's dynamic adaptive bitrate technology)), where they are presented and played as programs.
The current mainstream method of crawling video playback links is to match video sources by program name and crawl their playback links. This method suffers from the "same name, different source" problem. For example:
"Hero Shooting Vulture" is a widely known TV series with several versions over the years: chronologically, the 1983, 1994, 2003, and 2008 versions, among others. When video sources are matched by program name and video playback links are crawled, the program name is "Hero Shooting Vulture" in every case, so the details of each crawled link must still be compared before its actual content can be known.
In other words, the program name alone cannot correctly distinguish differences in program content. Such problems are frequent and unavoidable in the video aggregation field, and correcting the errors usually incurs a high labor cost.
Summary of the invention
In view of the defects in the prior art, the object of the present invention is to provide a wrong-source investigation method based on video text information that, when crawling video playback links, eliminates crawl errors based on text information and solves the "same name, different source" problem arising in TV internet video aggregation applications.
To achieve the above object, the present invention adopts the following technical solution:
A wrong-source investigation method based on video text information, characterized by comprising the following steps:
Step 1: search each video website by program name to determine whether each video website has a video source corresponding to that program name;
Step 2: crawl video playback links from the video websites corresponding to the video sources, using periodic crawling, and crawl at least the following:
the video playback link corresponding to the video source,
the text information identifying the video content of the video source;
Step 3: store the crawl results of step 2 to form a crawl history;
Step 4: analyze the text information in the crawl history, find the wrong sources, and delete the corresponding crawl results from the crawl history;
Step 5: based on the wrong-source-free crawl history finally formed in step 4, aggregate the content on the television side via a protocol, and present and play it as programs.
On the basis of the above technical solution, in step 1, the program name includes but is not limited to a TV series title or a movie title.
On the basis of the above technical solution, step 1 is implemented by a script program.
On the basis of the above technical solution, in step 2, periodic crawling means:
the update cycle and the priority of each program are preset,
the higher the priority of a program, the shorter its update cycle,
the lower the priority of a program, the longer its update cycle,
the update cycle of a program is selected from the range of 1 hour to 1 week;
crawling is carried out periodically according to the update cycle of the program.
On the basis of the above technical solution, in step 2, the text information identifying the video content of the video source includes at least: director, lead actors, year, program type, region, episode count, per-episode duration, alias, and synopsis.
On the basis of the above technical solution, in step 3, the crawl history is formed and stored in the form of a metadata list;
the metadata list includes at least one metadata record;
each metadata record stores at least the following: the program name and the text information identifying the video content of the video source.
On the basis of the above technical solution, in step 3, the crawl history is formed by the following specific steps:
determine whether the program name of the newly crawled video source already exists in the crawl history;
if it does not exist, create a new metadata record to store the newly crawled video source;
if it exists, create a new metadata record to store the newly crawled video source, and perform step 4 after the crawl operation is complete.
On the basis of the above technical solution, the specific steps of step 4 are as follows:
perform similarity matching on the metadata records in the crawl history that have the same program name:
for two metadata records with the same program name, match the text information identifying the video content of the video source element by element;
combine the results of the element-by-element similarity matching;
if the similarity meets or exceeds the criterion, the newly crawled video source and the video source stored in the earlier metadata record are considered to be the same program;
the text information elements in the earlier metadata record are completed from the text information elements of the newly crawled video source;
if the similarity does not reach the criterion, the newly crawled video source is considered a wrong source and is excluded, and the newly crawled video source is treated as a new program.
On the basis of the above technical solution, the similarity criterion is as follows:
the text information elements are each treated as a parameter of a vector;
similarity judgment compares each parameter of the vectors separately to obtain a similarity value, and the similarity values are then added to obtain the similarity of the text information identifying the video content of the video source;
the similarity values and the similarity are all normalized, and the result is expressed as a percentage similarity of the text information identifying the video content of the video source.
When crawling video playback links, the wrong-source investigation method based on video text information of the present invention eliminates crawl errors based on text information, solving the "same name, different source" problem that arises in TV internet video aggregation applications.
Brief description of the drawings
The present invention has the following drawings:
Fig. 1 is a flow chart of wrong-source investigation;
Fig. 2 is a flow chart of the formation of the history;
Fig. 3 is a schematic diagram of the structure of the history.
Detailed description of the invention
The present invention is described in further detail below with reference to the drawings.
As shown in Figs. 1 to 3, the wrong-source investigation method based on video text information of the present invention comprises the following steps:
Step 1: search each video website by program name (also called the title) to determine whether each video website has a video source corresponding to that program name;
Step 2: crawl video playback links from the video websites corresponding to the video sources, using periodic crawling, and crawl at least the following:
the video playback link corresponding to the video source,
the text information identifying the video content of the video source;
Step 3: store the crawl results of step 2 to form a crawl history (referred to below as the history);
Step 4: analyze the text information in the crawl history, find the wrong sources, and delete the corresponding crawl results from the crawl history;
Step 5: based on the wrong-source-free crawl history finally formed in step 4, aggregate the content on the television side via a protocol, and present and play it as programs.
On the basis of the above technical solution, in step 1, the program name includes but is not limited to a TV series title or a movie title.
Further, the program name may be one or more keywords from the TV series title or movie title.
Further, the program name may be in Simplified Chinese, Traditional Chinese, Korean, Japanese, or English.
On the basis of the above technical solution, step 1 is implemented by a script program, wherein:
the script program contains a default video website list, and the video websites are searched one by one according to the default list;
the default video website list is stored in the script program;
and/or: the script program contains a custom video website list, and the video websites are searched one by one according to the custom list;
the custom video website list is stored locally on the device where the script program resides;
and/or: the script program contains a cloud video website list, and the video websites are searched one by one according to the cloud list;
the cloud video website list is stored on one or more cloud servers.
On the basis of the above technical solution, in step 2, periodic crawling means:
the update cycle and the priority of each program are preset,
the higher the priority of a program, the shorter its update cycle,
the lower the priority of a program, the longer its update cycle,
the update cycle of a program is selected from the range of 1 hour to 1 week;
crawling is carried out periodically according to the update cycle of the program.
Wherein:
the priority of programs is ranked by how frequently the program name has been searched recently;
"recently" includes but is not limited to: the current day, the last three days, the last week, or the last month;
and/or: the priority of programs is ranked by how long ago the program was released;
and/or: the priority of programs is ranked according to the user's viewing history;
the content recorded in the user's viewing history includes but is not limited to: the duration and the date and time of the user's viewing, the program types watched, the production companies watched, the directors watched, or the lead actors watched;
preferably, the viewing history includes at least the duration and the date and time of viewing; the date and time of viewing are used to determine whether the viewing took place on a weekday or at a weekend; then, depending on whether the current day is a weekday or a weekend, and taking the user's viewing duration into account, programs are ranked by their duration.
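The priority-to-cycle rule above can be sketched in a few lines. This is only an illustration under stated assumptions: the priority is taken to be a number normalized to [0, 1], and it is mapped linearly onto the 1-hour-to-1-week range. The function names `update_period_hours` and `is_due` and the linear mapping are hypothetical; the text fixes only the range and the direction (higher priority, shorter cycle).

```python
MIN_PERIOD_HOURS = 1          # shortest update cycle named in the text: 1 hour
MAX_PERIOD_HOURS = 7 * 24     # longest update cycle named in the text: 1 week


def update_period_hours(priority: float) -> float:
    """Map a priority in [0, 1] to an update cycle in hours.

    Higher priority gives a shorter cycle. The linear interpolation is an
    illustrative assumption, not part of the described method.
    """
    if not 0.0 <= priority <= 1.0:
        raise ValueError("priority must lie in [0, 1]")
    return MAX_PERIOD_HOURS - priority * (MAX_PERIOD_HOURS - MIN_PERIOD_HOURS)


def is_due(hours_since_last_crawl: float, priority: float) -> bool:
    """Return True when the program's update cycle has elapsed."""
    return hours_since_last_crawl >= update_period_hours(priority)
```

A scheduler could call `is_due` for each program on every tick and crawl only the programs for which it returns True.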
On the basis of the above technical solution, in step 2, the text information identifying the video content of the video source includes at least: director, lead actors, year, program type, region, episode count, per-episode duration, alias, and synopsis.
Further, "director, lead actors, year, program type, region, episode count, per-episode duration, alias, and synopsis" are the text information elements (the elements of the text information identifying the video content of the video source). If one or more text information elements are missing, each missing element is left blank, or filled with the word "none", or filled with the word "missing", or marked in a similar way to show the difference.
On the basis of the above technical solution, in step 3, the crawl history is formed and stored in the form of a metadata list;
the metadata list includes at least one metadata record; that is, a number of metadata records constitute the crawl history of the present invention;
each metadata record stores at least the following: the program name (title) and the text information identifying the video content of the video source.
Definition of metadata: information about the organization of data, data fields, and their relationships; in short, metadata is data about data.
In another embodiment, each metadata record stores the following: the program name (title), the video playback link, and the text information identifying the video content of the video source. It should be noted that how the video playback link is processed is not the key content that the present invention seeks to protect, so content relating to the video playback link is not described further.
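The record structure described above might be modelled as follows. This is a minimal sketch, not the patent's storage format: the class name `MetadataRecord`, the English field names, and the use of `None` or an empty list for a missing element are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MetadataRecord:
    """One entry of the crawl history (field names are illustrative)."""
    title: str                                    # program name
    director: Optional[str] = None
    lead_actors: List[str] = field(default_factory=list)
    year: Optional[int] = None
    program_types: List[str] = field(default_factory=list)
    region: Optional[str] = None
    episode_count: Optional[int] = None
    episode_duration: Optional[int] = None        # per-episode duration, unit not given in the text
    aliases: List[str] = field(default_factory=list)
    synopsis: Optional[str] = None
    playback_url: Optional[str] = None            # stored only in the second embodiment


# The crawl history is a list of such records (the "metadata list").
history: List[MetadataRecord] = [
    MetadataRecord(title="Hero Shooting Vulture", director="Li Guoli", year=2008),
]
```

Missing elements stay at their defaults here, which corresponds to leaving them blank.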
On the basis of the above technical solution, in step 3, the crawl history is formed by the following specific steps:
determine whether the program name of the newly crawled video source already exists in the crawl history;
if it does not exist, create a new metadata record to store the newly crawled video source;
if it exists, create a new metadata record to store the newly crawled video source, and perform step 4 after the crawl operation is complete.
On the basis of the above technical solution, the specific steps of step 4 are as follows:
perform similarity matching on the metadata records in the crawl history that have the same program name:
for two metadata records with the same program name, match the text information identifying the video content of the video source element by element;
combine the results of the element-by-element similarity matching;
if the similarity meets or exceeds the criterion, the newly crawled video source and the video source stored in the earlier metadata record are considered to be the same program;
the text information elements in the earlier metadata record are completed from the text information elements of the newly crawled video source;
if the similarity does not reach the criterion, the newly crawled video source is considered a wrong source and is excluded, and the newly crawled video source is treated as a new program.
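The decision flow of steps 3 and 4 can be sketched as below. This is an illustration under stated assumptions: records are plain dictionaries with a `"title"` key, the similarity function is supplied by the caller, and the names `investigate` and `complete_missing`, the return labels, and the default threshold of 0.5 are hypothetical.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]


def complete_missing(target: Record, source: Record) -> None:
    """Fill elements that are missing (None) in target from source."""
    for key, value in source.items():
        if target.get(key) is None and value is not None:
            target[key] = value


def investigate(history: List[Record], new: Record,
                similarity: Callable[[Record, Record], float],
                threshold: float = 0.5) -> str:
    """Compare a newly crawled record with same-name records in the history."""
    same_name = [old for old in history if old["title"] == new["title"]]
    for old in same_name:
        if similarity(old, new) >= threshold:
            complete_missing(old, new)   # same program: complete the earlier record
            return "same program"
    history.append(new)                  # no match: store it as a separate program
    return "wrong source" if same_name else "new title"
```

Keeping the similarity function as a parameter leaves the per-element comparison and weighting open, which the text treats as a separate criterion.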
On the basis of the above technical solution, the similarity criterion is as follows:
the text information elements (director, lead actors, year, program type, region, episode count, per-episode duration, alias, synopsis) are each treated as a parameter of a vector;
similarity judgment compares each parameter of the vectors separately to obtain a similarity value, and the similarity values are then added to obtain the similarity of the text information identifying the video content of the video source;
the similarity values and the similarity are all normalized, and the result is expressed as a percentage similarity of the text information identifying the video content of the video source.
On the basis of the above technical solution, the parameters are compared mainly in the following three modes:
Mode 1, also called discrete Boolean comparison: if the compared parameter can only be either identical or different, the similarity value takes only two possible values;
Example: if the two compared programs have the same director, the similarity value of the "director" parameter is 1, otherwise 0. This is only an illustration: depending on how the algorithm performs in practice, a differing result is generally not assigned 0, and an identical result is not necessarily assigned 1, but the comparison result always takes only two values;
Mode 2, also called continuous comparison: when the compared parameters differ, normalization and mapping are applied, and the similarity value is some value in [0, 1];
commonly used normalization methods include the cosine method, the sigmoid function, and the exponential method;
Example: if the two compared programs have the same year, the output is 1; if not, a similarity value is given according to the cosine method. For instance, if the year in the metadata record in the history is 2016, then the closer the year in the newly stored information is to 2016, the larger the similarity value: for example, 0.9 for 2015 and 0.2 for 2000;
Mode 3, also called simhash comparison: for rich text information, the well-known simhash method is used: first compute the hash values of the two pieces of rich text, then compute the Hamming distance between the hash values, and finally normalize the Hamming distance by the number of bits of the hash value to obtain the similarity value;
Example: compute the hash values of the synopses of two programs A and B, and suppose they are expressed in 6 bits as hashA=110001 and hashB=101011. The Hamming distance between the two hash values is then: hamingD(hashA, hashB) = count_1(A xor B) = count_1(011010) = 3. The range of the Hamming distance depends on the number of bits of the hash value, so the distance can be normalized. The process can be expressed simply as: when the Hamming distance is 6, the similarity value is 1; when it is 0, the similarity value is 0; for other values, a uniform distribution over the interval [0, maxbit(hash)] is used for quantization. In this example, the similarity value for a distance of 3 is:
1*bit[3, maxbit(hash)]/count[0, maxbit(hash)] = 1*bit[3,6]/count[0,6] = 1*4/7 = 0.57
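The three comparison modes can be sketched as follows. This is a minimal illustration under several assumptions: all function names are hypothetical; the continuous mode uses exponential decay with an arbitrary `scale`, chosen only as one of the normalization families named in the text; and the hash comparison takes already-computed bit strings (it does not compute simhash itself) and uses a linear mapping in which identical hashes score 1.0, which is an assumed convention rather than the quantization of the worked example.

```python
import math


def boolean_similarity(a, b, same: float = 1.0, different: float = 0.0) -> float:
    """Mode 1: only two possible outputs, both chosen by the caller."""
    return same if a == b else different


def year_similarity(a: int, b: int, scale: float = 10.0) -> float:
    """Mode 2: map the gap between two years onto (0, 1].

    Identical years give 1.0 and the value shrinks as the gap grows.
    The exponential decay and `scale` are illustrative assumptions.
    """
    return math.exp(-abs(a - b) / scale)


def hamming_distance(hash_a: str, hash_b: str) -> int:
    """Count the bit positions at which two equal-length bit strings differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same number of bits")
    return bin(int(hash_a, 2) ^ int(hash_b, 2)).count("1")


def hash_similarity(hash_a: str, hash_b: str) -> float:
    """Mode 3: normalize the Hamming distance by the hash width (assumed linear)."""
    return 1.0 - hamming_distance(hash_a, hash_b) / len(hash_a)
```

Passing `different=0.1` to `boolean_similarity` reproduces the case in which a mismatch is scored above zero.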
After the similarity value of each parameter of the two vectors has been obtained by the three modes above, each value is multiplied by its weight factor and the products are added to give the final similarity. The weight factors come from experience accumulated through long-term work in the video business;
Example: the weight factor of the director is 0.2 and that of the lead actors is 0.3. This can be understood as follows: for two programs with the same name, identical lead actors indicate a higher probability that the programs are the same than an identical director does, because the set of directors has fewer elements than the set of lead actors. This inference is a probability deduced backward from results; real situations are far more complex.
The formation of the history (which includes the similarity comparison) is described in detail below by way of example. This example corresponds to step 3 and step 4 of the detailed description.
Suppose the history already contains the following record:
Metadata_ORG{
Program name: Hero Shooting Vulture,
Director: Li Guoli,
Lead actors: (actor1: Lin Yichen, actor2: Hu Ge ...),
Year: 2008,
Program type: (tag1: swordsman),
Region: Mainland China,
Episode count: null,
Per-episode duration: null,
Alias: (name1: 08 version Shooting Vulture),
Synopsis: Southern Song Dynasty period, monarch ...
};
Suppose two new metadata records are crawled for storage:
Metadata1{
Program name: Hero Shooting Vulture,
Director: Li Tiansheng,
Lead actors: (actor1: Zhang Zhilin, actor2: Zhu Yin ...),
Year: 1994,
Program type: (tag1: swordsman),
Region: Hong Kong,
Episode count: 35,
Per-episode duration: null,
Alias: (name1: penetrate hero),
Synopsis: story occurs ...
};
Metadata2{
Program name: Hero Shooting Vulture,
Director: Li Guoli,
Lead actors: (actor1: Hu Ge, actor2: Lin Yichen ...),
Year: 2007,
Program type: (tag1: swordsman, tag2: love, tag3: ancient costume),
Region: Mainland China,
Episode count: 50,
Per-episode duration: 43,
Alias: (name1: new Hero Shooting Vulture, name2: 08 version Shooting Vulture),
Synopsis: Southern Song Dynasty period, monarch ...
};
The calculation proceeds as follows:
Step 1: a lookup in the history finds that metadata1 has the same program name as metadata_ORG, so the similarity of the two records is calculated.
Step 2: the two records are treated as two vectors, each comprising several parameters. First, the similarity value between the value "Li Tiansheng" of the "director" parameter in metadata1 and the value "Li Guoli" of the "director" parameter in metadata_ORG is calculated. Suppose this parameter is calculated using mode 1 (discrete Boolean comparison); because the directors differ, the result is 0.1. In the same way, one of the three modes is selected according to the parameter type to calculate the similarity value of each parameter. Suppose the following similarity values are finally obtained:
simVector1 = (director: 0.1, lead actors: 0.1, year: 0.2, program type: 0.7, region: 0.2, episode count: null, per-episode duration: null, alias: 0.1, synopsis: 0.8)
Step 3: each item in simVector1 is multiplied by its weight factor and the products are added. The weight factors can also be regarded as a vector. Suppose the weight factor vector is:
weightVector = (director: 0.2, lead actors: 0.3, year: 0.05, program type: 0.1, region: 0.1, episode count: 0.05, per-episode duration: 0.05, alias: 0.05, synopsis: 0.1)
The final similarity, with null items counted as 0, is then:
simValue1 = simVector1*weightVector = 0.1*0.2 + 0.1*0.3 + 0.2*0.05 + 0.7*0.1 + 0.2*0.1 + 0*0.05 + 0*0.05 + 0.1*0.05 + 0.8*0.1 = 0.235
Step 4: judge whether the similarity reaches the criterion. If the criterion is that a similarity below 0.5 indicates a wrong source, metadata1 is judged to be a wrong source; it is excluded from the record of metadata_ORG, and a new record "metadata1" is created.
Step 5: a lookup in the history finds that metadata2 has the same program name as metadata_ORG, so the similarity of the two records is calculated.
Step 6: repeat steps 2 and 3 above; suppose the calculated similarity is simValue2 = 0.65.
Step 7: judge whether the similarity reaches the criterion. If the criterion is that a similarity below 0.5 indicates a wrong source, metadata2 is judged to be the same source; the content of metadata_ORG is then completed from the content of metadata2 according to the update rules.
Summary: metadata2 and metadata_ORG describe the same program, but not all parameters are identical to those in the history (for example the year and the order of the actors), and some parameters are incomplete. The similarity calculation shows that the two records are highly similar, and metadata_ORG is completed with the information in metadata2. When the similarity does not reach the criterion (metadata1), the video source identified by that text information is considered a wrong source and is excluded.
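The weighted sum and the threshold judgment of this worked example can be reproduced with a short sketch. The numbers are the ones given in the example; the English field keys, the function names, and the treatment of a null similarity value as 0 follow the example's arithmetic but are otherwise illustrative assumptions.

```python
FIELDS = ["director", "lead_actors", "year", "program_type", "region",
          "episode_count", "episode_duration", "alias", "synopsis"]

# Weight factors and per-element similarity values taken from the worked example.
WEIGHTS = dict(zip(FIELDS, [0.2, 0.3, 0.05, 0.1, 0.1, 0.05, 0.05, 0.05, 0.1]))
SIM_VECTOR1 = dict(zip(FIELDS, [0.1, 0.1, 0.2, 0.7, 0.2, None, None, 0.1, 0.8]))

THRESHOLD = 0.5  # criterion used in the example


def weighted_similarity(sim_vector, weights):
    """Sum of similarity value times weight; a missing value (None) counts as 0."""
    return sum((sim_vector[name] or 0.0) * weights[name] for name in weights)


def is_wrong_source(similarity, threshold=THRESHOLD):
    """A similarity below the threshold marks the new record as a wrong source."""
    return similarity < threshold


sim_value1 = weighted_similarity(SIM_VECTOR1, WEIGHTS)   # metadata1 vs metadata_ORG
sim_value2 = 0.65                                        # value assumed in the example

metadata1_is_wrong = is_wrong_source(sim_value1)
metadata2_is_wrong = is_wrong_source(sim_value2)
```

Because the weight factors sum to 1, the final similarity stays in [0, 1] whenever every per-element value does.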
Content not described in detail in this specification belongs to the prior art known to those skilled in the art.

Claims (9)

1. A wrong-source investigation method based on video text information, characterized in that it comprises the following steps:
Step 1: search each video website by program name to determine whether the website has a video source corresponding to that program name;
Step 2: capture video playback links: using periodic capture, capture at least the following from the video website corresponding to the video source:
the video playback link corresponding to the video source,
the text information used to label the video content of the video source;
Step 3: store the capture results of step 2 to form a capture history record;
Step 4: analyze the text information in the capture history record, find wrong sources, and delete the capture results corresponding to the wrong sources from the capture history record;
Step 5: aggregate the wrong-source-free capture history record finally formed in step 4 onto the television by protocol, and present and play it in the form of programs.
2. The wrong-source investigation method based on video text information according to claim 1, characterized in that: in step 1, the program name includes, but is not limited to, a TV series title or a movie title.
3. The wrong-source investigation method based on video text information according to claim 1, characterized in that: step 1 is implemented by a script program.
4. The wrong-source investigation method based on video text information according to claim 1, characterized in that: in step 2, the periodic capture means:
the update cycle and the priority of each program are preset;
a program with a high priority has a correspondingly short update cycle;
a program with a low priority has a correspondingly long update cycle;
the update cycle of a program is selected from the range of 1 hour to 1 week;
the capture operation is carried out periodically according to the update cycle of the program.
5. The wrong-source investigation method based on video text information according to claim 1, characterized in that: in step 2, the text information used to label the video content of the video source includes at least: director, leading actors, year, program type, region, number of episodes, duration of a single episode, alias, and synopsis.
6. The wrong-source investigation method based on video text information according to claim 1, characterized in that: in step 3, the capture history record is stored in the form of a metadata list;
the metadata list includes at least one metadata record;
each metadata record stores at least the following: the program name, and the text information used to label the video content of the video source.
7. The wrong-source investigation method based on video text information according to claim 1, characterized in that: in step 3, forming the capture history record comprises the following steps:
determine whether the program name of the newly captured video source already exists in the capture history record;
if it does not exist, create a new metadata record to store the newly captured video source;
if it exists, create a new metadata record to store the newly captured video source and, after the capture operation is completed, perform step 4.
8. The wrong-source investigation method based on video text information according to claim 1, characterized in that: in step 4, the specific steps are as follows:
perform similarity matching on the metadata records in the capture history record that have the same program name:
match, item by item, the text information used to label the video content of the video source in the two metadata records that have the same program name;
combine the results of the individual similarity matches;
if the similarity meets or exceeds the criterion, the newly captured video source is considered to be the same program as the video source stored in the earlier metadata record;
the text information elements in the metadata record are then completed according to the text information elements of the newly captured video source;
if the similarity does not reach the criterion, the newly captured video source is considered a wrong source and should be excluded, and the newly captured video source is treated as a new program.
9. The wrong-source investigation method based on video text information according to claim 8, characterized in that: the similarity criterion is:
each text information element is regarded as a parameter in a vector;
the similarity judgment compares each parameter in the vector separately to obtain a similarity value, and the similarity values are then added to obtain the similarity of the text information used to label the video content of the video source;
the similarity values and the similarity are both normalized, and the resulting similarity of the text information used to label the video content of the video source is expressed as a percentage.
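The element-wise similarity of claims 8 and 9 can be sketched in Python as follows. This is an illustration under stated assumptions, not the claimed implementation: the claims do not say how a single element is compared, so the sketch assumes set overlap (Jaccard) for list-valued elements and exact equality for the rest, weights all elements equally, and skips elements that are empty in either record. The field names and the function names `element_similarity` and `record_similarity` are hypothetical.

```python
FIELDS = ("director", "cast", "year", "genre", "region", "episodes")  # hypothetical keys
EMPTY = (None, "", [])


def element_similarity(a, b) -> float:
    """Per-element value in [0, 1]: Jaccard overlap for lists, equality otherwise (assumed)."""
    if isinstance(a, (list, tuple, set)) and isinstance(b, (list, tuple, set)):
        sa, sb = set(a), set(b)
        union = sa | sb
        return len(sa & sb) / len(union) if union else 0.0
    return 1.0 if a == b else 0.0


def record_similarity(rec_a: dict, rec_b: dict, fields=FIELDS) -> float:
    """Add the per-element values and normalize the sum to [0, 1]."""
    values = [
        element_similarity(rec_a[f], rec_b[f])
        for f in fields
        if rec_a.get(f) not in EMPTY and rec_b.get(f) not in EMPTY
    ]
    return sum(values) / len(values) if values else 0.0


# Hypothetical records: same cast in a different order, different year, one empty genre.
rec_a = {"director": "D", "cast": ["A", "B"], "year": "2015", "genre": "drama"}
rec_b = {"director": "D", "cast": ["B", "A"], "year": "2016", "genre": None}

similarity = record_similarity(rec_a, rec_b)
print(f"{similarity:.0%}")  # similarity expressed as a percentage
```

Comparing the cast as a set makes the result independent of actor order, and skipping empty elements keeps an incomplete record from being penalized. Both are design choices of this sketch and could be replaced by other per-element measures or weights.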
CN201610554564.9A 2016-07-14 2016-07-14 A kind of wrong source investigation method based on video text message Active CN106131582B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610554564.9A CN106131582B (en) 2016-07-14 2016-07-14 A kind of wrong source investigation method based on video text message

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610554564.9A CN106131582B (en) 2016-07-14 2016-07-14 A kind of wrong source investigation method based on video text message

Publications (2)

Publication Number Publication Date
CN106131582A true CN106131582A (en) 2016-11-16
CN106131582B CN106131582B (en) 2019-09-03

Family

ID=57282671

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610554564.9A Active CN106131582B (en) 2016-07-14 2016-07-14 A kind of wrong source investigation method based on video text message

Country Status (1)

Country Link
CN (1) CN106131582B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108989890A (en) * 2018-09-03 2018-12-11 四川长虹电器股份有限公司 Check method in audio-video mistake source based on GStreamer frame
CN110868619A (en) * 2019-11-28 2020-03-06 湖南快乐阳光互动娱乐传媒有限公司 Global video playing record aggregation method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080065693A1 (en) * 2006-09-11 2008-03-13 Bellsouth Intellectual Property Corporation Presenting and linking segments of tagged media files in a media services network
CN103685420A (en) * 2012-09-24 2014-03-26 华为技术有限公司 Method, server and system for media file duplication removal
CN103702168A (en) * 2013-12-12 2014-04-02 乐视网信息技术(北京)股份有限公司 Method of displaying video list and video client
CN105718524A (en) * 2016-01-15 2016-06-29 合一网络技术(北京)有限公司 Method and device for determining video originals

Also Published As

Publication number Publication date
CN106131582B (en) 2019-09-03

Similar Documents

Publication Publication Date Title
US11244017B2 (en) Content recommendation system with weighted metadata annotations
US8635211B2 (en) Trend analysis in content identification based on fingerprinting
US10860860B1 (en) Matching videos to titles using artificial intelligence
US9756368B2 (en) Methods and apparatus to identify media using hash keys
US9213981B2 (en) Techniques for improving relevance of social updates distributed offline
JP5241832B2 (en) Incremental structure of the search tree including signature pointers for multimedia content identification
US11966404B2 (en) Media names matching and normalization
US8005841B1 (en) Methods, systems, and products for classifying content segments
US20160037232A1 (en) Methods and Systems for Detecting One or More Advertisement Breaks in a Media Content Stream
JP6316409B2 (en) Generate a feed of content items associated with a topic from multiple content sources
US10469918B1 (en) Expanded previously on segments
US10880025B1 (en) Identification of concurrently broadcast time-based media
CN106484774B (en) Correlation method and system for multi-source video metadata
CN104869439B (en) A kind of video pushing method and device
US20120271823A1 (en) Automated discovery of content and metadata
CN111090813B (en) Content processing method and device and computer readable storage medium
US20210385556A1 (en) System and method for identifying altered content
US9948740B1 (en) Caching for multi-protocol media content delivery
US11936932B2 (en) Video analytics system
CN111225246A (en) Video recommendation method and device and electronic equipment
CN106131582A (en) A kind of wrong source based on video text message investigation method
US20180007448A1 (en) System and method for controlling related video content based on domain specific language models
WO2020252783A1 (en) Asset metadata service
US20110252455A1 (en) Method and System for Comparing Media Assets
US20210142354A1 (en) Systems and methods to enable time-based rewards for streaming media consumption

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20170814

Address after: 100039, Yongding Road, Beijing, No. 3, floor 51, 303, Haidian District

Applicant after: CASICLOUD-TECH CO.,LTD.

Address before: 100098 No. 1, building 17, building 2, Wanshou temple, Haidian District, Beijing, No. 35

Applicant before: Xu Shan

GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20221219

Address after: 100144 1206, Floor 12, Building 7, Yard 49, Badachu Road, Shijingshan District, Beijing

Patentee after: BEIJING CASICLOUD CO.,LTD.

Address before: 100039 303, 3 / F, No.51, Yongding Road, Haidian District, Beijing

Patentee before: CASICLOUD-TECH CO.,LTD.