CN108108482A - Method for enhancing scene realism in text-to-scene conversion - Google Patents
- Publication number
- CN108108482A
- Authority
- CN
- China
- Prior art keywords
- scene
- document
- word
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
- G06F16/358—Browsing; Visualisation therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/237—Lexical tools
- G06F40/242—Dictionaries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295—Named entity recognition
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present invention claims a method for enhancing scene realism in text-to-scene conversion. The method includes: Step 1: obtain multiple Chinese documents describing a given scene from the Internet and build a scene corpus. Step 2: perform word segmentation without deduplication on the documents in the corpus, then remove stop words from the segmented documents. Step 3: using the processed documents, perform statistical analysis of the entity nouns in the scene descriptions. Step 4: using the statistical indicators, analyze the features of the scene category. Step 5: use the entity nouns to give a conceptual representation of the scene and build a scene concept dictionary. The invention aims to establish a "word-to-category" association between entity words and scene words, analyze the general features of the category, and thereby obtain a conceptual representation of a given scene word. This supports the analysis of scene entity elements in text-to-scene conversion, so that the generated scene matches people's common-sense understanding and has a complete background environment, which enhances the realism of the scene.
Description
Technical field
The invention belongs to the field of text visualization, and more particularly relates to a method for enhancing scene realism in text-to-scene conversion.
Background art
Text-to-scene conversion is a relatively cutting-edge research topic. In essence, it converts the symbols of textual information into a visual, simulated representation. Several reasons motivate modeling this conversion process. One meaningful application is modeling a person's psychological state; another is aiding the understanding of a story. A third related field is cognitive modeling, in which a passage of description is interpreted using a large amount of varied knowledge. Text visualization must not only match the description in the text; more importantly, it must match the actual conditions of the scene and its entities.
According to the domestic and international scientific literature, existing research on text-to-scene conversion focuses mainly on text semantics and on analyzing relations, such as spatial relations, among the entities described. The "scene" that the text describes has not been examined in depth. In existing text-to-scene conversion systems, the input is mostly plain text and the generated scene contains only the entities the text mentions; the scene has no clear category features or background environment, lacks realism, and has limited practical value.
The present invention aims to solve the above problems of the prior art. As understood here, a text-to-scene conversion system associates textual knowledge with scene images, building a bridge from knowledge to scene. A scene is composed of a group of associated scene objects, and the entity objects in the generated scene image correspond to the scene entity nouns in the text description. Scenes differ by category, and different scenes are composed of different scene objects: in a desert scene one sees a vast desert, withered branches and leaves, cacti, or camels, but not a boundless ocean. This gives scene classification a practical basis. In image understanding and classification, three approaches to scene classification are currently popular: (1) object-based scene classification; (2) region-based scene classification; (3) context-based scene classification. Object-based and context-based methods show that scenes can be classified by studying their semantic objects, and that scenes of different categories have different sets of semantic objects. On this basis, the set of semantic objects of a scene category is defined here as the conceptual representation of its category word, and a conceptual model relating category words to semantic objects is established. Working from the text side, the relation between scene category words and semantic objects is analyzed; working from common sense, the analysis focuses on which entities a given scene contains in daily life. The scene generated by the text-to-scene conversion system is supplemented accordingly, so that it better matches people's common-sense understanding and its realism is enhanced.
Summary of the invention
The present invention aims to solve the above problems of the prior art and proposes a method for enhancing the realism of a scene. The technical solution of the invention is as follows:
A method for enhancing scene realism in text-to-scene conversion, comprising the following steps:
1) obtaining multiple Chinese documents describing a given scene from the Internet and building a scene corpus; the invention currently targets Chinese text-to-scene conversion systems;
2) performing word segmentation without deduplication on the set of Chinese documents describing the scene, then removing stop words from the segmented Chinese documents;
3) using the stop-word-filtered segmentation result of the Chinese document set from step 2), applying word frequency statistics to the entity nouns in the segmentation result to obtain statistical indicators for the entity nouns;
4) using the statistical indicators of the entity nouns from step 3), building the feature word list of the scene category that corresponds to the document set;
5) using the scene category feature word list from step 4), analyzing and extracting the optimal scene category feature words and building a scene entity dictionary.
Further, the scene corpus of step 1) is built from documents of the same scene category; the scene corpus is a document set with clear scene features.
Further, the scene concept model of step 1) uses a word vector $\vec{w} = (w_1, w_2, \ldots, w_t)$ composed of entity nouns to give a conceptual representation of a scene category, where $w_t$ denotes an entity noun. Each scene category corresponds to a group of related word vectors. The subscript t is defined as the threshold of the concept dictionary and is also the length of the word vector. By obtaining a large number of documents of the same category, the entity nouns that occur frequently in the documents and are associated with category C are counted and assembled into a word vector; m is defined as the number of these entity nouns, and this word vector, with threshold t, determines the scene concept dictionary of scene category C.
Further, in step 2), performing word segmentation without deduplication on the Chinese documents in the scene corpus and then removing stop words from the segmented Chinese documents specifically includes:
for the multiple documents obtained, first denoising the documents by removing words such as advertising phrases and English links, then performing word segmentation with the ROST Chinese word segmentation tool.
Further, in step 3), using the stop-word-filtered segmentation result of the Chinese document set from step 2) and applying word frequency statistics to the entity nouns in the segmentation result to obtain statistical indicators for the entity nouns specifically includes:
the traditional text feature model TF-IDF mainly considers the term frequency (TF) and the inverse document frequency (IDF) of a feature item. The term frequency TF is the number of times a feature item occurs in a document. For the scene concept model, n documents of a given category C are obtained to form document set A; the number of times an entity noun w occurs in the document set of category C is one of the important references for obtaining the scene concept dictionary;
for each document set A, using the stop-word-filtered Chinese documents, the occurrence frequency of each entity noun that appears in the n documents is counted;
the word frequency $f_i$ of word $w_i$ in A is defined as
$$f_i = \frac{\mathrm{count}(w_i, A)}{\mathrm{size}(A)}, \quad 0 < f_i < 1$$
where $\mathrm{count}(w_i, A)$ is the number of times word $w_i$ occurs in document set A, and $\mathrm{size}(A)$ is the total number of occurrences of all entity nouns in A;
the inverse document frequency IDF is then applied. IDF quantifies how a feature item is distributed across the document set. It is computed as follows: let the total number of documents in document set A be N and the number of documents containing word w be n; the inverse document frequency in the scene model is then defined as:
$$\mathrm{sidf}(w_i) = \log\left(\frac{n}{N} + 1\right)$$
Further, in step 4), using the statistical indicators of the entity nouns from step 3) to analyze the scene category features specifically includes: defining the list as a word vector $\vec{w} = (w_1, w_2, \ldots, w_t)$ composed of entity nouns, which gives the conceptual representation of the scene category;
each document is examined on the assumption that its scene is composed of multiple entity nouns. For a document scene $\vec{w} = (w_1, w_2, \ldots, w_N)$, with $p(w_n)$ denoting the probability of scene entity word $w_n$, the probability of generating the scene $\vec{w}$ described by the document is:
$$p(\vec{w}) = \prod_{n=1}^{N} p(w_n)$$
Because the documents selected for the scene corpus mostly describe their scene through landscape description, and that scene is unique, it is assumed that a scene s is first chosen for the document and the entities the described scene needs are then generated according to that scene; the scene described by the document is unique. Assuming the scene categories are $s_1, s_2, \ldots, s_k$, the probability of generating the document scene is:
$$p(\vec{w}) = p(s_1)\prod_{n=1}^{N} p(w_n \mid s_1) + \ldots + p(s_k)\prod_{n=1}^{N} p(w_n \mid s_k) = \sum_{s} p(s)\prod_{n=1}^{N} p(w_n \mid s)$$
After a value of t is selected, the first t items of the scene category feature word list are randomly divided into two scene categories $s_1$ and $s_2$, and probability analysis is then carried out on each document. If, for a document $\vec{w}$, the generation probability turns out to be:
$$p(\vec{w}) = p(s_1)\prod_{n=1}^{U} p(w_n \mid s_1) + p(s_2)\prod_{n=1}^{V} p(w_n \mid s_2)$$
where N = U + V, U is the number of entity nouns contained in scene category $s_1$, and V is the number of entity nouns contained in scene category $s_2$, then the chosen value of t is unsuitable.
Further, for the case where multiple sub-scenes appear within the first t items, the judgment of t still uses a traversing bisection: the split point ranges over [2, t-1], and each split is traversed to judge whether the value of t is suitable.
Advantages and beneficial effects of the invention:
Compared with the prior art, the beneficial effect of the invention is that a scene concept model is established: given a document set describing a certain scene category as input, the entity nouns corresponding to that scene category are obtained, establishing a scene-category association between scene words and entity nouns. This supports the analysis of scene entity elements in text-to-scene conversion, so that the generated scene matches people's common-sense understanding and has a complete background environment, which enhances the realism of the scene.
Description of the drawings
Fig. 1 is a flow chart of the method for establishing a scene concept model in text-to-scene conversion according to a preferred embodiment of the invention;
Fig. 2 is a system function diagram of the scene concept model of the present application.
Detailed description of the embodiments
The technical solutions in the embodiments of the invention are described clearly and in detail below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the invention.
The technical solution by which the invention solves the above technical problem is:
A method for enhancing scene realism in text-to-scene conversion, including:
Step 1: obtain multiple Chinese documents describing a given scene from the Internet and build a scene corpus.
Step 2: perform word segmentation without deduplication on the documents in the corpus, then remove stop words from the segmented documents.
Step 3: using the stop-word-filtered segmentation result of the Chinese document set from step 2, apply word frequency statistics to the entity nouns in the segmentation result to obtain statistical indicators for the entity nouns.
Step 4: using the statistical indicators of the entity nouns from step 3, build the feature word list of the scene category that corresponds to the document set.
Step 5: using the scene category feature word list from step 4, analyze and extract the optimal scene category feature words and build the scene entity dictionary.
In the method for establishing a scene concept model in text-to-scene conversion, step 1 includes:
For a given scene category C, multiple documents describing scene category C are crawled from the Internet, a hundred or more. The example uses 200 compositions describing the "birthday" scene, crawled from Baidu's composition collection.
In the method for establishing a scene concept model in text-to-scene conversion, step 2 includes:
For the multiple documents obtained, the documents are first denoised by removing advertising phrases and English links, and word segmentation is then performed with the ROST Chinese word segmentation tool.
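The denoising and stop-word steps can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the ROST segmentation tool is treated as an external step represented by the `segment` parameter (whitespace splitting stands in for it here), "English links" is read as web links, and the stop-word set, the advertising phrase, and all function names are hypothetical.

```python
import re

# Hypothetical stop-word set; a real system would load a full Chinese list.
STOP_WORDS = {"的", "了", "是", "在", "和"}

# Matches web links, which the text calls "English links" (an interpretation).
URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")


def denoise(text, ad_phrases=()):
    """Remove links and known advertising phrases from raw text."""
    text = URL_PATTERN.sub(" ", text)
    for phrase in ad_phrases:
        text = text.replace(phrase, " ")
    return text


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Drop stop words but keep repeated tokens (no deduplication)."""
    return [tok for tok in tokens if tok not in stop_words]


def preprocess(raw_text, segment=str.split, ad_phrases=()):
    """Denoise, segment, then filter stop words.

    `segment` stands in for an external word segmenter; the default
    simply splits text that is already separated by spaces.
    """
    cleaned = denoise(raw_text, ad_phrases)
    return remove_stop_words(segment(cleaned))


tokens = preprocess(
    "妈妈 的 生日 蛋糕 和 蜡烛 的 蛋糕 http://example.com/x 点击购买",
    ad_phrases=["点击购买"],
)
```

Repeated tokens such as 蛋糕 are kept on purpose, because the later frequency statistics depend on every occurrence.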
In the method for establishing a scene concept model in text-to-scene conversion, step 3 includes:
Step 3 performs the statistical analysis of entity nouns, for which traditional text feature extraction offers a good approach. The TF-IDF model mainly considers the term frequency (TF) and inverse document frequency (IDF) of a feature item. The term frequency (TF) is the number of times a feature item occurs in a document. Documents of different categories differ greatly in the occurrence probability of certain feature items. For the scene concept model, n documents of a given category C are obtained to form document set A; the number of times an entity noun w occurs in the document set of category C is one of the important references for obtaining the scene concept dictionary.
For each document set A, using the result of step 2, the occurrence frequency of each entity noun that appears in the n documents is counted.
The word frequency $f_i$ of word $w_i$ in A is defined as
$$f_i = \frac{\mathrm{count}(w_i, A)}{\mathrm{size}(A)}, \quad 0 < f_i < 1$$
where $\mathrm{count}(w_i, A)$ is the number of times word $w_i$ occurs in document set A, and $\mathrm{size}(A)$ is the total number of occurrences of all entity nouns in A.
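The word frequency defined above, count(w_i, A) divided by size(A), can be sketched as below. The set of entity nouns is assumed to come from an external part-of-speech or entity tagger that the text does not name, so `entity_nouns` and the function name are hypothetical, and the sample documents are made up for illustration.

```python
from collections import Counter


def entity_noun_frequencies(documents, entity_nouns):
    """Return f_i = count(w_i, A) / size(A) for each entity noun in A.

    documents: list of token lists (stop words already removed).
    entity_nouns: set of tokens treated as entity nouns (assumed given).
    """
    counts = Counter(
        tok for doc in documents for tok in doc if tok in entity_nouns
    )
    size = sum(counts.values())  # total occurrences of all entity nouns
    if size == 0:
        return {}
    return {word: count / size for word, count in counts.items()}


# Made-up sample documents, for illustration only.
docs = [["mother", "cake", "mother", "run"], ["cake", "candle"]]
freqs = entity_noun_frequencies(docs, {"mother", "cake", "candle"})
```

The frequencies sum to 1 over all entity nouns, so each value lies strictly between 0 and 1 whenever at least two distinct entity nouns occur.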
The frequency of an entity noun alone cannot handle every case, because classification may be incomplete. For example, if entity noun w appears in only one document of category document set A but occurs many times there, the word may be related only to that one article and have little to do with the category of document set A; the frequency indicator is then not enough to represent the features of category C. The inverse document frequency (IDF) quantifies how a feature item is distributed across the document set. IDF is commonly computed as:
$$\mathrm{idf}(T_k) = \log\frac{N}{n_k}$$
where N is the total number of documents in the document set and $n_k$ is the number of documents in which feature item $T_k$ appears.
The core idea of the IDF algorithm is that a feature item appearing in most documents is less important than one appearing in only a small fraction of documents. IDF weakens the importance of high-frequency feature items that occur in most documents and strengthens the importance of low-frequency feature items that occur in only a few. For the scene concept model, however, the data set belongs to a single given scene category and the goal is to analyze the features of that category; such a feature should be a feature item that the documents in the data set generally involve when describing the scene. Feature extraction for the scene concept model therefore keeps the feature items that appear in most documents and filters out those that occur many times in only a few documents.
Let the total number of documents in document set A be N and the number of documents containing word w be n; the inverse document frequency in the scene model is then defined as:
$$\mathrm{sidf}(w_i) = \log\left(\frac{n}{N} + 1\right)$$
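As a rough illustration, the scene-oriented inverse document frequency sidf(w) = log(n/N + 1) (the form given in claim 5) can be computed as below. The source does not state the base of the logarithm or the N behind its birthday example; base 10 and N = 100 are assumptions chosen here because they roughly reproduce the SIDF column of the birthday table. The function name is hypothetical, and the document counts are the ones printed in that table.

```python
import math


def sidf(docs_with_word, total_docs, base=10):
    """Scene IDF: log(n / N + 1).

    Unlike classic IDF, the value grows with the share of documents
    that contain the word. The logarithm base is an assumption.
    """
    if total_docs <= 0 or not 0 <= docs_with_word <= total_docs:
        raise ValueError("need N > 0 and 0 <= n <= N")
    return math.log(docs_with_word / total_docs + 1, base)


# Number of documents containing each entity noun (birthday example).
doc_counts = {
    "Mother": 79, "Cake": 72, "Father": 57, "Present": 40,
    "Classmate": 31, "China": 7, "Motherland": 10, "Candle": 27,
}
N = 100  # assumed total number of documents

scores = {word: sidf(n, N) for word, n in doc_counts.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
```

Because the score depends only on how many documents contain a word, "Candle" ranks above "China" and "Motherland" even though those two have higher raw word frequencies in the example.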
For example, conceptual modeling of the "birthday" scene mentioned in the text gives the following:
Entity noun | Word frequency | Documents containing the word | SIDF
---|---|---|---
Mother | 403 | 79 | 0.25
Cake | 158 | 72 | 0.23
Father | 147 | 57 | 0.19
Present | 107 | 40 | 0.14
Classmate | 77 | 31 | 0.117
China | 60 | 7 | 0.029
Motherland | 50 | 10 | 0.04
Candle | 49 | 27 | 0.103
Table 1
As the table shows, the words "China" and "Motherland" occur many times, yet under the SIDF indicator their values are lower than that of "Candle". For the birthday scene this change is favorable.
In the method for establishing a scene concept model in text-to-scene conversion, step 4 includes:
Step 4 uses the statistical indicators to analyze the features of the scene category. The preceding steps yield a ranked list of feature words for the scene category; how many of the leading items of that list should be taken to represent the scene category conceptually is a problem that needs to be handled. The list is defined as a word vector $\vec{w} = (w_1, w_2, \ldots, w_t)$ composed of entity nouns, which gives the conceptual representation of the scene category. For the scene concept model, because incomplete classification occurs, choosing the length t of the word vector faces several problems:
(1) If t is too small, the scene category is not represented accurately enough.
(2) If t is too large, multiple scene categories are mixed.
Consider case (2): when t is too large, multiple scene categories are mixed together. As Table 1 above shows, after sorting by SIDF, when the initial value chosen for t is greater than 7 the scene features mix, that is, the birthday scene (mother, father, cake, present, candle, classmate) mixes with descriptions associated with National Day (China, motherland). The choice of t therefore needs reasonable handling and judgment.
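Truncating the ranked feature word list at a threshold t can be sketched as follows. The scores are the SIDF values printed in the birthday table; the function name and the two values of t tried are illustrative assumptions. The excerpted table has only eight rows, so the cut-off positions here need not coincide with those discussed for the full list.

```python
def concept_dictionary(scores, t):
    """Return the t highest-scoring entity nouns as the concept dictionary."""
    if t < 1:
        raise ValueError("t must be at least 1")
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:t]


# SIDF values as printed in the birthday table.
sidf_scores = {
    "Mother": 0.25, "Cake": 0.23, "Father": 0.19, "Present": 0.14,
    "Classmate": 0.117, "China": 0.029, "Motherland": 0.04, "Candle": 0.103,
}

small = concept_dictionary(sidf_scores, 6)  # birthday-related nouns only
large = concept_dictionary(sidf_scores, 8)  # whole excerpt, scenes mixed
```

With these eight rows, the smaller t keeps only birthday-related nouns, while a t that covers the whole excerpt also pulls in "China" and "Motherland".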
Regarding scenes: the scene in a document consists of an occasion (the social environment) and a landscape (the natural environment), and scene description is the combination of occasion description and landscape description. In the scene corpus, the scene a document describes is mostly given through landscape description, and it is unique.
The scene corpus contains more than one scene s. Documents about birthdays may describe several scenes s; from the existing birthday document set it can be observed that the scene categories of the document set roughly include celebrating a birthday, reflecting on growing up, missing other people, National Day, reflecting on the college entrance examination, and so on. Judging by word frequency and inverse document frequency, documents whose scene is celebrating a birthday are in the majority. This suggests a way to solve the problem. For the scene concept model, the scene documents obtained are primary-school compositions, and the scene in a primary-school composition is mostly unique.
For a scene category C there are multiple corresponding documents d, and the scene described by each document is unique. Scene category C corresponds to many finer sub-scenes, and documents correspond one-to-one with scenes. The sub-scenes are therefore classified: for a given t, if the classification yields one category, the value of t is acceptable; if it yields two or more categories, the value of t is too large.
Each document is examined on the assumption that its scene is composed of multiple entity nouns. For a document scene $\vec{w} = (w_1, w_2, \ldots, w_N)$, with $p(w_n)$ denoting the probability of scene entity word $w_n$, the probability of generating the scene $\vec{w}$ described by the document is:
$$p(\vec{w}) = \prod_{n=1}^{N} p(w_n)$$
Because the documents selected for the scene corpus mostly describe their scene through landscape description, and that scene is unique, it is assumed that a scene s is first chosen for the document and the entities the described scene needs are then generated according to that scene; the scene described by the document is unique. Assuming the scene categories are $s_1, s_2, \ldots, s_k$, the probability of generating the document scene is:
$$p(\vec{w}) = p(s_1)\prod_{n=1}^{N} p(w_n \mid s_1) + \ldots + p(s_k)\prod_{n=1}^{N} p(w_n \mid s_k) = \sum_{s} p(s)\prod_{n=1}^{N} p(w_n \mid s)$$
After a value of t is selected, the first t items of the scene category feature word list are randomly divided into two scene categories $s_1$ and $s_2$, and probability analysis is then carried out on each document. If, for a document $\vec{w}$, the generation probability turns out to be:
$$p(\vec{w}) = p(s_1)\prod_{n=1}^{U} p(w_n \mid s_1) + p(s_2)\prod_{n=1}^{V} p(w_n \mid s_2)$$
where N = U + V, U is the number of entity nouns contained in scene category $s_1$, and V is the number of entity nouns contained in scene category $s_2$,
then the chosen value of t is unsuitable.
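The general scene-mixture probability used above, p(w) = Σ_s p(s) Π_n p(w_n | s) as written out in claim 6, can be evaluated with a few lines of Python. This is only a sketch: the scene priors and conditional word probabilities below are invented for illustration, the function name is hypothetical, and no smoothing is applied, so a word unseen under a scene makes that scene's term zero.

```python
import math


def document_scene_probability(words, scene_priors, word_given_scene):
    """p(w) = sum over scenes s of p(s) * product over n of p(w_n | s)."""
    total = 0.0
    for scene, prior in scene_priors.items():
        likelihood = math.prod(
            word_given_scene[scene].get(word, 0.0) for word in words
        )
        total += prior * likelihood
    return total


# Hypothetical probabilities, for illustration only.
priors = {"birthday": 0.8, "national_day": 0.2}
conditionals = {
    "birthday": {"cake": 0.5, "candle": 0.5},
    "national_day": {"motherland": 1.0},
}

p = document_scene_probability(["cake", "candle"], priors, conditionals)
```

Here only the "birthday" term contributes, since neither word has probability mass under "national_day"; a document whose probability draws on a single scene in this way is the unmixed case the text treats as desirable.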
Table 2
As Table 2 shows, when t is 7, no document is expressed as a linear combination of $s_1$ and $s_2$. For the case where multiple sub-scenes appear within the first t items, the judgment of t still uses a traversing bisection: the split point ranges over [2, t-1], and each split is traversed to judge whether the value of t is suitable.
The above embodiments are to be understood as merely illustrating the invention and not as limiting its scope of protection. After reading the content recorded herein, a person skilled in the art can make various changes or modifications to the invention, and such equivalent changes and modifications likewise fall within the scope of the claims of the invention.
Claims (7)
1. A method for enhancing scene realism in text-to-scene conversion, characterized by comprising the following steps:
1) obtaining multiple Chinese documents describing a given scene from the Internet and building a scene corpus;
2) performing word segmentation without deduplication on the set of Chinese documents describing the scene, then removing stop words from the segmented Chinese documents;
3) using the stop-word-filtered segmentation result of the Chinese document set from step 2), applying word frequency statistics to the entity nouns in the segmentation result to obtain statistical indicators for the entity nouns;
4) using the statistical indicators of the entity nouns from step 3), building the feature word list of the scene category that corresponds to the document set;
5) using the scene category feature word list from step 4), analyzing and extracting the optimal scene category feature words and building a scene entity dictionary.
2. The method for enhancing scene realism in text-to-scene conversion according to claim 1, characterized in that the scene corpus of step 1) is built from documents of the same scene category, and the scene corpus is a document set with clear scene features.
3. The method for enhancing scene realism in text-to-scene conversion according to claim 1 or 2, characterized in that the scene entity model of step 1) uses a word vector $\vec{w} = (w_1, w_2, \ldots, w_t)$ composed of entity nouns to give an entity-concept representation of a scene category, where $w_t$ denotes an entity noun; each scene category corresponds to a group of related word vectors; the subscript t is defined as the threshold of the concept dictionary and is also the length of the word vector; by obtaining a large number of documents of the same category, the entity nouns that occur frequently in the documents and are associated with category C are counted and assembled into a word vector, m being defined as the number of these entity nouns; and this word vector, with threshold t, determines the scene entity dictionary of scene category C.
4. The method for enhancing scene realism in text-to-scene conversion according to claim 1, characterized in that, in step 2), performing word segmentation without deduplication on the Chinese documents in the scene corpus and then removing stop words from the segmented Chinese documents specifically includes:
for the multiple documents obtained, first denoising the documents by removing words such as advertising phrases and English links, then performing word segmentation with the ROST Chinese word segmentation tool.
5. The method for enhancing scene realism in text-to-scene conversion according to claim 1, characterized in that, in step 3), applying word frequency statistics to the entity nouns in the segmentation result to obtain statistical indicators for the entity nouns specifically includes:
the traditional text feature model TF-IDF mainly considers the term frequency TF and the inverse document frequency IDF of a feature item; the term frequency TF is the number of times a feature item occurs in a document; for the scene concept model, n documents of a given category C are obtained to form document set A, and the number of times an entity noun w occurs in the document set of category C is one of the important references for obtaining the scene concept dictionary;
for each document set A, using the stop-word-filtered Chinese documents, the occurrence frequency of each entity noun that appears in the n documents is counted;
the word frequency $f_i$ of word $w_i$ in A is defined as
$$f_i = \frac{\mathrm{count}(w_i, A)}{\mathrm{size}(A)}, \quad 0 < f_i < 1$$
where $\mathrm{count}(w_i, A)$ is the number of times word $w_i$ occurs in document set A, and $\mathrm{size}(A)$ is the total number of occurrences of all entity nouns in A;
the inverse document frequency IDF is then applied; IDF quantifies how a feature item is distributed across the document set, and is computed as follows: let the total number of documents in document set A be N and the number of documents containing word w be n; the inverse document frequency in the scene model is then defined as:
$$\mathrm{sidf}(w_i) = \log\left(\frac{n}{N} + 1\right).$$
6. The method for enhancing scene realism in text-to-scene conversion according to claim 1, characterized in that, in step 4), using the statistical indicators of the entity nouns from step 3) to analyze the scene category features specifically includes: defining the list as a word vector $\vec{w} = (w_1, w_2, \ldots, w_t)$ composed of entity nouns, which gives the conceptual representation of the scene category;
each document is examined on the assumption that its scene is composed of multiple entity nouns; for a document scene $\vec{w} = (w_1, w_2, \ldots, w_N)$, with $p(w_n)$ denoting the probability of scene entity word $w_n$, the probability of generating the scene $\vec{w}$ described by the document is:
$$p(\vec{w}) = \prod_{n=1}^{N} p(w_n)$$
because the documents selected for the scene corpus mostly describe their scene through landscape description, and that scene is unique, it is assumed that a scene s is first chosen for the document and the entities the described scene needs are then generated according to that scene; the scene described by the document is unique; assuming the scene categories are $s_1, s_2, \ldots, s_k$, the probability of generating the document scene is:
$$p(\vec{w}) = p(s_1)\prod_{n=1}^{N} p(w_n \mid s_1) + \ldots + p(s_k)\prod_{n=1}^{N} p(w_n \mid s_k) = \sum_{s} p(s)\prod_{n=1}^{N} p(w_n \mid s)$$
after a value of t is selected, the first t items of the scene category feature word list are randomly divided into two scene categories $s_1$ and $s_2$, and probability analysis is then carried out on each document; if, for a document $\vec{w}$, the generation probability turns out to be:
$$p(\vec{w}) = p(s_1)\prod_{n=1}^{U} p(w_n \mid s_1) + p(s_2)\prod_{n=1}^{V} p(w_n \mid s_2)$$
where N = U + V, U is the number of entity nouns contained in scene type $s_1$, and V is the number of entity nouns contained in scene type $s_2$, then the selected value of t is unsuitable.
7. The method for realizing scene realism enhancement in text-to-scene conversion according to claim 6, characterized in that, when multiple sub-scenes appear within the first t words, the value of t is still judged by traversal bisection: the bisection split point ranges over [2, t-1], and each split is traversed to judge whether the value of t is suitable.
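The traversal in claim 7 can be sketched as an enumeration of split points. This is an illustrative reading rather than the claimed implementation: it assumes a split point m sends the first m feature words to one group and the remaining t - m words to the other, and the name `bisections` is hypothetical. How each split is scored for suitability is not specified in this section, so the sketch leaves that to the caller.

```python
from typing import Iterator, List, Sequence, Tuple


def bisections(
    feature_words: Sequence[str], t: int
) -> Iterator[Tuple[int, List[str], List[str]]]:
    """Yield (m, first_group, second_group) for every split point m in [2, t-1].

    Only the first t feature words are considered. Assumption: split point m
    means the first m words form one group and the rest form the other.
    """
    top = list(feature_words[:t])
    for m in range(2, t):  # range(2, t) covers 2 .. t-1 inclusive
        yield m, top[:m], top[m:]


if __name__ == "__main__":
    # Hypothetical feature-word list, for illustration only.
    words = ["mountain", "river", "pine", "beach", "wave", "shell"]
    for m, first, second in bisections(words, t=5):
        print(m, first, second)
```

For t smaller than 3 the range [2, t-1] is empty and no split is produced.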
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810011163.8A CN108108482B (en) | 2018-01-05 | 2018-01-05 | Method for realizing scene reality enhancement in scene conversion |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108108482A (en) | 2018-06-01 |
CN108108482B (en) | 2022-02-11 |
Family
ID=62219845
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810011163.8A Active CN108108482B (en) | 2018-01-05 | 2018-01-05 | Method for realizing scene reality enhancement in scene conversion |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108108482B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7016828B1 (en) * | 2000-10-23 | 2006-03-21 | At&T Corp. | Text-to-scene conversion |
US7664313B1 (en) * | 2000-10-23 | 2010-02-16 | At&T Intellectual Property Ii, L.P. | Text-to scene conversion |
US20120330869A1 (en) * | 2011-06-25 | 2012-12-27 | Jayson Theordore Durham | Mental Model Elicitation Device (MMED) Methods and Apparatus |
CN105069716A (en) * | 2015-07-28 | 2015-11-18 | 史喻 | Sight spot information push method satisfying user consultation |
CN107220321A (en) * | 2017-05-19 | 2017-09-29 | 重庆邮电大学 | Solid threedimensional embodies in a kind of literary scape conversion method and its system |
Non-Patent Citations (2)
Title |
---|
FUPING YANG: "Preliminary Implementation of Text-to-scene System", 《INTERNATIONAL CONFERENCE ON INFORMATION SCIENCES, MACHINERY, MATERIALS AND ENERGY2015》 * |
FUPING YANG: "Scene Layout in Text-to-Scene Conversion", 《ICSAI 2014》 * |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110688483A (en) * | 2019-09-16 | 2020-01-14 | 重庆邮电大学 | Dictionary-based noun visibility labeling method, medium and system in context conversion |
CN110688483B (en) * | 2019-09-16 | 2022-10-18 | 重庆邮电大学 | Dictionary-based noun visibility labeling method, medium and system in context conversion |
CN111310444A (en) * | 2020-01-16 | 2020-06-19 | 北京大学 | Park landscape service identification method |
CN111814538A (en) * | 2020-05-25 | 2020-10-23 | 北京达佳互联信息技术有限公司 | Target object type identification method and device, electronic equipment and storage medium |
CN111814538B (en) * | 2020-05-25 | 2024-03-05 | 北京达佳互联信息技术有限公司 | Method and device for identifying category of target object, electronic equipment and storage medium |
CN112257386A (en) * | 2020-10-26 | 2021-01-22 | 重庆邮电大学 | Method for generating scene space relation information layout in scene conversion |
CN112257386B (en) * | 2020-10-26 | 2023-09-26 | 重庆邮电大学 | Method for generating scene space relation information layout in text-to-scene conversion |
CN116432623A (en) * | 2023-04-14 | 2023-07-14 | 嘉兴九州文化传媒有限公司 | Film and television shooting information management method for digital analysis |
CN116432623B (en) * | 2023-04-14 | 2023-09-22 | 嘉兴九州文化传媒有限公司 | Film and television shooting information management method for digital analysis |
Also Published As
Publication number | Publication date |
---|---|
CN108108482B (en) | 2022-02-11 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||