CN108108482A - Method for realizing scene reality enhancement in text-to-scene conversion - Google Patents

Method for realizing scene reality enhancement in text-to-scene conversion

Info

Publication number
CN108108482A
Authority
CN
China
Prior art keywords
scene
document
word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810011163.8A
Other languages
Chinese (zh)
Other versions
CN108108482B (en)
Inventor
杨富平
刘凯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University of Post and Telecommunications
Original Assignee
Chongqing University of Post and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University of Post and Telecommunications
Priority to CN201810011163.8A
Publication of CN108108482A
Application granted
Publication of CN108108482B
Active (current legal status)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/358 Browsing; Visualisation therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/242 Dictionaries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention claims a method for realizing scene reality enhancement in text-to-scene conversion. The method comprises: Step 1: obtain multiple Chinese documents describing a given scene from the Internet and build a scene corpus. Step 2: segment the documents in the corpus into words without removing duplicates, then remove stop words from the segmented documents. Step 3: using the processed documents, perform a statistical analysis of the entity nouns in the scene descriptions. Step 4: use the statistical indicators to analyze the features of the scene category. Step 5: represent the scene conceptually with entity nouns and establish a scene concept dictionary. The invention aims to associate entity words with scene words at the level of the "word category", analyze the general features of the category, and thereby represent a given scene word conceptually. This supports the analysis of scene entity elements in text-to-scene conversion, so that the generated scene conforms to people's common-sense cognition and has a complete background environment, which enhances the realism of the scene.

Description

Method for realizing scene reality enhancement in text-to-scene conversion
Technical field
The invention belongs to the field of text visualization, and more particularly relates to a method for realizing scene reality enhancement in text-to-scene conversion.
Background
Text-to-scene conversion is a frontier research topic. In essence it converts the symbols of textual information into a visual, simulated representation. Several reasons motivate modeling this conversion process. One significant application is modeling a person's psychological state; another is aiding the understanding of a story. A third related field is cognitive modeling, in which a passage of description is interpreted using a large amount of diverse knowledge. Text visualization must not only match the description in the text but, more importantly, match the actual conditions of the scene and its entities.
According to the domestic and international scientific literature, existing research on text-to-scene conversion focuses mainly on text semantics and on analyzing relations, such as spatial relations, between the described entities. The "scene" described by the text has not been investigated in depth. In existing text-to-scene conversion systems the input is mostly plain text, and the generated scene contains only the entities described in the text. The scene has no evident category features or background environment, lacks realism, and is of limited practical value.
The present invention aims to solve the above problems of the prior art. As understood here, a text-to-scene conversion system associates knowledge with scene images, building a bridge from knowledge to scene. A scene is composed of a group of associated scene objects, and the entity objects in the generated scene image correspond to the scene entity nouns in the text description. Scenes differ by category, and different scenes are composed of different scene objects: in a desert scene one sees a vast desert, withered branches and leaves, cacti or camels, but not a boundless ocean. This gives scene classification a basis in reality. In image understanding and classification, three approaches to scene classification are currently popular: (1) object-based scene classification; (2) region-based scene classification; (3) context-based scene classification. The object-based and context-based methods show that scenes can be classified by studying their semantic objects, and that scenes of different categories have different sets of semantic objects. On this basis, the set of semantic objects of a scene category is defined here as the conceptual representation of its category word, and a concept model relating category words to semantic objects is established. From the perspective of text, the relation between scene category words and semantic objects is analyzed; from the perspective of common sense, the analysis focuses on which entities a given scene contains in daily life. The scene generated by a text-to-scene conversion system is supplemented accordingly, so that it better conforms to people's common-sense cognition and its realism is enhanced.
Summary of the invention
The present invention aims to solve the above problems of the prior art and proposes a method for enhancing the realism of a scene. The technical solution of the invention is as follows:
A method for realizing scene reality enhancement in text-to-scene conversion, comprising the following steps:
1) obtaining multiple Chinese documents describing a given scene from the Internet and building a scene corpus; the invention currently targets Chinese text-to-scene conversion systems;
2) segmenting the Chinese document set describing the scene into words without removing duplicates, and then removing stop words from the segmented Chinese documents;
3) using the stop-word-filtered segmentation result of the Chinese document set from step 2), applying word-frequency statistics to the entity nouns in the segmentation result to obtain the statistical indicators of the entity nouns;
4) using the statistical indicators of the entity nouns from step 3), building the feature word list of the scene category to which the document set corresponds;
5) using the scene category feature word list of step 4), analyzing and extracting the optimal scene category feature words and establishing a scene entity dictionary.
Further, the scene corpus of step 1) is built from documents of the same scene category; the scene corpus is a document set with evident scene features.
Further, the scene concept model of step 1) represents a scene category conceptually by a word vector $\vec{w} = (w_1, w_2, \ldots, w_t)$ formed from entity nouns, where $w_t$ denotes an entity noun. Each scene category corresponds to one group of related word vectors. The subscript t is defined as the threshold of the concept dictionary and is also the length of the word vector. By obtaining a large number of documents of the same category, the entity nouns that occur frequently in the documents and are associated with category C are counted to form the word vector $(w_1, w_2, \ldots, w_m)$, where m is the number of entity nouns; together with the threshold t, this vector determines the scene concept dictionary of scene category C.
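As a rough illustration of the word-vector idea, the sketch below ranks entity nouns by raw occurrence count and keeps the first t as a concept dictionary. The function name `build_concept_dictionary` and the toy noun list are hypothetical, and ranking by raw count is a simplifying assumption; the method described here later refines the ranking with a document-frequency indicator.

```python
from collections import Counter


def build_concept_dictionary(entity_nouns, t):
    """Return the t most frequent entity nouns (hypothetical helper).

    entity_nouns: flat list of entity nouns taken from one scene category,
    duplicates kept. t: assumed threshold, i.e. the word-vector length.
    """
    counts = Counter(entity_nouns)
    # most_common() sorts by descending count; ties keep first-seen order.
    ranked = [word for word, _ in counts.most_common()]
    return ranked[:t]


# Toy data, not taken from the patent's corpus.
nouns = ["cake", "mother", "cake", "candle", "mother", "cake", "motherland"]
dictionary = build_concept_dictionary(nouns, t=3)
print(dictionary)
```

Choosing t is the hard part: too small a value under-describes the scene, too large a value pulls in words from unrelated scenes.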
Further, step 2) segments the Chinese documents in the scene corpus into words without removing duplicates and then removes stop words from the segmented Chinese documents, specifically:
The acquired documents are first denoised by removing words such as advertising terms and English links, and are then segmented with the ROST Chinese word segmentation tool.
Further, step 3) applies word-frequency statistics to the entity nouns in the stop-word-filtered segmentation result of step 2) to obtain the statistical indicators of the entity nouns, specifically:
The traditional TF-IDF text feature model mainly considers the frequency information TF and the inverse document frequency information IDF of a feature item. The feature item frequency TF is the number of times the feature item occurs in a document. For the scene concept model, n documents of a category C are obtained to form the document set A; the number of times an entity noun w occurs in the document set of category C is one of the important references for obtaining the scene concept dictionary.
For each document set A, the stop-word-filtered Chinese documents are used to count how frequently each entity noun occurs across the n documents.
The word frequency $f_i$ of word $w_i$ in A is defined as
$$f_i = \frac{\mathrm{count}(w_i, A)}{\mathrm{size}(A)}, \quad 0 < f_i < 1,$$
where $\mathrm{count}(w_i, A)$ is the number of times word $w_i$ occurs in document set A and $\mathrm{size}(A)$ is the total number of occurrences of all entity nouns in A.
The inverse document frequency IDF is then applied. IDF quantifies how a feature item is distributed over the document set. Let the total number of documents in document set A be N and the number of documents containing word $w_i$ be n; the inverse document frequency in the scene model is then defined as
$$\mathrm{sidf}(w_i) = \log\left(\frac{n}{N} + 1\right).$$
Further, step 4) uses the statistical indicators of the entity nouns from step 3) to analyze the scene category features, specifically: the list is defined as a word vector $\vec{w} = (w_1, w_2, \ldots, w_t)$ of entity nouns that represents the scene category conceptually.
Each document is studied under the assumption that its scene is composed of multiple entity nouns. For a document scene $\vec{w} = (w_1, \ldots, w_N)$, with $p(w_n)$ denoting the probability of scene entity word $w_n$, the probability of generating the scene described by the document is
$$p(\vec{w}) = \prod_{n=1}^{N} p(w_n).$$
Because the documents selected for the scene corpus describe their scene mainly through scenery, and that scene is unique, assume that a document first selects a scene s and then generates the entities that the described scene needs according to s. Assuming the scene categories are $s_1, s_2, \ldots, s_k$, the probability of generating the document scene is
$$p(\vec{w}) = p(s_1)\prod_{n=1}^{N} p(w_n \mid s_1) + \ldots + p(s_k)\prod_{n=1}^{N} p(w_n \mid s_k) = \sum_{s} p(s)\prod_{n=1}^{N} p(w_n \mid s).$$
After a value of t is chosen, the first t entries of the scene category feature word list are arbitrarily divided into two scene categories $s_1$ and $s_2$. A probability analysis is then carried out for each document. If for some document $\vec{w}$ the generation probability is
$$p(\vec{w}) = p(s_1)\prod_{n=1}^{U} p(w_n \mid s_1) + p(s_2)\prod_{n=1}^{V} p(w_n \mid s_2),$$
where N = U + V, U is the number of entity nouns belonging to scene category $s_1$ and V is the number of entity nouns belonging to scene category $s_2$, then the chosen value of t is improper.
Further, for the case where multiple sub-scenes appear among the first t entries, the value of t is still judged by a traversing bisection: the split point ranges over [2, t-1], and each split is checked to judge whether t is suitable.
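The scene-mixture probability used in this step, p(w) = sum over scenes s of p(s) times the product of p(w_n | s), can be evaluated directly once the priors and conditional probabilities are known. The sketch below is a minimal illustration; the function name `scene_document_probability`, the scene labels and all probability values are made up for the example and do not come from the source.

```python
from math import prod


def scene_document_probability(words, scene_prior, word_given_scene):
    """Evaluate p(w) = sum_s p(s) * prod_n p(w_n | s).

    words: entity nouns of one document.
    scene_prior: dict scene -> p(s).
    word_given_scene: dict scene -> dict word -> p(word | scene);
    a word missing from a scene's table is treated as probability 0
    (an assumption of this sketch, no smoothing is applied).
    """
    total = 0.0
    for scene, p_s in scene_prior.items():
        cond = word_given_scene[scene]
        total += p_s * prod(cond.get(w, 0.0) for w in words)
    return total


# Illustrative numbers only.
scene_prior = {"celebrate_birthday": 0.8, "national_day": 0.2}
word_given_scene = {
    "celebrate_birthday": {"cake": 0.5, "candle": 0.25},
    "national_day": {"motherland": 0.5},
}
p = scene_document_probability(["cake", "candle"], scene_prior, word_given_scene)
print(p)
```

With these numbers only the birthday term contributes, because "cake" and "candle" have zero probability under the other scene; a document whose words come from a single scene behaves this way in the model.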
The advantages and beneficial effects of the invention are as follows:
Compared with the prior art, the invention establishes a scene concept model: given a document set describing a scene category as input, the entity nouns corresponding to that scene category are obtained, which establishes an association between the scene word and the entity nouns at the level of the scene category. This supports the analysis of scene entity elements in text-to-scene conversion, so that the generated scene conforms to people's common-sense cognition and has a complete background environment, which enhances the realism of the scene.
Description of the drawings
Fig. 1 is a flow chart of the method for establishing a scene concept model in text-to-scene conversion according to a preferred embodiment of the invention;
Fig. 2 is a system function diagram of the scene concept model in the application.
Detailed description of the embodiments
The technical solutions in the embodiments of the invention are described clearly and in detail below with reference to the drawings of the embodiments. The described embodiments are only some of the embodiments of the invention.
The technical solution by which the invention solves the above technical problem is:
A method for realizing scene reality enhancement in text-to-scene conversion, comprising:
Step 1: obtain multiple Chinese documents describing a given scene from the Internet and build a scene corpus.
Step 2: segment the documents in the corpus into words without removing duplicates, then remove stop words from the segmented documents.
Step 3: using the stop-word-filtered segmentation result of the Chinese document set from step 2, apply word-frequency statistics to the entity nouns in the segmentation result to obtain the statistical indicators of the entity nouns.
Step 4: using the statistical indicators of the entity nouns from step 3, build the feature word list of the scene category to which the document set corresponds.
Step 5: using the scene category feature word list of step 4, analyze and extract the optimal scene category feature words and establish a scene entity dictionary.
In the method for establishing a scene concept model in text-to-scene conversion, step 1 comprises:
For a scene category C, multiple documents describing scene category C are crawled from the Internet, a hundred or more. The example uses 200 compositions describing the "birthday" scene crawled from Baidu Zuowen (Baidu's composition site).
In the method for establishing a scene concept model in text-to-scene conversion, step 2 comprises:
The acquired documents are first denoised by removing advertising terms and English links, and are then segmented with the ROST Chinese word segmentation tool.
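The denoising and stop-word steps can be sketched as follows. This is only an illustration under stated assumptions: the regular expressions, the helper names `denoise` and `remove_stop_words`, and the tiny stop-word set are all hypothetical, and the segmentation itself (done with the ROST tool in the source) is not reproduced, so the stop-word filter takes an already segmented token list.

```python
import re

# Assumed noise patterns: links made of printable ASCII, then leftover Latin words.
URL_PATTERN = re.compile(r"(?:https?://|www\.)[\x21-\x7e]+")
ASCII_WORD_PATTERN = re.compile(r"[A-Za-z]+")


def denoise(text):
    """Remove links and English words from raw text before segmentation."""
    text = URL_PATTERN.sub(" ", text)
    return ASCII_WORD_PATTERN.sub(" ", text)


def remove_stop_words(tokens, stop_words):
    """Drop stop words but keep repeated tokens (no deduplication)."""
    return [tok for tok in tokens if tok and tok not in stop_words]


cleaned = denoise("生日蛋糕 http://example.com/ad click 蜡烛")
tokens = remove_stop_words(["我", "的", "蛋糕", "的", "蛋糕"], {"我", "的"})
print(cleaned.split(), tokens)
```

Keeping duplicates matters because the later word-frequency statistics count every occurrence of an entity noun.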
In the method for establishing a scene concept model in text-to-scene conversion, step 3 comprises:
Step 3 performs the statistical analysis of entity nouns, for which traditional text feature extraction offers a good line of thinking. The TF-IDF model mainly considers the frequency information TF and the inverse document frequency information IDF of a feature item. The feature item frequency (TF) is the number of times the feature item occurs in a document. Documents of different categories differ greatly in the occurrence probability of some feature items. For the scene concept model, n documents of a category C are obtained to form the document set A; the number of times an entity noun w occurs in the document set of category C is one of the important references for obtaining the scene concept dictionary.
For each document set A, the result of step 2 is used to count how frequently each entity noun occurs across the n documents.
The word frequency $f_i$ of word $w_i$ in A is defined as
$$f_i = \frac{\mathrm{count}(w_i, A)}{\mathrm{size}(A)}, \quad 0 < f_i < 1,$$
where $\mathrm{count}(w_i, A)$ is the number of times word $w_i$ occurs in document set A and $\mathrm{size}(A)$ is the total number of occurrences of all entity nouns in A.
The frequency information of entity nouns cannot cope with every situation, because classification may be incomplete. For example, if an entity noun w appears in only one document of the category document set A but occurs many times there, the word is probably related only to that one article and has little to do with the category of document set A; the frequency indicator alone is then not enough to represent the features of category C. The inverse document frequency (IDF) quantifies how a feature item is distributed over the document set. In its common form IDF is computed as
$$\mathrm{idf}(T_k) = \log\frac{N}{n_k},$$
where N is the total number of documents in the document set and $n_k$ is the number of documents in which feature item $T_k$ occurs.
The core idea of the IDF algorithm is that a feature item occurring in most documents is less important than one occurring in only a small fraction of documents. IDF weakens the importance of high-frequency feature items that occur in most documents and strengthens the importance of low-frequency feature items that occur in a small fraction of documents. The scene concept model, however, analyzes the features of a scene category from a data set of a single given scene category, and those features should be the feature items that the documents in the data set mainly involve when describing the scene. Feature extraction for the scene concept model therefore keeps the feature items that occur in most documents and filters out the feature items that occur repeatedly in only a small fraction of documents.
Let the total number of documents in document set A be N and the number of documents containing word $w_i$ be n; the inverse document frequency in the scene model is then defined as
$$\mathrm{sidf}(w_i) = \log\left(\frac{n}{N} + 1\right).$$
For example, the concept modeling of the "birthday" scene mentioned in the text gives the following:
Entity noun    Word frequency    Documents containing the word    SIDF
Mother         403               79                               0.25
Cake           158               72                               0.23
Father         147               57                               0.19
Present        107               40                               0.14
Classmate      77                31                               0.117
China          60                7                                0.029
Motherland     50                10                               0.04
Candle         49                27                               0.103
Table 1
As Table 1 shows, the words "China" and "motherland", although they occur many times, receive SIDF values lower than that of "candle"; for the birthday scene this change is favorable.
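The two indicators can be computed along the following lines, taking the word frequency as count(w, A) / size(A) and SIDF as log(n / N + 1) as in claim 5. The function names and the toy documents are hypothetical, and the base-10 logarithm is an assumption, since the source does not state the base.

```python
import math
from collections import Counter


def word_frequencies(docs):
    """f_i = count(w_i, A) / size(A) over all entity-noun occurrences in A."""
    counts = Counter(tok for doc in docs for tok in doc)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def sidf_scores(docs):
    """sidf(w) = log(n / N + 1), n = documents containing w, N = all documents.

    Base 10 is assumed here; the source only writes "log".
    """
    n_docs = len(docs)
    doc_freq = Counter(w for doc in docs for w in set(doc))
    return {w: math.log10(n / n_docs + 1) for w, n in doc_freq.items()}


# Toy corpus: each document is a list of entity nouns, duplicates kept.
docs = [
    ["mother", "cake", "cake"],
    ["mother", "candle"],
    ["mother", "motherland", "motherland", "motherland"],
    ["cake", "mother"],
]
freq = word_frequencies(docs)
scores = sidf_scores(docs)
print(freq["cake"], freq["motherland"], scores["cake"], scores["motherland"])
```

In this toy corpus "cake" and "motherland" have the same word frequency, but "motherland" is concentrated in a single document and so receives the lower SIDF score, which is the filtering effect the indicator is meant to provide.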
In the method for establishing a scene concept model in text-to-scene conversion, step 4 comprises:
Step 4 uses the statistical indicators to analyze the features of the scene category. The preceding steps yield a ranked list of feature words for the scene category; how many of the leading entries of this list should be selected to represent the scene category conceptually is a problem that needs handling. The list is defined as a word vector $\vec{w} = (w_1, w_2, \ldots, w_t)$ of entity nouns that represents the scene category conceptually.
For the scene concept model, because incomplete classification occurs, the choice of the word-vector length t faces two problems:
(1) If t is too small, the scene category is not represented accurately enough.
(2) If t is too large, multiple scene categories are mixed.
Consider case (2): when t is too large, categories from several scenes are mixed. As Table 1 shows, after sorting by SIDF, when the initial value chosen for t is greater than 7 the scene features are mixed, namely the birthday scene (mother, father, cake, present, candle, classmate) and descriptions associated with National Day (China, motherland). The choice of t therefore needs reasonable handling and judgment.
A scene in a document consists of an occasion (the social environment) and scenery (the natural environment), and a scene description combines the description of the occasion with the description of the scenery. In the scene corpus, the scene described by a document is mainly a scenery description and is unique.
The scene corpus contains more than one scene s, and documents about the birthday may describe several scenes s. From the existing document set about birthdays it can be observed that the scene categories roughly include celebrating a birthday, reflecting on growing up, missing others, National Day, reflecting on the college entrance examination, and so on. Judging by word frequency and inverse document frequency, documents whose scene is celebrating a birthday are in the majority. This suggests a way to solve the problem. For the scene concept model, the scene documents acquired are primary-school compositions, and the scene of a primary-school composition is mostly unique.
A scene category C corresponds to multiple documents d, and each document describes a unique scene. Scene category C corresponds to many subdivided sub-scenes, and documents correspond one-to-one with scenes. The subdivided sub-scenes are therefore classified: for a given t, if the classification yields one category, the value of t is acceptable; if it yields two or more categories, the value of t is too large.
Each document is studied under the assumption that its scene is composed of multiple entity nouns. For a document scene $\vec{w} = (w_1, \ldots, w_N)$, with $p(w_n)$ denoting the probability of scene entity word $w_n$, the probability of generating the scene described by the document is
$$p(\vec{w}) = \prod_{n=1}^{N} p(w_n).$$
Because the documents selected for the scene corpus describe their scene mainly through scenery, and that scene is unique, assume that a document first selects a scene s and then generates the entities that the described scene needs according to s. Assuming the scene categories are $s_1, s_2, \ldots, s_k$, the probability of generating the document scene is
$$p(\vec{w}) = p(s_1)\prod_{n=1}^{N} p(w_n \mid s_1) + \ldots + p(s_k)\prod_{n=1}^{N} p(w_n \mid s_k) = \sum_{s} p(s)\prod_{n=1}^{N} p(w_n \mid s).$$
After a value of t is chosen, the first t entries of the scene category feature word list are arbitrarily divided into two scene categories $s_1$ and $s_2$. A probability analysis is then carried out for each document. If for some document $\vec{w}$ the generation probability is
$$p(\vec{w}) = p(s_1)\prod_{n=1}^{U} p(w_n \mid s_1) + p(s_2)\prod_{n=1}^{V} p(w_n \mid s_2),$$
where N = U + V, then the chosen value of t is improper.
Table 2
As Table 2 shows, when t is 7, the documents are not all expressed as a linear combination of $s_1$ and $s_2$. For the case where multiple sub-scenes appear among the first t entries, the value of t is still judged by a traversing bisection: the split point ranges over [2, t-1], and each split is checked to judge whether t is suitable.
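One possible reading of the traversing bisection is sketched below. It is an interpretation, not the procedure stated in the source: the helper names `split_separates` and `t_is_too_large` are hypothetical, and the criterion used (a split point after which no document uses words from both halves, while each half is used by some document) is an assumed stand-in for the probability analysis. Under that assumption, a clean separation suggests that the first t words cover two sub-scenes, so t is judged too large.

```python
def split_separates(docs, s1, s2):
    """True if no document mixes the two halves and both halves are used."""
    used1 = used2 = False
    for doc in docs:
        words = set(doc)
        in1, in2 = bool(words & s1), bool(words & s2)
        if in1 and in2:
            return False  # this document draws on both halves
        used1 = used1 or in1
        used2 = used2 or in2
    return used1 and used2


def t_is_too_large(docs, ranked_words, t):
    """Try every split point in [2, t-1] of the first t ranked words."""
    top = ranked_words[:t]
    for m in range(2, t):
        if split_separates(docs, set(top[:m]), set(top[m:])):
            return True
    return False


# Toy data: two birthday documents and two National Day documents.
ranked = ["mother", "cake", "candle", "motherland", "China"]
docs = [
    ["mother", "cake"],
    ["cake", "candle", "mother"],
    ["motherland", "China"],
    ["China", "motherland"],
]
print([t for t in range(3, 6) if not t_is_too_large(docs, ranked, t)])
```

On this toy data only t = 3 passes, since adding "motherland" or "China" lets the documents be split cleanly into two groups.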
The above embodiments should be understood as merely illustrating the invention and not as limiting its scope of protection. After reading the content recorded in the invention, a skilled person can make various changes or modifications to the invention, and these equivalent changes and modifications likewise fall within the scope of the claims of the invention.

Claims (7)

1. A method for realizing scene reality enhancement in text-to-scene conversion, characterized by comprising the following steps:
1) obtaining multiple Chinese documents describing a given scene from the Internet and building a scene corpus;
2) segmenting the Chinese document set describing the scene into words without removing duplicates, and then removing stop words from the segmented Chinese documents;
3) using the stop-word-filtered segmentation result of the Chinese document set from step 2), applying word-frequency statistics to the entity nouns in the segmentation result to obtain the statistical indicators of the entity nouns;
4) using the statistical indicators of the entity nouns from step 3), building the feature word list of the scene category to which the document set corresponds;
5) using the scene category feature word list of step 4), analyzing and extracting the optimal scene category feature words and establishing a scene entity dictionary.
2. The method for realizing scene reality enhancement in text-to-scene conversion according to claim 1, characterized in that the scene corpus of step 1) is built from documents of the same scene category, the scene corpus being a document set with evident scene features.
3. The method for realizing scene reality enhancement in text-to-scene conversion according to one of claims 1-2, characterized in that the scene entity model of step 1) represents a scene category as an entity concept by a word vector $\vec{w} = (w_1, w_2, \ldots, w_t)$ formed from entity nouns, where $w_t$ denotes an entity noun; each scene category corresponds to one group of related word vectors; the subscript t is defined as the threshold of the concept dictionary and is also the length of the word vector; by obtaining a large number of documents of the same category, the entity nouns that occur frequently in the documents and are associated with category C are counted to form the word vector $(w_1, w_2, \ldots, w_m)$, where m is the number of entity nouns; together with the threshold t, this vector determines the scene entity dictionary of scene category C.
4. The method for realizing scene reality enhancement in text-to-scene conversion according to claim 1, characterized in that step 2) segments the Chinese documents in the scene corpus into words without removing duplicates and then removes stop words from the segmented Chinese documents, specifically:
the acquired documents are first denoised by removing words such as advertising terms and English links, and are then segmented with the ROST Chinese word segmentation tool.
5. The method for realizing scene reality enhancement in text-to-scene conversion according to claim 1, characterized in that step 3) applies word-frequency statistics to the entity nouns in the segmentation result to obtain the statistical indicators of the entity nouns, specifically:
the traditional TF-IDF text feature model mainly considers the frequency information TF and the inverse document frequency information IDF of a feature item; the feature item frequency TF is the number of times the feature item occurs in a document; for the scene concept model, n documents of a category C are obtained to form the document set A, and the number of times an entity noun w occurs in the document set of category C is one of the important references for obtaining the scene concept dictionary;
for each document set A, the stop-word-filtered Chinese documents are used to count how frequently each entity noun occurs across the n documents;
the word frequency $f_i$ of word $w_i$ in A is defined as
$$f_i = \frac{\mathrm{count}(w_i, A)}{\mathrm{size}(A)}, \quad 0 < f_i < 1,$$
where $\mathrm{count}(w_i, A)$ is the number of times word $w_i$ occurs in document set A and $\mathrm{size}(A)$ is the total number of occurrences of all entity nouns in A;
the inverse document frequency IDF is then applied, IDF being a quantification of how a feature item is distributed over the document set; let the total number of documents in document set A be N and the number of documents containing word $w_i$ be n; the inverse document frequency in the scene model is then defined as
$$\mathrm{sidf}(w_i) = \log\left(\frac{n}{N} + 1\right).$$
6. the method for scene authenticity enhancing is realized in a kind of literary scape conversion according to claim 1, which is characterized in that institute Statistical indicator of the step 4) using the substantive noun of step 3) is stated, scene type feature is analyzed, specifically includes;Defining the list is The term vector that substantive noun is formedRepresentation of concept is carried out to scene type;
Each document is studied individually. Its scene is assumed to be made up of multiple substantive nouns, so that the document scene is $\vec{w}$ (with scene entity words $w_1, \ldots, w_N$). With $p(w_n)$ denoting the probability of the scene entity word $w_n$, the probability of generating the scene $\vec{w}$ described by the document is:

$$p(\vec{w}) = \prod_{n=1}^{N} p(w_n)$$

The selected documents mostly describe landscapes, and each describes a single scene. It is therefore assumed that, for a given document, a scene $s$ is first selected from the scene corpus, and the entities needed for the scene described by the document are then generated according to that scene; the scene described by the document is unique. Assuming the scene types are $s_1, s_2, \ldots, s_k$, the probability of generating the document scene is:

$$p(\vec{w}) = p(s_1)\prod_{n=1}^{N} p(w_n \mid s_1) + \cdots + p(s_k)\prod_{n=1}^{N} p(w_n \mid s_k) = \sum_{s} p(s)\prod_{n=1}^{N} p(w_n \mid s)$$

After a value of t is selected, the first t feature words in the scene-type feature word list are randomly divided into two scene types $s_1$ and $s_2$. Probability analysis is then carried out on every document. If the result for a document $\vec{w}$ is that its generation probability is:

$$p(\vec{w}) = p(s_1)\prod_{n=1}^{U} p(w_n \mid s_1) + p(s_2)\prod_{n=1}^{V} p(w_n \mid s_2)$$

where $N = U + V$, $U$ is the number of substantive nouns contained in scene type $s_1$, and $V$ is the number of substantive nouns contained in scene type $s_2$, then the selected value of t is improper.
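The two generation probabilities used in claim 6 can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the function names, the toy scene types ("lake", "mountain"), and all probability values are hypothetical and do not come from the patent, and words unseen under a scene type are simply given probability zero (no smoothing).

```python
import math

def unigram_probability(words, p_word):
    """p(w) = prod_n p(w_n): every entity word is drawn independently."""
    return math.prod(p_word[w] for w in words)

def mixture_probability(words, p_scene, p_word_given_scene):
    """p(w) = sum_s p(s) * prod_n p(w_n | s): one scene type per document."""
    return sum(
        # Assumption: a word never seen under scene type s has probability 0.
        p_scene[s] * math.prod(p_word_given_scene[s].get(w, 0.0) for w in words)
        for s in p_scene
    )

# Hypothetical toy distributions, chosen only to make the arithmetic easy.
p_word = {"water": 0.25, "boat": 0.25, "rock": 0.25, "pine": 0.25}
p_scene = {"lake": 0.5, "mountain": 0.5}
p_word_given_scene = {
    "lake": {"water": 0.5, "boat": 0.5},
    "mountain": {"rock": 0.5, "pine": 0.5},
}
doc = ["water", "boat"]

print(unigram_probability(doc, p_word))
print(mixture_probability(doc, p_scene, p_word_given_scene))
```

In this toy case only the "lake" term contributes to the mixture, which matches the claim's assumption that a document describes a single scene.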
7. The method for realizing scene authenticity enhancement in text-to-scene conversion according to claim 6, characterized in that, when multiple sub-scenes appear among the first t feature words, the t value is still judged by a traversal two-way split (dichotomy): the split point takes each value in the range [2, t-1], and the traversal determines whether the t value is suitable.
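The traversal over split points in claim 7 can be sketched in Python as follows. This is a hedged illustration, not the patented procedure: it assumes that split point m assigns the first m feature words to $s_1$ and the remaining words of the first t to $s_2$, and it only reports the counts U and V for each split, leaving the final suitability judgment to the caller. The function names and the example data are hypothetical.

```python
def split_counts(doc_nouns, feature_words, t, m):
    """Return (U, V) for split point m.

    Assumption: s1 is the first m feature words and s2 is the rest of
    the first t feature words. U and V count the document's nouns that
    fall into s1 and s2 respectively.
    """
    s1 = set(feature_words[:m])
    s2 = set(feature_words[m:t])
    u = sum(1 for w in doc_nouns if w in s1)
    v = sum(1 for w in doc_nouns if w in s2)
    return u, v

def traverse_splits(doc_nouns, feature_words, t):
    """Yield (m, U, V) for every split point m in [2, t-1]."""
    for m in range(2, t):
        yield (m, *split_counts(doc_nouns, feature_words, t, m))

# Hypothetical ranked feature word list and one document's entity nouns.
feature_words = ["lake", "boat", "water", "rock", "pine"]
doc_nouns = ["lake", "water", "pine"]

for m, u, v in traverse_splits(doc_nouns, feature_words, t=5):
    print(m, u, v)
```

Reporting the raw counts rather than a verdict keeps the sketch neutral about how the counts across all documents are combined into the decision on t.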
CN201810011163.8A 2018-01-05 2018-01-05 Method for realizing scene reality enhancement in scene conversion Active CN108108482B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810011163.8A CN108108482B (en) 2018-01-05 2018-01-05 Method for realizing scene reality enhancement in scene conversion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810011163.8A CN108108482B (en) 2018-01-05 2018-01-05 Method for realizing scene reality enhancement in scene conversion

Publications (2)

Publication Number Publication Date
CN108108482A CN108108482A (en) 2018-06-01
CN108108482B CN108108482B (en) 2022-02-11

Family

ID=62219845

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810011163.8A Active CN108108482B (en) 2018-01-05 2018-01-05 Method for realizing scene reality enhancement in scene conversion

Country Status (1)

Country Link
CN (1) CN108108482B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110688483A (en) * 2019-09-16 2020-01-14 重庆邮电大学 Dictionary-based noun visibility labeling method, medium and system in context conversion
CN111310444A (en) * 2020-01-16 2020-06-19 北京大学 Park landscape service identification method
CN111814538A (en) * 2020-05-25 2020-10-23 北京达佳互联信息技术有限公司 Target object type identification method and device, electronic equipment and storage medium
CN112257386A (en) * 2020-10-26 2021-01-22 重庆邮电大学 Method for generating scene space relation information layout in scene conversion
CN116432623A (en) * 2023-04-14 2023-07-14 嘉兴九州文化传媒有限公司 Film and television shooting information management method for digital analysis

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7016828B1 (en) * 2000-10-23 2006-03-21 At&T Corp. Text-to-scene conversion
US7664313B1 (en) * 2000-10-23 2010-02-16 At&T Intellectual Property Ii, L.P. Text-to scene conversion
US20120330869A1 (en) * 2011-06-25 2012-12-27 Jayson Theordore Durham Mental Model Elicitation Device (MMED) Methods and Apparatus
CN105069716A (en) * 2015-07-28 2015-11-18 史喻 Sight spot information push method satisfying user consultation
CN107220321A (en) * 2017-05-19 2017-09-29 重庆邮电大学 Method and system for three-dimensional materialization of entities in text-to-scene conversion

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
FUPING YANG: "Preliminary Implementation of Text-to-scene System", 《INTERNATIONAL CONFERENCE ON INFORMATION SCIENCES, MACHINERY, MATERIALS AND ENERGY 2015》 *
FUPING YANG: "Scene Layout in Text-to-Scene Conversion", 《ICSAI 2014》 *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110688483A (en) * 2019-09-16 2020-01-14 重庆邮电大学 Dictionary-based noun visibility labeling method, medium and system in context conversion
CN110688483B (en) * 2019-09-16 2022-10-18 重庆邮电大学 Dictionary-based noun visibility labeling method, medium and system in context conversion
CN111310444A (en) * 2020-01-16 2020-06-19 北京大学 Park landscape service identification method
CN111814538A (en) * 2020-05-25 2020-10-23 北京达佳互联信息技术有限公司 Target object type identification method and device, electronic equipment and storage medium
CN111814538B (en) * 2020-05-25 2024-03-05 北京达佳互联信息技术有限公司 Method and device for identifying category of target object, electronic equipment and storage medium
CN112257386A (en) * 2020-10-26 2021-01-22 重庆邮电大学 Method for generating scene space relation information layout in scene conversion
CN112257386B (en) * 2020-10-26 2023-09-26 重庆邮电大学 Method for generating scene space relation information layout in text-to-scene conversion
CN116432623A (en) * 2023-04-14 2023-07-14 嘉兴九州文化传媒有限公司 Film and television shooting information management method for digital analysis
CN116432623B (en) * 2023-04-14 2023-09-22 嘉兴九州文化传媒有限公司 Film and television shooting information management method for digital analysis

Also Published As

Publication number Publication date
CN108108482B (en) 2022-02-11

Similar Documents

Publication Publication Date Title
CN108108482A (en) Method for realizing scene authenticity enhancement in text-to-scene conversion
Omar et al. Multi-label arabic text classification in online social networks
CN108052593B (en) Topic keyword extraction method based on topic word vector and network structure
Li et al. Data sets: Word embeddings learned from tweets and general data
CN104636402B (en) Method and system for classifying, searching, and pushing business objects
CN103678670B (en) Micro-blog hot word and hot topic mining system and method
CN106407169B (en) Document marking method based on topic model
CN103116637A (en) Text sentiment classification method facing Chinese Web comments
CN108920482B (en) Microblog short text classification method based on lexical chain feature extension and LDA (latent Dirichlet Allocation) model
CN110287329B (en) E-commerce category attribute mining method based on commodity text classification
CN104484343A (en) Topic detection and tracking method for microblog
CN110598219A (en) Emotion analysis method for broad-bean-net movie comment
CN109214454B (en) Microblog-oriented emotion community classification method
CN107273913A (en) Short text similarity calculation method based on multi-feature fusion
CN103092966A (en) Vocabulary mining method and device
CN112328794B (en) Typhoon event information aggregation method
CN107832307B (en) Chinese word segmentation method based on undirected graph and single-layer neural network
Samonte et al. Sentiment and opinion analysis on Twitter about local airlines
CN106777193A (en) A kind of method for writing specific contribution automatically
Peterlin et al. Automated content analysis: The review of the big data systemic discourse in tourism and hospitality
CN105354184A (en) Method for using optimized vector space model to automatically classify document
CN110019820A (en) Main suit and present illness history symptom Timing Coincidence Detection method in a kind of case history
CN107122465A (en) Construction method and system for a Tibetan sentiment dictionary based on Tibetan language features
Prayoga et al. Unsupervised Twitter Sentiment Analysis on The Revision of Indonesian Code Law and the Anti-Corruption Law using Combination Method of Lexicon Based and Agglomerative Hierarchical Clustering
Lin et al. Digital library information integration system based on big data and deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant