CN105069809B - Camera localization method and system based on a planar hybrid marker - Google Patents


Info

Publication number
CN105069809B
CN105069809B (granted publication of application CN201510547761.3A)
Authority
CN
China
Prior art keywords
image
words
bag
camera
marker
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510547761.3A
Other languages
Chinese (zh)
Other versions
CN105069809A (en)
Inventor
吴毅红
雷娟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Automation of Chinese Academy of Science
Original Assignee
Institute of Automation of Chinese Academy of Science
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Automation of Chinese Academy of Science
Priority to CN201510547761.3A
Publication of CN105069809A
Application granted
Publication of CN105069809B
Legal status: Active
Anticipated expiration


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/10: Image acquisition modality
    • G06T2207/10016: Video; Image sequence

Landscapes

  • Image Analysis (AREA)
  • Studio Devices (AREA)

Abstract

The invention provides a camera localization method and system based on a planar hybrid marker. The method comprises two parts, an off-line stage and an on-line stage. In the off-line stage, the feature points of the marker image are extracted and a bag-of-words model is built. In the on-line stage, feature matching is performed between the constructed bag-of-words model and the bag-of-words models of the images in an image database, and the database image that matches the marker image is found, from which the position and pose of the camera are obtained. The invention performs camera localization with a planar hybrid marker, combining the fast detection of artificial markers with the smooth tracking of natural-image markers, and thereby improves the stability and real-time performance of localization.

Description

Camera localization method and system based on a planar hybrid marker
Technical field
The invention belongs to the field of computer vision, and relates in particular to a camera localization method and system based on a planar hybrid marker.
Background technology
A planar marker is a class of engineered planar object used to assist a camera in visual localization. Because planar markers are easy to use and their detection is fast and robust, camera localization systems based on planar markers are among the most widely used vision-based mobile camera localization systems today. Planar markers fall into two classes: artificial markers and natural-image markers.
Artificial markers are widely used in camera localization scenes that lack image feature information. At the same time, many camera localization systems adopt artificial markers in order to simplify feature detection and reduce the impact on scene structure. The first camera localization system based on artificial markers was a teleconferencing system developed by the University of Washington in 1999, whose users could collaborate on a virtual whiteboard in a shared space. In 2003, the Vienna University of Technology designed an indoor building navigation system based on artificial markers, the first artificial-marker camera localization system to run standalone on a mobile device. Today, camera localization systems based on artificial markers appear in many application scenarios such as education and training, manufacturing and maintenance, commercial entertainment, and navigation. Although artificial markers are widely used in augmented-reality systems, problems remain: they are intrusive to the scene, they cannot handle partial occlusion, and their localization results are not smooth enough.
Unlike camera localization methods based on artificial markers, methods based on natural images use point, edge, and texture features of the image to recognize the marker and localize the camera. In real environments, planar markers usable for camera localization can be seen everywhere, for example books, covers of audio-visual products, photographs and paintings, product packaging, and advertising posters. Mobile camera localization based on planar natural images has therefore attracted wide attention since its inception. Compared with artificial markers, planar natural images are rich in texture, which benefits the robustness and precision of localization; however, methods that track natural features suffer from computationally heavy feature extraction and matching, which makes real-time implementation on mobile platforms difficult.
Summary of the invention
(1) Technical problem to be solved
The object of the present invention is to provide a camera localization method and system based on a planar hybrid marker that can improve the stability and real-time performance of camera localization.
(2) Technical solution
The present invention provides a camera localization method based on a planar hybrid marker, where the planar hybrid marker comprises a marker image and a frame surrounding the marker image. The method comprises:
S1, off-line stage: extract the feature points of the marker image, classify the feature points, and count the frequency of occurrence of each class of feature points to obtain the corresponding bag-of-words model;
S2, on-line stage: when the camera captures the planar hybrid marker, detect the marker image inside the frame, perform feature matching between the bag-of-words model and the bag-of-words models of the images in the image database, and find the database image that matches the marker image, thereby obtaining the position and pose of the camera.
The present invention also provides a camera localization system based on a planar hybrid marker, where the planar hybrid marker comprises a marker image and a frame surrounding the marker image. The system comprises:
an off-line device for extracting the feature points of the marker image, classifying the feature points, and counting the frequency of occurrence of each class of feature points to obtain the corresponding bag-of-words model;
an on-line device for detecting, when the camera captures the planar hybrid marker, the marker image inside the frame, performing feature matching between the bag-of-words model and the bag-of-words models of the images in the image database, and finding the database image that matches the marker image, thereby obtaining the position and pose of the camera.
(3) Beneficial effects
The camera localization method and system provided by the present invention perform camera localization with a planar hybrid marker, combining the fast detection of artificial markers with the smooth tracking of natural-image markers, and thus improve the stability and real-time performance of localization.
Brief description of the drawings
Fig. 1 is a flowchart of the camera localization method based on a planar hybrid marker provided by an embodiment of the present invention.
Fig. 2 is a schematic diagram of the planar hybrid marker provided by an embodiment of the present invention.
Fig. 3 is a flowchart of the off-line stage of the camera localization method provided by an embodiment of the present invention.
Fig. 4 shows the effect of obtaining the marker image in an embodiment of the present invention.
Fig. 5 shows the effect of epipolar-geometry verification in an embodiment of the present invention.
Fig. 6 is a schematic diagram of the camera localization system based on a planar hybrid marker provided by an embodiment of the present invention.
Fig. 7 shows the effect of the present embodiment performing augmented reality on a mobile platform.
Detailed description of the embodiments
The present invention provides a camera localization method and system based on a planar hybrid marker. In the off-line stage, the feature points of the marker image are extracted and a bag-of-words model is built. In the on-line stage, feature matching is performed between the constructed bag-of-words model and the bag-of-words models of the images in the image database, and the database image that matches the marker image is found, from which the position and pose of the camera are obtained.
According to one embodiment of the present invention, the planar hybrid marker comprises a quadrilateral marker image and a frame surrounding it. The frame may be a dark square border that delimits the marker image, so that binarization with adaptive thresholding yields a fast and stable detection result. Moreover, since the natural image added inside the dark border is related to the virtual object to be superimposed, the marker intuitively indicates to the user what content will be augmented.
According to one embodiment of the present invention, the camera localization method comprises:
S1, off-line stage: extract the feature points of the marker image, classify the feature points, and count the frequency of occurrence of each class of feature points to obtain the corresponding bag-of-words model;
S2, on-line stage: when the camera captures the planar hybrid marker, detect the marker image inside the frame, perform feature matching between the bag-of-words model and the bag-of-words models of the images in the image database, and find the database image that matches the marker image, thereby obtaining the position and pose of the camera.
According to one embodiment of the present invention, step S1 comprises:
S11, extract the feature points of the marker image and build their SURF descriptors;
S12, cluster the feature points by the distances between their descriptors to obtain a vocabulary; the clustering may use a method such as hierarchical k-means, and the vocabulary contains multiple image-feature words, each word corresponding to one class of feature points;
S13, count the frequency of occurrence of each class of feature points to obtain the bag-of-words model, which comprises the set of feature points and descriptors together with the frequency histogram over the vocabulary. The present invention uses the bag-of-words model to retrieve the potential marker region of a video frame quickly in the database, avoiding a linear search.
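Steps S11 to S13 can be sketched as follows. This is a minimal illustration, assuming flat (non-hierarchical) k-means and random 64-dimensional vectors standing in for SURF descriptors; the function names and parameters are illustrative, not the patent's actual implementation.

```python
import numpy as np

def build_vocabulary(descriptors, k, iters=20, seed=0):
    """Cluster descriptors into k visual words with plain Lloyd k-means."""
    rng = np.random.default_rng(seed)
    words = descriptors[rng.choice(len(descriptors), k, replace=False)]
    for _ in range(iters):
        # assign each descriptor to its nearest visual word
        d = np.linalg.norm(descriptors[:, None] - words[None], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):  # move each word to the mean of its members
            if np.any(labels == j):
                words[j] = descriptors[labels == j].mean(axis=0)
    return words

def bow_histogram(descriptors, words):
    """Normalized frequency histogram of visual-word occurrences (step S13)."""
    d = np.linalg.norm(descriptors[:, None] - words[None], axis=2)
    labels = d.argmin(axis=1)
    hist = np.bincount(labels, minlength=len(words)).astype(float)
    return hist / hist.sum()

# stand-in for the SURF descriptors of one marker image (SURF is 64-D)
rng = np.random.default_rng(1)
desc = rng.normal(size=(200, 64))
vocab = build_vocabulary(desc, k=10)
h = bow_histogram(desc, vocab)
print(h.sum())  # → 1.0
```

In the real pipeline the descriptors would come from a SURF extractor and the vocabulary would be built hierarchically, so that a descriptor is quantized to a word in logarithmic rather than linear time.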
According to one embodiment of the present invention, step S2 comprises:
S21, initialization stage: when the camera captures the planar hybrid marker, obtain the marker image inside the frame by contour detection; extract the feature points and descriptors of the marker image to obtain its bag-of-words model, and compare this model with the bag-of-words models of the database images by frequency histogram to obtain the matching image; compute the homography between the marker image and the matching image, obtain the rotation matrix and translation vector, and optimize them, thereby obtaining the initial position and pose of the camera. Because a dark border surrounds the marker image, false matches during marker-image detection are greatly reduced and the precision of feature-point matching is improved; compared with methods that track a natural image alone, the present invention recovers faster.
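The step in S21 that turns the homography into a rotation matrix and translation vector corresponds to the standard planar-homography decomposition. The sketch below assumes the marker lies in the plane z = 0, so that H is proportional to K[r1 r2 t]; all numeric values are synthetic, and this is not claimed to be the patent's exact procedure.

```python
import numpy as np

def pose_from_homography(K, H):
    """Recover [R | t] from a marker-plane homography, assuming the marker
    lies in the plane z = 0 so that H ~ K [r1 r2 t] up to scale."""
    B = np.linalg.inv(K) @ H
    if B[2, 2] < 0:              # the marker must be in front of the camera
        B = -B
    s = np.linalg.norm(B[:, 0])  # scale fixed by unit-norm rotation columns
    b1, b2, b3 = B[:, 0] / s, B[:, 1] / s, B[:, 2] / s
    r3 = np.cross(b1, b2)
    R = np.column_stack([b1, b2, r3])
    # re-orthonormalize R via SVD: with noise, b1 and b2 are only
    # approximately orthogonal unit vectors
    U, _, Vt = np.linalg.svd(R)
    return U @ Vt, b3

# round trip on a synthetic pose
K = np.array([[800., 0., 320.], [0., 800., 240.], [0., 0., 1.]])
R_true = np.eye(3)
t_true = np.array([0.1, -0.2, 2.0])
H = K @ np.column_stack([R_true[:, 0], R_true[:, 1], t_true])
R, t = pose_from_homography(K, H)
print(np.allclose(R, R_true) and np.allclose(t, t_true))  # → True
```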
S22, inter-frame tracking stage: when the captured video frame is frame t, use the feature points of the marker image in frame t-1 as tracking feature points; obtain the feature points in frame t and their bag-of-words model with a pyramidal optical-flow method; compare this model with the bag-of-words models of the database images by frequency histogram; and obtain the rotation matrix and translation vector of frame t and optimize them, thereby obtaining the position and pose of the camera at frame t.
S23, relocalization stage: when the inter-frame tracking quality falls below a threshold (the threshold may be taken as 0.1, for example), perform frame detection again, extract the feature points and the corresponding bag-of-words model of the marker image inside the frame, and compare this model with the one detected in the initialization stage; if the matching rate is too low, treat the marker image as a new marker image and enter the initialization stage.
According to one embodiment of the present invention, step S21 comprises:
S211, detect the image edges of the video frame and make the edges continuous with a pixel dilation operation to obtain multiple contours; approximate all contours by polygons to obtain multiple closed regions; when the ratio of a closed region's area to the area of the video frame exceeds a threshold, take that closed region as a marker-image candidate region.
S212, rectify the four vertices of the marker-image candidate region to a quadrilateral area to obtain a rectified image; extract the features of the rectified image and count the normalized frequency histogram of the occurrences of each word of the vocabulary to obtain the corresponding bag-of-words model; compare this model with the bag-of-words models of the database images by frequency histogram and select the k markers with the most similar histograms as candidates; normalize the k candidate markers to the size of the rectified image, perform feature matching and homography estimation against the rectified image, and take the candidate marker with the highest homography inlier rate as the recognition result of the rectified image;
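The rectification in S212 requires a homography mapping the four detected vertices to the corners of the rectified image. A minimal direct-linear-transform (DLT) sketch, with hypothetical vertex coordinates:

```python
import numpy as np

def homography_4pt(src, dst):
    """Direct linear transform: find H with dst ~ H src from 4 point pairs."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    _, _, Vt = np.linalg.svd(np.array(A, float))
    H = Vt[-1].reshape(3, 3)   # null-space vector gives H up to scale
    return H / H[2, 2]

def warp(H, p):
    q = H @ np.array([p[0], p[1], 1.0])
    return q[:2] / q[2]

# hypothetical detected vertices of a candidate region, mapped to a
# 100x100 rectified square
quad = [(12.0, 30.0), (140.0, 25.0), (150.0, 160.0), (8.0, 150.0)]
rect = [(0.0, 0.0), (100.0, 0.0), (100.0, 100.0), (0.0, 100.0)]
H = homography_4pt(quad, rect)
err = max(np.linalg.norm(warp(H, q) - np.array(r)) for q, r in zip(quad, rect))
print(err < 1e-6)  # the four vertices land exactly on the square corners
```

A production implementation would also normalize the point coordinates before the SVD for better conditioning; the raw DLT is enough to show the idea.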
S213, compute the matches between the feature points of the marker-image candidate region and the feature points of the recognition result, compute the homography between them, decompose it to obtain the rotation matrix and translation vector, and optimize with the following formula to obtain the initial position and pose of the camera:

\min_{R,t} \sum_{i=1}^{N} d(m_i, K[R\ t]X_i)^2

where K denotes the intrinsic parameters of the camera (including focal length and principal point), R the rotation matrix, t the translation vector, and N the number of image features on the marker image; X_i denotes the i-th image feature on the marker image, and d(m_i, K[R t]X_i) the geometric distance between the reprojection of X_i and the image point m_i. The problem is solved with the Levenberg-Marquardt iterative optimization algorithm.
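The objective minimized here can be evaluated directly. The sketch below computes the sum of squared reprojection distances d(m_i, K[R t]X_i)^2 for a synthetic pose; the Levenberg-Marquardt iterations themselves are omitted, and all numeric values are illustrative.

```python
import numpy as np

def project(K, R, t, X):
    """Project 3-D points X (N,3) with x = K [R | t] X, returning pixels."""
    x = (K @ (R @ X.T + t[:, None])).T   # (N, 3) homogeneous coordinates
    return x[:, :2] / x[:, 2:3]

def reprojection_error(K, R, t, X, m):
    """Sum of squared geometric distances d(m_i, K[R t] X_i)^2."""
    r = project(K, R, t, X) - m
    return float((r ** 2).sum())

# synthetic example: identity rotation, marker plane 2 units in front
K = np.array([[800., 0., 320.], [0., 800., 240.], [0., 0., 1.]])
R = np.eye(3)
t = np.array([0., 0., 2.])
X = np.array([[0., 0., 0.], [0.1, 0., 0.], [0., 0.1, 0.]])  # plane z = 0
m = project(K, R, t, X)                   # perfect observations
print(reprojection_error(K, R, t, X, m))  # → 0.0 at the true pose
```

An off-the-shelf least-squares solver (for example one offering a Levenberg-Marquardt mode) would minimize this residual over the six pose parameters.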
The present invention also provides a camera localization system based on a planar hybrid marker, comprising:
an off-line device for extracting the feature points of the marker image, classifying the feature points, and counting the frequency of occurrence of each class of feature points to obtain the corresponding bag-of-words model;
an on-line device for detecting, when the camera captures the planar hybrid marker, the marker image inside the frame, performing feature matching between the bag-of-words model and the bag-of-words models of the images in the image database, and finding the database image that matches the marker image, thereby obtaining the position and pose of the camera.
According to one embodiment of the present invention, the off-line device extracts the feature points of the marker image and their SURF descriptors, and clusters the feature points by the distances between the descriptors to obtain a vocabulary, where the vocabulary contains multiple image-feature words, each word corresponding to one class of feature points; it then counts the frequency of occurrence of each class of feature points to obtain the bag-of-words model, which comprises the set of feature points and descriptors together with the frequency histogram over the vocabulary.
According to one embodiment of the present invention, the on-line device comprises:
an initialization module for obtaining, when the camera captures the planar hybrid marker, the marker image inside the frame by contour detection; extracting the feature points and descriptors of the marker image to obtain its bag-of-words model; comparing this model with the bag-of-words models of the database images by frequency histogram to obtain the matching image; computing the homography between the marker image and the matching image; and obtaining the rotation matrix and translation vector and optimizing them, thereby obtaining the initial position and pose of the camera;
an inter-frame tracking module for using, when the captured video frame is frame t, the feature points of the marker image in frame t-1 as tracking feature points; obtaining the feature points in frame t and their bag-of-words model with a pyramidal optical-flow method; comparing this model with the bag-of-words models of the database images by frequency histogram; and obtaining the rotation matrix and translation vector of frame t and optimizing them, thereby obtaining the position and pose of the camera at frame t;
a relocalization module for performing frame detection when the inter-frame tracking quality falls below a threshold, extracting the feature points and the corresponding bag-of-words model of the marker image inside the frame, and comparing this model with the one detected in the initialization stage; if the matching rate is too low, the marker image is treated as a new marker image and the initialization stage is entered.
According to one embodiment of the present invention, the initialization module comprises:
a detection unit for detecting the image edges of the video frame, making the edges continuous with a pixel dilation operation to obtain multiple contours, approximating all contours by polygons to obtain multiple closed regions, and taking a closed region as a marker-image candidate region when the ratio of its area to the area of the video frame exceeds a threshold (for example, 0.1);
a recognition unit for rectifying the four vertices of the marker-image candidate region to a quadrilateral area to obtain a rectified image; extracting the features of the rectified image and counting the normalized frequency histogram of the occurrences of each word of the vocabulary to obtain the corresponding bag-of-words model; comparing this model with the bag-of-words models of the database images by frequency histogram and selecting the k markers with the most similar histograms as candidates; normalizing the k candidate markers to the size of the rectified image, performing feature matching and homography estimation against the rectified image, and taking the candidate marker with the highest homography inlier rate as the recognition result of the rectified image;
a positioning unit for computing the matches between the feature points of the marker-image candidate region and the feature points of the recognition result, computing the homography between them, decomposing it to obtain the rotation matrix and translation vector, and optimizing with the following formula to obtain the initial position and pose of the camera:

\min_{R,t} \sum_{i=1}^{N} d(m_i, K[R\ t]X_i)^2

where K denotes the intrinsic parameters of the camera (including focal length and principal point), R the rotation matrix, t the translation vector, and N the number of image features on the marker image; X_i denotes the i-th image feature on the marker image, and d(m_i, K[R t]X_i) the geometric distance between the reprojection of X_i and the image point m_i. The problem is solved with the Levenberg-Marquardt iterative optimization algorithm.
To make the object, technical solution, and advantages of the present invention clearer, the present invention is described in more detail below with reference to specific embodiments and the accompanying drawings.
Fig. 1 is a flowchart of the camera localization method based on a planar hybrid marker provided by an embodiment of the present invention. The planar hybrid marker, shown in Fig. 2, comprises a quadrilateral marker image and a frame surrounding it. The frame may be a dark square border that delimits the marker image, so that binarization with adaptive thresholding yields a fast and stable detection result. Moreover, since the natural image added inside the dark border is related to the virtual object to be superimposed, the marker intuitively indicates to the user what content will be augmented. As shown in Fig. 1, the camera localization method comprises:
S1, off-line stage: extract the feature points of the marker image, classify the feature points, and count the frequency of occurrence of each class of feature points to obtain the corresponding bag-of-words model.
As shown in Fig. 3, step S1 of the off-line stage specifically comprises:
S11, extract the feature points of the marker image and build their SURF descriptors;
S12, cluster the feature points with the hierarchical k-means method by the distances between their descriptors to obtain a vocabulary, where the vocabulary contains multiple image-feature words, each word corresponding to one class of feature points;
S13, count the frequency of occurrence of each class of feature points to obtain the bag-of-words model, which comprises the set of feature points and descriptors together with the frequency histogram over the vocabulary. The present embodiment uses the bag-of-words model to retrieve the potential marker region of a video frame quickly in the database, avoiding a linear search.
S2, on-line stage: when the camera captures the planar hybrid marker, detect the marker image inside the frame, perform feature matching between the bag-of-words model and the bag-of-words models of the images in the image database, and find the database image that matches the marker image, thereby obtaining the position and pose of the camera.
The on-line stage may specifically comprise an initialization stage, an inter-frame tracking stage, and a relocalization stage:
S21, initialization stage: when the camera captures the planar hybrid marker, obtain the marker image inside the frame by contour detection; extract the feature points and descriptors of the marker image to obtain its bag-of-words model, and compare this model with the bag-of-words models of the database images by frequency histogram to obtain the matching image; compute the homography between the marker image and the matching image, obtain the rotation matrix and translation vector, and optimize them, thereby obtaining the initial position and pose of the camera. Because a dark border surrounds the marker image, false matches during marker-image detection are greatly reduced and the precision of feature-point matching is improved; compared with methods that track a natural image alone, the present invention recovers faster. This embodiment describes the initialization stage in terms of three processes, namely detection, recognition, and positioning:
S211, as shown in Fig. 4, detect the image edges of the video frame and make the edges continuous with a pixel dilation operation to obtain multiple contours; approximate all contours by polygons to obtain multiple closed regions; when the ratio of a closed region's area to the area of the video frame exceeds 0.1, take that closed region as a marker-image candidate region.
S212, as shown in Fig. 5, rectify the four vertices of the marker-image candidate region to a quadrilateral area to obtain a rectified image; extract the features of the rectified image and count the normalized frequency histogram of the occurrences of each word of the vocabulary to obtain the corresponding bag-of-words model; compare this model with the bag-of-words models of the database images by frequency histogram and select the k markers with the most similar histograms as candidates; normalize the k candidate markers to the size of the rectified image, perform feature matching and homography estimation against the rectified image, and take the candidate marker with the highest homography inlier rate as the recognition result of the rectified image. Fig. 5 shows the result of geometric verification by epipolar geometry on the top 3 images returned for the query image by the bag-of-words model: the inlier rates of the matches with class 3, class 7, and class 8 are 44%, 17.6%, and 12% respectively, so class 3 is the final query result; here k is 3.
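The retrieve-then-verify logic of S212 can be sketched as follows: rank the database histograms by histogram intersection, keep the top k, and select the candidate with the highest geometric-verification inlier rate. The histograms below are toy values, and the inlier rates simply echo the numbers reported for the Fig. 5 example; none of this is the patent's actual data.

```python
import numpy as np

def top_k_by_histogram(query, database, k):
    """Rank database BoW histograms by intersection similarity; top-k ids."""
    sims = [np.minimum(query, h).sum() for h in database]  # histogram intersection
    return list(np.argsort(sims)[::-1][:k])

# toy normalized histograms over a 5-word vocabulary
query = np.array([0.4, 0.3, 0.1, 0.1, 0.1])
database = [
    np.array([0.1, 0.1, 0.3, 0.3, 0.2]),   # "class 7"
    np.array([0.4, 0.3, 0.1, 0.1, 0.1]),   # "class 3": identical to the query
    np.array([0.3, 0.3, 0.2, 0.1, 0.1]),   # "class 8"
]
candidates = top_k_by_histogram(query, database, k=3)
# geometric verification: keep the candidate with the highest homography
# inlier rate (the Fig. 5 example reports 44%, 17.6% and 12%)
inlier_rate = {1: 0.44, 0: 0.176, 2: 0.12}  # hypothetical verification results
best = max(candidates, key=lambda i: inlier_rate[i])
print(best)  # → 1 (the database image identical to the query wins)
```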
S213, compute the matches between the feature points of the marker-image candidate region and the feature points of the recognition result, compute the homography between them, decompose it to obtain the rotation matrix and translation vector, and optimize with the following formula to obtain the initial position and pose of the camera:

\min_{R,t} \sum_{i=1}^{N} d(m_i, K[R\ t]X_i)^2

where K denotes the intrinsic parameters of the camera (including focal length and principal point), R the rotation matrix, t the translation vector, and N the number of image features on the marker image; X_i denotes the i-th image feature on the marker image, and d(m_i, K[R t]X_i) the geometric distance between the reprojection of X_i and the image point m_i. The problem is solved with the Levenberg-Marquardt iterative optimization algorithm.
S22, inter-frame tracking stage: when the captured video frame is frame t, use the feature points of the marker image in frame t-1 as tracking feature points; obtain the feature points in frame t and their bag-of-words model with a pyramidal optical-flow method; compare this model with the bag-of-words models of the database images by frequency histogram; and obtain the rotation matrix and translation vector of frame t and optimize them, thereby obtaining the position and pose of the camera at frame t.
S23, relocalization stage: when the inter-frame tracking quality falls below a threshold (the threshold may be taken as 0.1, for example), perform frame detection again, extract the feature points and the corresponding bag-of-words model of the marker image inside the frame, and compare this model with the one detected in the initialization stage; if the matching rate is too low, treat the marker image as a new marker image and enter the initialization stage.
As shown in Fig. 6, an embodiment of the present invention also provides a camera localization system based on a planar hybrid marker, comprising:
an off-line device for extracting the feature points of the marker image, classifying the feature points, and counting the frequency of occurrence of each class of feature points to obtain the corresponding bag-of-words model; and
an on-line device comprising an initialization module, an inter-frame tracking module, and a relocalization module, where:
the initialization module comprises a detection unit, a recognition unit, and a positioning unit. The detection unit detects the image edges of the video frame, makes the edges continuous with a pixel dilation operation to obtain multiple contours, approximates all contours by polygons to obtain multiple closed regions, and takes a closed region as a marker-image candidate region when the ratio of its area to the area of the video frame exceeds 0.1. The recognition unit rectifies the four vertices of the marker-image candidate region to a quadrilateral area to obtain a rectified image, extracts the features of the rectified image and counts the normalized frequency histogram of the occurrences of each word of the vocabulary to obtain the corresponding bag-of-words model, compares this model with the bag-of-words models of the database images by frequency histogram, selects the k markers with the most similar histograms as candidates, normalizes the k candidate markers to the size of the rectified image, performs feature matching and homography estimation against the rectified image, and takes the candidate marker with the highest homography inlier rate as the recognition result of the rectified image. The positioning unit computes the matches between the feature points of the marker-image candidate region and the feature points of the recognition result, computes the homography between them, decomposes it to obtain the rotation matrix and translation vector, and optimizes with the following formula to obtain the initial position and pose of the camera:

\min_{R,t} \sum_{i=1}^{N} d(m_i, K[R\ t]X_i)^2

where K denotes the intrinsic parameters of the camera (including focal length and principal point), R the rotation matrix, t the translation vector, and N the number of image features on the marker image; X_i denotes the i-th image feature on the marker image, and d(m_i, K[R t]X_i) the geometric distance between the reprojection of X_i and the image point m_i. The problem is solved with the Levenberg-Marquardt iterative optimization algorithm.
The inter-frame tracking module uses, when the captured video frame is frame t, the feature points of the marker image in frame t-1 as tracking feature points, obtains the feature points in frame t and their bag-of-words model with a pyramidal optical-flow method, compares this model with the bag-of-words models of the database images by frequency histogram, and obtains the rotation matrix and translation vector of frame t and optimizes them, thereby obtaining the position and pose of the camera at frame t.
The relocalization module performs frame detection when the inter-frame tracking quality falls below a threshold, extracts the feature points and the corresponding bag-of-words model of the marker image inside the frame, and compares this model with the one detected in the initialization stage; if the matching rate is too low, the marker image is treated as a new marker image and the initialization stage is entered.
The planar hybrid marker of this embodiment was placed in a relatively cluttered desktop scene; representative video frames in which the virtual object is superimposed while the marker undergoes large rotation, scale change and viewpoint change are shown in Fig. 7. Frames (a)~(i) are screenshots captured directly on an iPad, and frames (j) to (l) show the running effect from other viewing angles to demonstrate the tracking effect on the mobile device more intuitively. It can be seen that even when the marker undergoes wide-angle rotation or scale change in the image, the mobile augmented reality system can still superimpose the virtual object well.
For mobile augmented reality applications with planar markers, the present invention combines the advantages of manual markers and natural-image markers to design a new planar hybrid marker, and proposes a camera localization method based on this kind of planar marker. The camera localization method based on such a marker reduces the time needed to detect the marker in a video frame by detecting the border; at the same time, because the region for extracting feature points and descriptors is delimited, the influence of irrelevant content on the retrieval method is avoided to a certain extent, improving the accuracy of marker recognition. Experiments on the recognition efficiency and accuracy of 10 markers show that, thanks to the border, the algorithm can rectify the image content before recognition and thus cope well with large viewpoint, scale and rotation changes; since the algorithm uses SURF descriptors for recognition, it also has a certain robustness to illumination. Without geometric verification the recognition rate is 88%, and with geometric verification it reaches 98%. Experiments on the mobile platform show that interframe optical flow on natural features copes well with the rotation, scale and viewpoint changes occurring during tracking; the localization speed is 18 frames per second, achieving a real-time mobile augmented reality effect.
The specific embodiments described above further explain the purpose, technical solutions and beneficial effects of the present invention in detail. It should be understood that the foregoing are merely specific embodiments of the present invention and are not intended to limit it; any modification, equivalent substitution, improvement, etc. made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims (10)

1. A camera localization method based on a planar hybrid marker, characterized in that the planar hybrid marker comprises a quadrilateral marker image and a border surrounding the periphery of the marker image, the method comprising:
S1, an offline stage: extracting feature points of the marker image, classifying the feature points, and counting the frequency of occurrence of each class of feature points to obtain a corresponding bag-of-words model;
S2, an online stage: when the camera captures the planar hybrid marker, detecting the marker image within the border, performing feature matching between the bag-of-words model and the bag-of-words models of the images in an image database, and finding the image in the image database that matches the marker image, thereby obtaining the position and pose of the camera.
2. The camera localization method according to claim 1, characterized in that step S1 comprises:
S11, extracting the feature points of the marker image, and building SURF descriptors of the feature points;
S12, clustering the feature points according to the distances between the descriptors to obtain a vocabulary, wherein the vocabulary comprises a plurality of image feature words, each image feature word corresponding to one class of feature points;
S13, counting the frequency of occurrence of each class of feature points to obtain the bag-of-words model, wherein the bag-of-words model comprises the set of the feature points and descriptors and the frequency histogram over the vocabulary.
3. The camera localization method according to claim 2, characterized in that step S2 comprises:
S21, an initialization stage: when the camera captures the planar hybrid marker, obtaining the marker image within the border by contour detection, extracting the feature points and descriptors in the marker image to obtain a bag-of-words model, comparing this bag-of-words model by frequency histogram with the image bag-of-words models in the image database to obtain a matching image, computing the homography between the marker image and the matching image, and obtaining and optimizing the rotation matrix and translation vector, thereby obtaining the initial position and pose of the camera;
S22, an interframe tracking stage: when the video frame captured by the camera is frame t, using the feature points of the marker image in frame t-1 as tracking feature points, obtaining the feature points in frame t and their corresponding bag-of-words model by the pyramid optical flow method, then comparing this bag-of-words model by frequency histogram with the image bag-of-words models in the image database, and obtaining and optimizing the rotation matrix and translation vector of frame t, thereby obtaining the position and pose of the camera at frame t;
S23, a relocalization stage: when the interframe tracking quality falls below a certain threshold, performing border detection, extracting the feature points of the marker image within the border and the corresponding bag-of-words model, and comparing this bag-of-words model with the bag-of-words model detected in the initialization stage; if the matching rate is too low, taking the marker image as a new marker image and entering the initialization stage.
4. The camera localization method according to claim 3, characterized in that step S21 comprises:
S211, detecting the image edges of the video frame, applying a pixel dilation operation to make the edges continuous, and obtaining a plurality of contours; performing polygonal approximation on all contours to obtain a plurality of closed regions, and when the ratio of a closed region's area to the area of the video frame image exceeds a threshold, taking that closed region as a marker-image candidate region;
S212, rectifying the four vertices of the marker-image candidate region into a quadrilateral region to obtain a rectified image, extracting features from the rectified image and counting the normalized frequency histogram of occurrences of each word of the vocabulary to obtain a corresponding bag-of-words model, comparing this bag-of-words model by frequency histogram with the image bag-of-words models in the image database, selecting the k markers with the most similar frequency histograms as candidates, normalizing the k candidate markers to the size of the rectified image, and performing feature matching and homography estimation against the rectified image, with the candidate marker having the highest homography inlier rate taken as the recognition result of the rectified image;
S213, computing the matches between the feature points in the marker-image candidate region and the feature points in the recognition result, computing the homography between them and decomposing it to obtain the rotation matrix and translation vector, and optimizing them using the following formula to obtain the initial position and pose of the camera:
\{K, R, t\} = \arg\min_{K,R,t} \sum_{i=1,\ldots,N} d(m_i, K[R\ t]X_i)^2
where K denotes the intrinsic parameters of the camera, the intrinsic parameters including the focal length and principal point of the camera, R the rotation matrix, t the translation vector, N the number of image features on the marker image, X_i the i-th image feature on the marker image, and d(m_i, K[R t]X_i) the geometric distance between the reprojection of X_i and the image point m_i, the problem being solved with the Levenberg-Marquardt iterative optimization algorithm.
5. The camera localization method according to claim 4, characterized in that the threshold is 0.1.
6. A camera localization system based on a planar hybrid marker, characterized in that the planar hybrid marker comprises a quadrilateral marker image and a border surrounding the periphery of the marker image, the system comprising:
an offline device, configured to extract the feature points of the marker image, classify the feature points, and count the frequency of occurrence of each class of feature points to obtain a corresponding bag-of-words model;
an online device, configured to, when the camera captures the planar hybrid marker, detect the marker image within the border, perform feature matching between the bag-of-words model and the bag-of-words models of the images in an image database, and find the image in the image database that matches the marker image, thereby obtaining the position and pose of the camera.
7. The camera localization system according to claim 6, characterized in that the offline device extracts the feature points of the marker image and builds SURF descriptors of the feature points; clusters the feature points according to the distances between the descriptors to obtain a vocabulary, wherein the vocabulary comprises a plurality of image feature words, each image feature word corresponding to one class of feature points; and counts the frequency of occurrence of each class of feature points to obtain the bag-of-words model, wherein the bag-of-words model comprises the set of the feature points and descriptors and the frequency histogram over the vocabulary.
8. The camera localization system according to claim 7, characterized in that the online device comprises:
an initialization module, configured to, when the camera captures the planar hybrid marker, obtain the marker image within the border by contour detection, extract the feature points and descriptors in the marker image to obtain a bag-of-words model, compare this bag-of-words model by frequency histogram with the image bag-of-words models in the image database to obtain a matching image, compute the homography between the marker image and the matching image, and obtain and optimize the rotation matrix and translation vector, thereby obtaining the initial position and pose of the camera;
an interframe tracking module, configured to, when the video frame captured by the camera is frame t, use the feature points of the marker image in frame t-1 as tracking feature points, obtain the feature points in frame t and their corresponding bag-of-words model by the pyramid optical flow method, then compare this bag-of-words model by frequency histogram with the image bag-of-words models in the image database, and obtain and optimize the rotation matrix and translation vector of frame t, thereby obtaining the position and pose of the camera at frame t;
a relocalization module, configured to, when the interframe tracking quality falls below a certain threshold, perform border detection, extract the feature points of the marker image within the border and the corresponding bag-of-words model, and compare this bag-of-words model with the bag-of-words model detected in the initialization stage; if the matching rate is too low, take the marker image as a new marker image and enter the initialization stage.
9. The camera localization system according to claim 8, characterized in that the initialization module comprises:
a detection unit, configured to detect the image edges of the video frame, apply a pixel dilation operation to make the edges continuous, and obtain a plurality of contours; perform polygonal approximation on all contours to obtain a plurality of closed regions; and when the ratio of a closed region's area to the area of the video frame image exceeds a threshold, take that closed region as a marker-image candidate region;
a recognition unit, configured to rectify the four vertices of the marker-image candidate region into a quadrilateral region to obtain a rectified image, extract features from the rectified image and count the normalized frequency histogram of occurrences of each word of the vocabulary to obtain a corresponding bag-of-words model, compare this bag-of-words model by frequency histogram with the image bag-of-words models in the image database, select the k markers with the most similar frequency histograms as candidates, normalize the k candidate markers to the size of the rectified image, and perform feature matching and homography estimation against the rectified image, with the candidate marker having the highest homography inlier rate taken as the recognition result of the rectified image;
a positioning unit, configured to compute the matches between the feature points in the marker-image candidate region and the feature points in the recognition result, compute the homography between them and decompose it to obtain the rotation matrix and translation vector, and optimize them using the following formula to obtain the initial position and pose of the camera:
\{K, R, t\} = \arg\min_{K,R,t} \sum_{i=1,\ldots,N} d(m_i, K[R\ t]X_i)^2
where K denotes the intrinsic parameters of the camera, the intrinsic parameters including the focal length and principal point of the camera, R the rotation matrix, t the translation vector, N the number of image features on the marker image, X_i the i-th image feature on the marker image, and d(m_i, K[R t]X_i) the geometric distance between the reprojection of X_i and the image point m_i, the problem being solved with the Levenberg-Marquardt iterative optimization algorithm.
10. The camera localization system according to claim 9, characterized in that the threshold is 0.1.
CN201510547761.3A 2015-08-31 2015-08-31 A kind of camera localization method and system based on planar hybrid marker Active CN105069809B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510547761.3A CN105069809B (en) 2015-08-31 2015-08-31 A kind of camera localization method and system based on planar hybrid marker


Publications (2)

Publication Number Publication Date
CN105069809A CN105069809A (en) 2015-11-18
CN105069809B true CN105069809B (en) 2017-10-03

Family

ID=54499166

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510547761.3A Active CN105069809B (en) 2015-08-31 2015-08-31 A kind of camera localization method and system based on planar hybrid marker

Country Status (1)

Country Link
CN (1) CN105069809B (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107103002A (en) * 2016-02-22 2017-08-29 南京中兴新软件有限责任公司 The search method and device of image
CN106651942B (en) * 2016-09-29 2019-09-17 苏州中科广视文化科技有限公司 Three-dimensional rotating detection and rotary shaft localization method based on characteristic point
CN106875446B (en) * 2017-02-20 2019-09-20 清华大学 Camera method for relocating and device
CN110517319B (en) * 2017-07-07 2022-03-15 腾讯科技(深圳)有限公司 Method for determining camera attitude information and related device
CN108946488A (en) * 2017-12-20 2018-12-07 江苏耐维思通科技股份有限公司 One kind unloading volume platform region driving lifting identification device
CN108447092B (en) * 2018-02-06 2020-07-28 中国科学院自动化研究所 Method and device for visually positioning marker
CN108615248B (en) 2018-04-27 2022-04-05 腾讯科技(深圳)有限公司 Method, device and equipment for relocating camera attitude tracking process and storage medium
CN110728711B (en) * 2018-07-17 2021-11-12 北京三快在线科技有限公司 Positioning and mapping method and device, and positioning method, device and system
CN109269493A (en) * 2018-08-31 2019-01-25 北京三快在线科技有限公司 A kind of localization method and device, mobile device and computer readable storage medium
CN111383270B (en) * 2018-12-27 2023-12-29 深圳市优必选科技有限公司 Object positioning method, device, computer equipment and storage medium
CN110673727B (en) * 2019-09-23 2023-07-21 浙江赛弘众智网络科技有限公司 AR remote assistance method and system
CN112800806B (en) * 2019-11-13 2023-10-13 深圳市优必选科技股份有限公司 Object pose detection tracking method and device, electronic equipment and storage medium
CN112950677A (en) * 2021-01-12 2021-06-11 湖北航天技术研究院总体设计所 Image tracking simulation method, device, equipment and storage medium
CN114979470A (en) * 2022-05-12 2022-08-30 咪咕文化科技有限公司 Camera rotation angle analysis method, device, equipment and storage medium
CN114937082B (en) * 2022-05-20 2024-07-23 长春理工大学 Image positioning method based on imprecise line search

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102592285A (en) * 2012-03-05 2012-07-18 上海海事大学 Online calibration method of vision system of unmanned surface vessel
CN103903013A (en) * 2014-04-15 2014-07-02 复旦大学 Optimization algorithm of unmarked flat object recognition
CN104359461A (en) * 2014-11-06 2015-02-18 中国人民解放军装备学院 Binocular vision measuring system having variable structure and parameter determining method


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Zhengyou Zhang, "A Flexible New Technique for Camera Calibration," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 11, pp. 1330-1334, Nov. 2000. *
Juan Lei et al., "Efficient Pose Tracking on Mobile Phones with 3D Points Grouping," 2014 IEEE International Conference on Multimedia and Expo, Jul. 2014, pp. 1-6. *
Guo Tao et al., "Camera Calibration under a Small Field of View," Chinese Journal of Lasers, vol. 39, no. 8, pp. 1-5, Aug. 2012. *


Similar Documents

Publication Publication Date Title
CN105069809B (en) A kind of camera localization method and system based on planar hybrid marker
US10528847B2 (en) Method of providing image feature descriptors
CN103839277B (en) A kind of mobile augmented reality register method of outdoor largescale natural scene
CN106997597B (en) It is a kind of based on have supervision conspicuousness detection method for tracking target
CN108062574B (en) Weak supervision target detection method based on specific category space constraint
CN110580723B (en) Method for carrying out accurate positioning by utilizing deep learning and computer vision
Holte et al. A local 3-D motion descriptor for multi-view human action recognition from 4-D spatio-temporal interest points
Li et al. Robust visual tracking based on convolutional features with illumination and occlusion handing
CN103530881B (en) Be applicable to the Outdoor Augmented Reality no marks point Tracing Registration method of mobile terminal
CN107369183A (en) Towards the MAR Tracing Registration method and system based on figure optimization SLAM
CN109345568A (en) Sports ground intelligent implementing method and system based on computer vision algorithms make
CN104050475A (en) Reality augmenting system and method based on image feature matching
CN103903013A (en) Optimization algorithm of unmarked flat object recognition
CN104102904B (en) A kind of static gesture identification method
CN109087261A (en) Face antidote based on untethered acquisition scene
CN103886619A (en) Multi-scale superpixel-fused target tracking method
CN105488541A (en) Natural feature point identification method based on machine learning in augmented reality system
Kerdvibulvech et al. Vision-based detection of guitar players' fingertips without markers
CN108564043B (en) Human body behavior recognition method based on space-time distribution diagram
CN113012298B (en) Curved MARK three-dimensional registration augmented reality method based on region detection
CN107220597A (en) A kind of key frame extraction method based on local feature and bag of words human action identification process
CN108694348B (en) Tracking registration method and device based on natural features
Xu et al. Semantic Part RCNN for Real-World Pedestrian Detection.
Huang et al. A vision-based Taiwanese sign language Recognition
CN111062393A (en) Natural scene Chinese character segmentation method based on spectral clustering

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant