CN108763570A - Method and device for identifying identical house sources - Google Patents

Publication number
CN108763570A
Authority
CN
China
Prior art keywords
houses
source
pictures
characteristic value
picture set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810570338.9A
Other languages
Chinese (zh)
Inventor
周福涛
卢喜亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BEIJING TUOSHI HUANYU NETWORK TECHNOLOGY Co Ltd
Original Assignee
BEIJING TUOSHI HUANYU NETWORK TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BEIJING TUOSHI HUANYU NETWORK TECHNOLOGY Co Ltd
Priority to CN201810570338.9A
Publication of CN108763570A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75 Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/751 Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a method and device for identifying identical house sources. The method includes: obtaining the characteristic value of each picture in a first picture set and the characteristic value of each picture in a second picture set, where the first picture set includes pictures associated with a first house source and the second picture set includes pictures associated with a second house source; determining the number of similar pictures in the first picture set, where a similar picture is one whose characteristic value reaches a preset condition of similarity with the characteristic value of at least one picture in the second picture set; and, if the number of similar pictures is greater than a preset first threshold, determining that the first house source and the second house source are the identical house source. By counting the similar pictures among the pictures associated with two house sources, it can thus be judged whether the two house sources are identical. House sources on a house-source website can be deduplicated in this way, so that the website does not contain house sources with identical real-estate information, which improves the user experience.

Description

Method and device for identifying identical house sources
Technical field
The present invention relates to the field of identification technology, and in particular to a method and device for identifying identical house sources.
Background
At present, more and more users search house-source websites for house sources they want to buy or rent. To increase the exposure of the house sources they handle, real-estate brokers (hereinafter, brokers) often publish large amounts of real-estate information on such websites. Different brokers may publish real-estate information for the same house source on the same website, and even a single broker may publish the real-estate information of the same house source there several times. As a result, when a user searches for real-estate information, the search results the website presents may include many house sources with identical real-estate information, which degrades the user's experience of searching for house sources on the website.
Summary of the invention
The technical problem solved by the present invention is to provide a method and device for identifying identical house sources, so as to deduplicate house sources with identical real-estate information on a house-source website and thereby improve the user's experience of searching for house sources on the website.
To this end, in a first aspect, an embodiment of the present invention provides a method for identifying identical house sources, the method including:
obtaining the characteristic value of each picture in a first picture set and the characteristic value of each picture in a second picture set, where the first picture set includes pictures associated with a first house source and the second picture set includes pictures associated with a second house source;
determining the number of similar pictures in the first picture set, where the similarity between the characteristic value of a similar picture and the characteristic value of at least one picture in the second picture set reaches a preset condition; and
if the number of similar pictures is greater than a preset first threshold, determining that the first house source and the second house source are the identical house source.
In some possible embodiments, the similarity between the characteristic value of a similar picture and the characteristic value of at least one picture in the second picture set reaching the preset condition includes:
the Hamming distance between the characteristic value of the similar picture and the characteristic value of at least one picture in the second picture set being less than a preset second threshold.
In some possible embodiments, obtaining the characteristic value of each picture in the first picture set and the characteristic value of each picture in the second picture set includes:
downloading, according to the network address of each picture in the first picture set, the picture corresponding to that network address;
calculating the characteristic value of each downloaded picture in the first picture set; and
reading the characteristic value of each picture in the second picture set from a real-estate information library.
In some possible embodiments, the method further includes:
when the first house source and the second house source are not the identical house source, adding the characteristic value of each picture in the first picture set to the real-estate information library.
In some possible embodiments, obtaining the characteristic value of each picture in the first picture set and the characteristic value of each picture in the second picture set includes:
reading, from a real-estate information library, the characteristic value of each picture in the first picture set and the characteristic value of each picture in the second picture set.
In some possible embodiments, the method further includes:
obtaining the mark of the first house source; and
determining the second house source from the real-estate information library according to the mark.
In some possible embodiments, the method further includes:
obtaining the text description information of the first house source and the text description information of the second house source;
in which case, determining that the first house source and the second house source are the identical house source if the number of similar pictures is greater than the preset first threshold includes:
if the number of similar pictures is greater than the preset first threshold and the text description information of the first house source is the same as the text description information of the second house source, determining that the first house source and the second house source are the identical house source.
In a second aspect, an embodiment of the present invention further provides a device for identifying identical house sources, the device including:
an acquiring unit, configured to obtain the characteristic value of each picture in a first picture set and the characteristic value of each picture in a second picture set, where the first picture set includes pictures associated with a first house source and the second picture set includes pictures associated with a second house source;
a first determination unit, configured to determine the number of similar pictures in the first picture set, where the similarity between the characteristic value of a similar picture and the characteristic value of at least one picture in the second picture set reaches a preset condition; and
a second determination unit, configured to determine that the first house source and the second house source are the identical house source if the number of similar pictures is greater than a preset first threshold.
In some possible embodiments, the similarity between the characteristic value of a similar picture and the characteristic value of at least one picture in the second picture set reaching the preset condition includes:
the Hamming distance between the characteristic value of the similar picture and the characteristic value of at least one picture in the second picture set being less than a preset second threshold.
In some possible embodiments, the acquiring unit includes:
a download subunit, configured to download, according to the network address of each picture in the first picture set, the picture corresponding to that network address;
a computation subunit, configured to calculate the characteristic value of each downloaded picture in the first picture set; and
a reading subunit, configured to read the characteristic value of each picture in the second picture set from a real-estate information library.
In some possible embodiments, the device further includes:
an adding unit, configured to add the characteristic value of each picture in the first picture set to the real-estate information library when the first house source and the second house source are not the identical house source.
In some possible embodiments, the acquiring unit is specifically configured to read, from a real-estate information library, the characteristic value of each picture in the first picture set and the characteristic value of each picture in the second picture set.
In some possible embodiments, the device further includes:
a mark acquiring unit, configured to obtain the mark of the first house source; and
a third determination unit, configured to determine the second house source from the real-estate information library according to the mark.
In some possible embodiments, the device further includes:
an information acquisition unit, configured to obtain the text description information of the first house source and the text description information of the second house source;
the second determination unit being specifically configured to determine that the first house source and the second house source are the identical house source if the number of similar pictures is greater than the preset first threshold and the text description information of the first house source is the same as the text description information of the second house source.
As can be seen from the above technical solution, the present invention has the following advantages:
In the embodiments of the present invention, the characteristic value of each picture in the first picture set and the characteristic value of each picture in the second picture set can be obtained, where the first picture set includes pictures associated with the first house source and the second picture set includes pictures associated with the second house source. The number of similar pictures in the first picture set can then be determined, where the similarity between the characteristic value of each similar picture and the characteristic value of at least one picture in the second picture set reaches a preset condition. If the number of similar pictures is greater than the preset first threshold, the first house source and the second house source can be determined to be the identical house source. If two house sources are identical, the pictures associated with them will contain many similar pictures; therefore, counting the similar pictures among the pictures associated with two house sources makes it possible to judge whether the two are the identical house source. This can be used to determine, when a broker publishes the real-estate information of a house source, whether real-estate information for the identical house source already exists on the house-source website, or to clean up real-estate information for identical house sources already published on the website. House sources with identical real-estate information can thus be deduplicated, so that when a user searches for house sources on the website, the search results usually contain no house sources with identical real-estate information. The user can view the real-estate information of more distinct house sources on a single display interface, which improves the user's experience of searching for house sources on the website.
Brief description of the drawings
To explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic diagram of an exemplary scenario in an embodiment of the present invention;
Fig. 2 is a schematic flowchart of a method for identifying identical house sources in an embodiment of the present invention;
Fig. 3 is a schematic flowchart of another method for identifying identical house sources in an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a device for identifying identical house sources in an embodiment of the present invention.
Detailed description
To provide an implementation that improves the user's experience of searching for house sources on a house-source website, embodiments of the present invention provide a method and device for identifying identical house sources. Some embodiments of the present invention are described below with reference to the drawings. It should be understood that the preferred embodiments described here serve only to illustrate and explain the present invention and are not intended to limit it. Where no conflict arises, the embodiments of the present application and the features in those embodiments may be combined with one another.
The inventors have found through research that brokers usually publish large amounts of real-estate information on house-source websites. This information includes not only picture information of a house source, such as indoor pictures, outdoor pictures, and floor-plan pictures, but also text description information of the house source, such as the floor it is on, the direction its windows face, and its construction area. For the same house source, different brokers may publish its real-estate information on the same website, and even a single broker may publish it there several times, so the website ends up holding real-estate information for a large number of identical house sources. When a user searches for house sources on the website, the search results may then contain many house sources with identical real-estate information, which degrades the user's experience on the website. For example, when the website presents search results, a display page may consist entirely of real-estate information for the same house source; that is, the page effectively offers the user the real-estate information of only one house source. To obtain the real-estate information of other house sources, the user has to perform further operations, such as clicking "next page". This not only lowers the efficiency with which the user obtains real-estate information but also makes for a poor user experience.
To solve the above problem, an embodiment of the present invention provides a method for identifying identical house sources, so as to remove house sources with identical real-estate information from a house-source website and improve the user's experience of searching for house sources on the website. Specifically, the characteristic value of each picture in a first picture set and the characteristic value of each picture in a second picture set can be obtained, where the first picture set includes pictures associated with a first house source and the second picture set includes pictures associated with a second house source. The number of similar pictures in the first picture set can then be determined, where the similarity between the characteristic value of each similar picture and the characteristic value of at least one picture in the second picture set reaches a preset condition. If the number of similar pictures is greater than a preset first threshold, the first house source and the second house source can be determined to be the identical house source.
If two house sources are identical, the pictures associated with them will contain many similar pictures; therefore, counting the similar pictures among the pictures associated with two house sources makes it possible to judge whether the two are the identical house source. This can be used to determine, when a broker publishes the real-estate information of a house source, whether real-estate information for the identical house source already exists on the house-source website, or to clean up real-estate information for identical house sources already published on the website. House sources with identical real-estate information can thus be deduplicated, so that when a user searches for house sources on the website, the search results usually contain no house sources with identical real-estate information. The user can view the real-estate information of more distinct house sources on a single display interface, which improves the user's experience of searching for house sources on the website.
For example, the embodiments of the present invention can be applied in the exemplary scenario shown in Fig. 1. In this scenario, server 102 can judge whether two house sources are the identical house source. Specifically, when a broker wants to publish the real-estate information of a pending house source on a house-source website through terminal 101, terminal 101 can generate an identification request based on that real-estate information. The identification request includes the pictures associated with the pending house source, and terminal 101 sends the request to server 102. In response to the received request, server 102 can determine, according to the request, a reference house source on the website that belongs to the same category as the pending house source. Server 102 can then obtain the pictures associated with the reference house source and determine the number of similar pictures among the pictures associated with the pending house source, where the similarity between the characteristic value of a similar picture and the characteristic value of at least one picture associated with the reference house source reaches a preset condition. If the number of similar pictures is greater than the preset first threshold, server 102 can determine that the pending house source and the reference house source are the identical house source and return to terminal 101 a recognition result indicating that a house source identical to the pending one already exists on the website, so that terminal 101, according to the recognition result, refuses to let the broker publish the real-estate information of the pending house source on the website. If the number of similar pictures is not greater than the preset first threshold, server 102 can determine that the pending house source and the reference house source are different house sources and return to terminal 101 a recognition result indicating that no house source identical to the pending one exists on the website, so that terminal 101, according to the recognition result, allows the broker to publish the real-estate information of the pending house source on the website.
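The server-side decision in this scenario can be sketched as follows. This is only an illustration under stated assumptions: the function name `review_pending_house_source`, the result dictionary, and the `is_identical` callback are hypothetical and do not come from the patent, and the sketch checks several reference house sources where the text describes one.

```python
from typing import Callable, Dict, Optional, Sequence


def review_pending_house_source(
    pending_pictures: Sequence[int],
    reference_house_sources: Dict[str, Sequence[int]],
    is_identical: Callable[[Sequence[int], Sequence[int]], bool],
) -> Dict[str, Optional[object]]:
    """Return a recognition result for a pending house source.

    `pending_pictures` and the values of `reference_house_sources` are
    picture characteristic values; `is_identical` is whatever comparison
    the server uses (hypothetical callback, supplied by the caller).
    """
    for reference_id, reference_pictures in reference_house_sources.items():
        if is_identical(pending_pictures, reference_pictures):
            # An identical house source already exists: refuse publication.
            return {"allow_publication": False, "identical_to": reference_id}
    # No identical house source was found: publication may proceed.
    return {"allow_publication": True, "identical_to": None}
```

The terminal would then allow or refuse publication according to the `allow_publication` field of the returned result.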
It can be understood that the above scenario is only an example and does not limit the application scenarios of the embodiments of the present invention. The embodiments can also be applied to other scenarios; for example, house sources that already exist on a house-source website can be checked for duplicates so as to remove house sources with identical real-estate information, and this duplicate-checking process can be carried out on a server.
Referring to Fig. 2, which shows a schematic flowchart of a method for identifying identical house sources in an embodiment of the present invention, the method can specifically include:
S201: Obtain the characteristic value of each picture in a first picture set and the characteristic value of each picture in a second picture set, where the first picture set includes pictures associated with a first house source and the second picture set includes pictures associated with a second house source.
It can be understood that both the real-estate information of a house source a broker is about to publish and the real-estate information of a house source already published on the website usually come with pictures associated with the house source. These may be indoor pictures, such as pictures of the living room, master bedroom, second bedroom, or storage room; outdoor pictures, such as pictures of the building's exterior or the residential community it is in; or floor-plan pictures that show the layout of the house source. These published or to-be-published pictures can be taken as the pictures associated with the house source. The first picture set can then be obtained from the pictures associated with the first house source, and the second picture set from the pictures associated with the second house source.
As one exemplary implementation of obtaining characteristic values, a real-estate information library can be set up locally in advance, and the characteristic value of each picture associated with a house source can be stored in it. When the characteristic values of the pictures in the first picture set and the second picture set are needed, they can then be read directly from the real-estate information library.
For example, for each house source on the website, the server can store the characteristic values of the pictures associated with that house source in the real-estate information library in advance. When published house sources on the website need to be checked for duplicates and it is determined that the first house source and the second house source are to be compared, the characteristic values of the pictures associated with the first house source and of the pictures associated with the second house source can be read directly from the real-estate information library.
As another exemplary implementation of obtaining characteristic values, the characteristic values of the pictures in the first picture set and those of the pictures in the second picture set can be obtained by different paths. Specifically, the picture corresponding to the network address of each picture in the first picture set can be downloaded from the Internet, and a calculation can then be performed on each downloaded picture to obtain its characteristic value. The characteristic value of each picture in the second picture set can be read from a real-estate information library set up locally in advance, in which the characteristic value of each picture associated with the second house source is stored.
For example, when the server audits the real-estate information of a first house source that a broker wants to publish, it can obtain the URL (Uniform Resource Locator) of each picture associated with the first house source, download the corresponding picture from the Internet according to the URL, and then use OpenCV (Open Source Computer Vision Library) to process the downloaded picture and obtain its characteristic value. For a second house source already published on the website, the characteristic value of each picture associated with it can be read directly from the real-estate information library set up locally in advance.
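The text does not say which algorithm turns a picture into a characteristic value. As one possible illustration only, the sketch below uses an average hash, a common perceptual hash that yields a bit string suitable for Hamming-distance comparison. The choice of average hash, the function name `average_hash`, and the pure-Python grayscale-grid input are all assumptions made for this example; a real system would first decode and convert the downloaded picture (for instance with OpenCV).

```python
from typing import Sequence


def average_hash(gray: Sequence[Sequence[int]], hash_size: int = 8) -> int:
    """Return a (hash_size * hash_size)-bit average hash of a grayscale image.

    `gray` is a 2D grid of pixel intensities (0-255) whose height and
    width are both at least `hash_size`.
    """
    height, width = len(gray), len(gray[0])
    # Downsample: average the pixels in each cell of a hash_size x hash_size grid.
    cells = []
    for r in range(hash_size):
        r0, r1 = r * height // hash_size, (r + 1) * height // hash_size
        for c in range(hash_size):
            c0, c1 = c * width // hash_size, (c + 1) * width // hash_size
            block = [gray[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    # One bit per cell: 1 if the cell is brighter than the overall mean.
    value = 0
    for cell in cells:
        value = (value << 1) | (1 if cell > mean else 0)
    return value
```

Because the hash depends only on coarse brightness structure, rescaled copies of the same picture tend to produce the same or nearly the same value, which is what makes a small Hamming distance a plausible similarity test.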
Further, when the first house source and the second house source are different house sources, the pictures associated with them usually differ considerably, and so do the characteristic values of those pictures. To save computing resources in practice, the characteristic value of each picture in the first picture set can be added to the real-estate information library. When those characteristic values are needed again, the pictures in the first picture set do not have to be downloaded again and their characteristic values do not have to be recalculated; the values can be read directly from the real-estate information library. This avoids repeatedly downloading the pictures associated with the first house source and repeatedly calculating their characteristic values, and it also shortens the time needed to obtain the characteristic values and therefore the response time.
Of course, the two exemplary implementations above can be combined. In another exemplary implementation of obtaining characteristic values, the preset real-estate information library can first be queried for whether it contains the characteristic value of each picture in the first picture set and the second picture set. For a picture whose characteristic value is present, the value is read directly from the preset library; for a picture whose characteristic value is absent, the picture can be downloaded from the Internet according to its URL and its characteristic value can then be calculated.
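The combined lookup can be sketched as below, with the real-estate information library modeled as a plain dictionary keyed by picture URL. The function name `get_characteristic_values` and the `download` and `compute` callbacks are hypothetical stand-ins for the downloading and calculation steps. Newly computed values are returned separately rather than written back, so the caller can add them to the library once the two house sources are found not to be identical.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def get_characteristic_values(
    urls: Iterable[str],
    library: Dict[str, int],
    download: Callable[[str], bytes],
    compute: Callable[[bytes], int],
) -> Tuple[List[int], Dict[str, int]]:
    """Return (values, newly_computed) for the pictures at `urls`.

    Values already in `library` are read from it; the others are
    downloaded and computed. `library` itself is not modified.
    """
    values: List[int] = []
    newly_computed: Dict[str, int] = {}
    for url in urls:
        if url in library:
            values.append(library[url])          # read from the library
        else:
            if url not in newly_computed:        # download each picture once
                newly_computed[url] = compute(download(url))
            values.append(newly_computed[url])
    return values, newly_computed
```

After a "not identical" decision, `library.update(newly_computed)` would store the new characteristic values for later reuse.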
In practice, when identifying whether a first house source and a second house source are the identical house source, the first house source is usually known, and it is first determined whether a second house source belonging to the same category as the first exists; only then is it determined whether the two house sources are identical. Therefore, in one example, the second house source can be determined according to the mark of the first house source. Specifically, the mark of the first house source can be obtained. The mark can be used to identify the category of the first house source; for example, the geographical location of the first house source can serve as its mark. According to the obtained mark, a second house source that has the same mark can be looked up in the real-estate information library, so that it can subsequently be determined whether the first house source and the second house source are the identical house source.
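A minimal sketch of this lookup, assuming the library keeps an index from mark to house-source identifiers. The class name `HouseSourceIndex`, its methods, and the use of a string such as a community name as the mark are illustrative assumptions, not details given in the text.

```python
from collections import defaultdict
from typing import Dict, List, Optional


class HouseSourceIndex:
    """Index of house-source identifiers grouped by mark (hypothetical)."""

    def __init__(self) -> None:
        self._by_mark: Dict[str, List[str]] = defaultdict(list)

    def add(self, mark: str, house_source_id: str) -> None:
        """Register a published house source under its mark."""
        self._by_mark[mark].append(house_source_id)

    def candidates(self, mark: str, exclude: Optional[str] = None) -> List[str]:
        """Return house sources sharing `mark`, optionally skipping one id."""
        return [h for h in self._by_mark.get(mark, []) if h != exclude]
```

Restricting the picture comparison to candidates that share the mark keeps the number of pairwise comparisons small.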
S202: Determine the number of similar pictures in the first picture set, where the similarity between the characteristic value of a similar picture and the characteristic value of at least one picture in the second picture set reaches a preset condition.
In a specific implementation, for each picture in the first picture set, the characteristic value of that picture can be compared with the characteristic value of each picture in the second picture set, judging each time whether the similarity between the two characteristic values reaches the preset condition. If it does, the two characteristic values are highly similar, which indicates that the two pictures are highly similar; the picture from the first picture set can then be counted as a similar picture, meaning that the second picture set contains some picture that closely resembles it. If no picture in the second picture set has a characteristic value whose similarity to that picture's characteristic value reaches the preset condition, it can be determined that the second picture set contains no picture that closely resembles that picture in the first picture set.
As an example, the similarity degree between the characteristic values of two pictures can be determined using the Hamming distance. Specifically, if the Hamming distance between the characteristic values of two pictures does not exceed a preset second threshold, it can be determined that the difference between the two characteristic values is small and their similarity degree is high, which also indicates that the two pictures are highly similar. If the Hamming distance between the characteristic values of two pictures exceeds the preset second threshold, it can be determined that the difference between the two characteristic values is large and their similarity degree is low, which also indicates that the two pictures are not very similar. The preset second threshold can be set according to actual needs and is not limited here.
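A minimal sketch of this comparison is given below. It assumes that each characteristic value is a fixed-width bit string stored as an integer; the text here does not specify the encoding, so that representation, the function names, and the sample values are assumptions. The comparison uses "does not exceed", following the paragraph above.

```python
def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions at which two equal-width bit strings differ."""
    return bin(a ^ b).count("1")


def is_similar(a: int, b: int, second_threshold: int) -> bool:
    """True when the Hamming distance does not exceed the preset second threshold."""
    return hamming_distance(a, b) <= second_threshold


# Two made-up 8-bit characteristic values that differ in two bit positions.
x = 0b10110100
y = 0b10010110
d = hamming_distance(x, y)
```

With a second threshold of 2 these two values would count as similar; with a threshold of 1 they would not.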
S203: If the quantity of similar pictures is greater than a preset first threshold, determine that the first source of houses and the second source of houses are the identical source of houses.
It can be understood that there is usually more than one picture associated with the first source of houses or the second source of houses. If the first source of houses and the second source of houses are the identical source of houses, then among the pictures associated with the first source of houses there will usually be a relatively large number that are identical or highly similar to pictures associated with the second source of houses. If the first source of houses and the second source of houses are not the same source of houses, the pictures corresponding to the two sources of houses will have a low similarity degree, or there will be no identical pictures at all. Therefore, in this embodiment, whether the first source of houses and the second source of houses are the identical source of houses can be determined according to the quantity of similar pictures in the first picture set: if the quantity of similar pictures is greater than the preset first threshold, the two sources of houses can be determined to be the identical source of houses; if the quantity of similar pictures is not greater than the preset first threshold, the two sources of houses can be determined not to be the same source of houses. The preset first threshold can be set in advance according to actual conditions and is not limited here.
As an example, assume that the preset first threshold is 4, and that there are 6 pictures associated with the first source of houses: 4 indoor pictures showing the living room, master bedroom, secondary bedroom, and kitchen of the first source of houses, 1 floor plan picture, and 1 picture of the exterior. Meanwhile, there are 7 pictures associated with the second source of houses: 5 indoor pictures showing the living room, master bedroom, secondary bedroom, kitchen, and bathroom of the second source of houses, 1 floor plan picture, and 1 picture of the exterior. If, among the pictures associated with the first source of houses, the 5 pictures of the living room, master bedroom, secondary bedroom, kitchen, and floor plan are similar pictures, this indicates that the living room, master bedroom, secondary bedroom, kitchen, and floor plan of the two sources of houses are all the same. Since 5 is greater than the first threshold of 4, the first source of houses and the second source of houses can be judged to be the identical source of houses. If, among the pictures associated with the first source of houses, only the 2 pictures of the living room and master bedroom are similar pictures, this indicates that only the living room and master bedroom match between the first and second sources of houses, while the secondary bedroom, kitchen, floor plan, exterior, and so on differ. Since 2 is not greater than 4, the first source of houses and the second source of houses can be judged not to be the same source of houses.
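The counting and decision steps in this example can be sketched as follows. The 16-bit characteristic values are invented purely so that the counts match the example (5 similar pictures in one case, 2 in the other); the integer representation, the second threshold of 1, and the function names are assumptions rather than details from the application.

```python
from typing import Sequence


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two integer-encoded characteristic values."""
    return bin(a ^ b).count("1")


def count_similar(first_set: Sequence[int], second_set: Sequence[int],
                  second_threshold: int) -> int:
    """Count pictures in first_set with at least one close match in second_set."""
    return sum(
        any(hamming_distance(f, s) <= second_threshold for s in second_set)
        for f in first_set
    )


def is_identical(first_set: Sequence[int], second_set: Sequence[int],
                 first_threshold: int, second_threshold: int) -> bool:
    """Identical when the quantity of similar pictures exceeds the first threshold."""
    return count_similar(first_set, second_set, second_threshold) > first_threshold


# First source of houses: living room, master bedroom, secondary bedroom,
# kitchen, floor plan, exterior (made-up values).
first = [0x0F0F, 0x3333, 0x5555, 0x00FF, 0xF00F, 0xAAAA]
# Second source of houses, case 1: five close matches plus a bathroom and a
# different exterior.
second_case1 = [0x0F0E, 0x3333, 0x5557, 0x00FF, 0xFFFF, 0xF00F, 0x0000]
# Second source of houses, case 2: only living room and master bedroom match.
second_case2 = [0x0F0F, 0x3333, 0xFFFF, 0x0000]

n1 = count_similar(first, second_case1, second_threshold=1)
n2 = count_similar(first, second_case2, second_threshold=1)
same1 = is_identical(first, second_case1, first_threshold=4, second_threshold=1)
same2 = is_identical(first, second_case2, first_threshold=4, second_threshold=1)
```

Note that the count is taken over the first picture set only, so an extra picture in the second set (the bathroom here) does not affect the result.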
It should be noted that the preset first threshold can be a fixed value set in advance, but in practical application it can also be set according to actual conditions. In one example, the preset first threshold can be determined according to the quantity of pictures associated with the first source of houses. For example, if there are 8 pictures associated with the first source of houses, the preset first threshold can be a value such as 7, 6, 5, or 4; if there are 5 pictures associated with the first source of houses, the preset first threshold can be 4, 3, or the like. In another example, the preset first threshold can be determined as the product of a preset proportionality coefficient and the average number of pictures associated with the first source of houses and the second source of houses. For example, assuming the preset proportionality coefficient is 0.5, if the number of pictures associated with the first source of houses is 12 and the number of pictures associated with the second source of houses is 8, then the preset first threshold is 0.5 × ((12 + 8) / 2), that is, 5.
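The second way of setting the first threshold reduces to a one-line calculation, sketched here with the numbers from the example. The function name is hypothetical, and leaving the result unrounded is an assumption, since the text does not say how non-integer products are handled.

```python
def first_threshold(n_first: int, n_second: int, coefficient: float) -> float:
    """Proportionality coefficient times the average picture count of the two sources of houses."""
    return coefficient * ((n_first + n_second) / 2)


# 12 and 8 associated pictures with a coefficient of 0.5.
threshold = first_threshold(12, 8, 0.5)
```

Because the decision rule is "greater than", a threshold of 5 means at least 6 similar pictures are needed before the two sources of houses are treated as identical.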
In this embodiment, the characteristic value of each picture in the first picture set and the characteristic value of each picture in the second picture set can be obtained, where the first picture set includes pictures associated with the first source of houses and the second picture set includes pictures associated with the second source of houses. The quantity of similar pictures in the first picture set can then be determined, where the similarity degree between the characteristic value of each similar picture and the characteristic value of at least one picture in the second picture set reaches the preset condition. If the quantity of similar pictures is greater than the preset first threshold, the first source of houses and the second source of houses can be determined to be the identical source of houses. As can be seen, if two sources of houses are the identical source of houses, the pictures associated with them will contain many similar pictures; therefore, by determining the quantity of similar pictures among the associated pictures of two sources of houses, it can be judged whether the two sources of houses are the identical source of houses. In this way, it can be determined whether the information of real estate that a broker is about to publish for a given source of houses already exists on the source of houses website for an identical source of houses, or the method can be used to clean up information of real estate already published on the website for identical sources of houses. Sources of houses with identical information of real estate can thus be deduplicated on the website, so that when a user searches for sources of houses on the website, the search results presented usually contain no sources of houses with identical information of real estate. The user can view information of real estate for more distinct sources of houses on a single display interface, which improves the user's experience of searching for sources of houses on the website.
In practical application, whether two sources of houses are the identical source of houses can be judged not only from the pictures associated with the sources of houses but also, for further confirmation, in combination with the character description information of the sources of houses. Specifically, referring to Fig. 3, which shows a schematic flowchart of a method for identifying the identical source of houses in an embodiment of the present invention, the method may include:
S301: Obtain the character description information of the first source of houses and the character description information of the second source of houses.
It can be understood that the information of real estate published by a broker on a source of houses website usually includes character description information of the source of houses, such as its construction area, geographic location, orientation, price, and residential community name. If the first source of houses and the second source of houses are the identical source of houses, their character description information is generally also identical; if they are not the same source of houses, their character description information will generally differ. Therefore, the character description information of the first source of houses and that of the second source of houses can be used to determine whether the two are the identical source of houses.
S302: Obtain the characteristic value of each picture in the first picture set and the characteristic value of each picture in the second picture set, where the first picture set includes pictures associated with the first source of houses and the second picture set includes pictures associated with the second source of houses.
S303: Determine the quantity of similar pictures in the first picture set, where the similarity degree between the characteristic value of each similar picture and the characteristic value of at least one picture in the second picture set reaches a preset condition.
In this embodiment, steps S302 and S303 are implemented in a manner similar to steps S201 and S202 of the previous embodiment; for the specific implementation, refer to the relevant description in the previous embodiment, which is not repeated here.
S304: If the quantity of similar pictures is greater than the preset first threshold, and the character description information of the first source of houses is identical to the character description information of the second source of houses, determine that the first source of houses and the second source of houses are the identical source of houses.
It can be understood that if the first source of houses and the second source of houses are the identical source of houses, not only will the pictures associated with them contain a large number of identical or highly similar pictures, but their character description information will generally also be identical. Therefore, in this embodiment, the first source of houses and the second source of houses are determined to be the identical source of houses only when the quantity of similar pictures is greater than the preset first threshold and the character description information of the two sources of houses is also identical, which increases the accuracy of judging whether two sources of houses are the identical source of houses.
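The combined condition of step S304 can be sketched as below. The `Description` record and its field names are hypothetical; they simply mirror the kinds of character description information listed earlier (construction area, geographic location, orientation, price, community name), and exact field-by-field equality is assumed as the meaning of "identical".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Description:
    """Hypothetical container for the character description information of a source of houses."""
    area_m2: float
    location: str
    orientation: str
    price: int
    community: str


def is_identical(similar_count: int, first_threshold: float,
                 desc_a: Description, desc_b: Description) -> bool:
    """Both conditions must hold: enough similar pictures and identical descriptions."""
    return similar_count > first_threshold and desc_a == desc_b


a = Description(89.5, "Chaoyang", "south", 5200000, "Community A")
b = Description(89.5, "Chaoyang", "south", 5200000, "Community A")
c = Description(89.5, "Chaoyang", "south", 5350000, "Community A")  # different price

both_match = is_identical(5, 4, a, b)
text_differs = is_identical(5, 4, a, c)
too_few_pictures = is_identical(2, 4, a, b)
```

Requiring both conditions trades some recall for precision: two postings of the same source of houses at different prices would not be merged under this strict equality.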
In this embodiment, whether the first source of houses and the second source of houses are the identical source of houses is judged by combining the quantity of similar pictures with the character description information of the two sources of houses. In this way, it can be determined whether the information of real estate that a broker is about to publish for a given source of houses already exists on the source of houses website for an identical source of houses, or the method can be used to clean up information of real estate already published on the website for identical sources of houses. Sources of houses with identical information of real estate can thus be deduplicated on the website, so that when a user searches for sources of houses on the website, the search results presented usually contain no sources of houses with identical information of real estate. The user can view information of real estate for more distinct sources of houses on a single display interface, which improves the user's experience of searching for sources of houses on the website.
In addition, an embodiment of the present invention further provides a device for identifying the identical source of houses. Referring to Fig. 4, which shows a schematic structural diagram of a device for identifying the identical source of houses in an embodiment of the present invention, the device 400 may specifically include:
An acquiring unit 401, configured to obtain the characteristic value of each picture in a first picture set and the characteristic value of each picture in a second picture set, where the first picture set includes pictures associated with a first source of houses and the second picture set includes pictures associated with a second source of houses;
A first determination unit 402, configured to determine the quantity of similar pictures in the first picture set, where the similarity degree between the characteristic value of each similar picture and the characteristic value of at least one picture in the second picture set reaches a preset condition;
A second determination unit 403, configured to determine that the first source of houses and the second source of houses are the identical source of houses if the quantity of similar pictures is greater than a preset first threshold.
In some possible embodiments, the similarity degree between the characteristic value of the similar picture and the characteristic value of at least one picture in the second picture set reaching the preset condition includes:
The Hamming distance between the characteristic value of the similar picture and the characteristic value of at least one picture in the second picture set is less than a preset second threshold.
In some possible embodiments, the acquiring unit 401 includes:
A download subunit, configured to download, according to the network address of each picture in the first picture set, the picture corresponding to that network address;
A computation subunit, configured to calculate the characteristic value of each downloaded picture in the first picture set;
A reading subunit, configured to read the characteristic value of each picture in the second picture set from the information of real estate library.
In some possible embodiments, the device 400 further includes:
An adding unit, configured to add the characteristic value of each picture in the first picture set to the information of real estate library when the first source of houses and the second source of houses are not the identical source of houses.
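The download, computation, reading, and adding steps described above might be wired together roughly as follows. This is a sketch under several assumptions: the downloader and the characteristic-value function are passed in as callables because this passage does not specify either, the information of real estate library is modelled as a plain dictionary, and all names and the `example.invalid` addresses are hypothetical.

```python
from typing import Callable, Dict, Iterable, List


def compute_first_set_features(
    urls: Iterable[str],
    download: Callable[[str], bytes],
    compute_feature: Callable[[bytes], int],
) -> List[int]:
    """Download each picture by its network address and compute its characteristic value."""
    return [compute_feature(download(url)) for url in urls]


def read_second_set_features(library: Dict[str, List[int]], listing_id: str) -> List[int]:
    """Read the precomputed characteristic values stored for a source of houses."""
    return list(library.get(listing_id, []))


def add_if_not_identical(library: Dict[str, List[int]], listing_id: str,
                         features: List[int], identical: bool) -> None:
    """Store the new characteristic values only when no identical source of houses was found."""
    if not identical:
        library[listing_id] = list(features)


# Stand-ins for illustration only: a fake downloader and a toy feature function.
fake_pictures = {
    "http://example.invalid/a.jpg": b"\x01\x02",
    "http://example.invalid/b.jpg": b"\xff",
}
features = compute_first_set_features(
    fake_pictures,
    fake_pictures.__getitem__,
    lambda data: int.from_bytes(data, "big"),
)
library = {"listing-002": [0x0F0E, 0x3333]}
stored = read_second_set_features(library, "listing-002")
add_if_not_identical(library, "listing-001", features, identical=False)
```

Storing the characteristic values of a newly accepted source of houses means later comparisons can read them from the library instead of downloading the pictures again.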
In some possible embodiments, the acquiring unit 401 is specifically configured to read, from the information of real estate library, the characteristic value of each picture in the first picture set and the characteristic value of each picture in the second picture set.
In some possible embodiments, the device 400 further includes:
An identifier acquiring unit, configured to obtain the identifier of the first source of houses;
A third determination unit, configured to determine the second source of houses from the information of real estate library according to the identifier.
In some possible embodiments, the device 400 further includes:
An information acquiring unit, configured to obtain the character description information of the first source of houses and the character description information of the second source of houses;
The second determination unit 403 is specifically configured to determine that the first source of houses and the second source of houses are the identical source of houses if the quantity of similar pictures is greater than the preset first threshold and the character description information of the first source of houses is identical to the character description information of the second source of houses.
In this embodiment, if two sources of houses are the identical source of houses, the pictures associated with them will contain many similar pictures; therefore, by determining the quantity of similar pictures among the associated pictures of two sources of houses, it can be judged whether the two sources of houses are the identical source of houses. In this way, it can be determined whether the information of real estate that a broker is about to publish for a given source of houses already exists on the source of houses website for an identical source of houses, or the device can be used to clean up information of real estate already published on the website for identical sources of houses. Sources of houses with identical information of real estate can thus be deduplicated on the website, so that when a user searches for sources of houses on the website, the search results presented usually contain no sources of houses with identical information of real estate. The user can view information of real estate for more distinct sources of houses on a single display interface, which improves the user's experience of searching for sources of houses on the website.
It should be noted that the embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and for the parts that are identical or similar, the embodiments can be referred to one another. Because the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for relevant details, refer to the description of the method.
It should also be noted that, in this document, relational terms such as first and second are used merely to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", and any other variants are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.
The steps of the method or algorithm described in connection with the embodiments disclosed herein can be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module can reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, a register, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.
The above description of the disclosed embodiments enables those skilled in the art to implement or use the present application. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein can be implemented in other embodiments without departing from the spirit or scope of the present application. Therefore, the present application is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (14)

1. A method for identifying the identical source of houses, characterized in that the method comprises:
Obtaining the characteristic value of each picture in a first picture set and the characteristic value of each picture in a second picture set, where the first picture set includes pictures associated with a first source of houses and the second picture set includes pictures associated with a second source of houses;
Determining the quantity of similar pictures in the first picture set, where the similarity degree between the characteristic value of each similar picture and the characteristic value of at least one picture in the second picture set reaches a preset condition;
If the quantity of the similar pictures is greater than a preset first threshold, determining that the first source of houses and the second source of houses are the identical source of houses.
2. The method according to claim 1, characterized in that the similarity degree between the characteristic value of the similar picture and the characteristic value of at least one picture in the second picture set reaching the preset condition comprises:
The Hamming distance between the characteristic value of the similar picture and the characteristic value of at least one picture in the second picture set is less than a preset second threshold.
3. The method according to claim 1, characterized in that obtaining the characteristic value of each picture in the first picture set and the characteristic value of each picture in the second picture set comprises:
Downloading, according to the network address of each picture in the first picture set, the picture corresponding to the network address;
Calculating the characteristic value of each downloaded picture in the first picture set;
Reading the characteristic value of each picture in the second picture set from an information of real estate library.
4. The method according to claim 3, characterized in that the method further comprises:
When the first source of houses and the second source of houses are not the identical source of houses, adding the characteristic value of each picture in the first picture set to the information of real estate library.
5. The method according to claim 1, characterized in that obtaining the characteristic value of each picture in the first picture set and the characteristic value of each picture in the second picture set comprises:
Reading, from an information of real estate library, the characteristic value of each picture in the first picture set and the characteristic value of each picture in the second picture set.
6. The method according to claim 1, characterized in that the method further comprises:
Obtaining the identifier of the first source of houses;
Determining the second source of houses from an information of real estate library according to the identifier.
7. The method according to claim 1, characterized in that the method further comprises:
Obtaining the character description information of the first source of houses and the character description information of the second source of houses;
Determining that the first source of houses and the second source of houses are the identical source of houses if the quantity of the similar pictures is greater than the preset first threshold comprises:
If the quantity of the similar pictures is greater than the preset first threshold, and the character description information of the first source of houses is identical to the character description information of the second source of houses, determining that the first source of houses and the second source of houses are the identical source of houses.
8. A device for identifying the identical source of houses, characterized in that the device comprises:
An acquiring unit, configured to obtain the characteristic value of each picture in a first picture set and the characteristic value of each picture in a second picture set, where the first picture set includes pictures associated with a first source of houses and the second picture set includes pictures associated with a second source of houses;
A first determination unit, configured to determine the quantity of similar pictures in the first picture set, where the similarity degree between the characteristic value of each similar picture and the characteristic value of at least one picture in the second picture set reaches a preset condition;
A second determination unit, configured to determine that the first source of houses and the second source of houses are the identical source of houses if the quantity of the similar pictures is greater than a preset first threshold.
9. The device according to claim 8, characterized in that the similarity degree between the characteristic value of the similar picture and the characteristic value of at least one picture in the second picture set reaching the preset condition comprises:
The Hamming distance between the characteristic value of the similar picture and the characteristic value of at least one picture in the second picture set is less than a preset second threshold.
10. The device according to claim 8, characterized in that the acquiring unit comprises:
A download subunit, configured to download, according to the network address of each picture in the first picture set, the picture corresponding to the network address;
A computation subunit, configured to calculate the characteristic value of each downloaded picture in the first picture set;
A reading subunit, configured to read the characteristic value of each picture in the second picture set from an information of real estate library.
11. The device according to claim 10, characterized in that the device further comprises:
An adding unit, configured to add the characteristic value of each picture in the first picture set to the information of real estate library when the first source of houses and the second source of houses are not the identical source of houses.
12. The device according to claim 8, characterized in that the acquiring unit is specifically configured to read, from an information of real estate library, the characteristic value of each picture in the first picture set and the characteristic value of each picture in the second picture set.
13. The device according to claim 8, characterized in that the device further comprises:
An identifier acquiring unit, configured to obtain the identifier of the first source of houses;
A third determination unit, configured to determine the second source of houses from an information of real estate library according to the identifier.
14. The device according to claim 8, characterized in that the device further comprises:
An information acquiring unit, configured to obtain the character description information of the first source of houses and the character description information of the second source of houses;
The second determination unit is specifically configured to determine that the first source of houses and the second source of houses are the identical source of houses if the quantity of the similar pictures is greater than the preset first threshold and the character description information of the first source of houses is identical to the character description information of the second source of houses.
CN201810570338.9A 2018-06-05 2018-06-05 A kind of method and device identifying the identical source of houses Pending CN108763570A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810570338.9A CN108763570A (en) 2018-06-05 2018-06-05 A kind of method and device identifying the identical source of houses

Publications (1)

Publication Number Publication Date
CN108763570A true CN108763570A (en) 2018-11-06

Family

ID=63999981

Country Status (1)

Country Link
CN (1) CN108763570A (en)

Similar Documents

Publication Publication Date Title
CN108763570A (en) A kind of method and device identifying the identical source of houses
CN105302845B (en) Data information transaction method and system
CN107968818B (en) Data storage method and device and server cluster
JP2002527806A (en) Determining where to place remote computers in a wide area network to conditionally deliver digitized products
CN109087163A (en) Method and device for credit evaluation
JP2019512806A5 (en)
JP2005135071A (en) Method and device for calculating trust values on purchase
CN108153824A (en) Method and device for determining a target user population
CN107924396A (en) Adjustment of local application search results based on user-specific affinity
CN108416630A (en) Method and device for determining a target audience
CN110083762A (en) House source search method, device, equipment and computer readable storage medium
WO2010096986A1 (en) Mobile search method and device
CN114969566B (en) Distance-measuring government affair service item collaborative filtering recommendation method
CN106682146B (en) Method and system for retrieving scenic spot evaluation according to keywords
US20170236224A1 (en) Identifying Points of Interest
JP2006331014A (en) Information provision device, information provision method and information provision program
CN110750238B (en) Method and device for determining product demand and electronic equipment
CN113849731B (en) Information pushing method, device, equipment and medium based on natural language processing
CN111915679B (en) Method, device and equipment for determining target point based on floor
CA3036869A1 (en) View scores
CN113706222B (en) Store site selection method and device
JPWO2004027668A1 (en) Real estate joint purchase matching system
CN106997340A (en) Dictionary generation, and document classification method and device using the dictionary
CN110543593B (en) Data processing method and device, electronic equipment and readable storage medium
CN109670853B (en) Method, device, equipment and readable storage medium for determining user characteristic data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20181106