WO1987001224A1

WO1987001224A1 - Fingerprint recognition and retrieval system

Info

Publication number: WO1987001224A1
Application number: PCT/US1986/001653
Authority: WO
Inventors: Malcolm K. Sparrow
Original assignee: Zegeer, Jim
Priority date: 1985-08-16
Filing date: 1986-08-13
Publication date: 1987-02-26
Also published as: EP0233919A1; BR8606834A; CA1270332A; KR940002361B1; EP0233919A4; AU6223686A; AU587152B2

Abstract

Fingerprints are scanned by a scanning system (13). Topological systems for coding and comparing fingerprints including a system for recording a description of fingerprints. In a preferred embodiment, a central point of the fingerprint is selected as a center of rotating scan line. The scan line is rotated to different topological characteristics. A code (T) representing the type of irregularity is recorded (16, 17). A measure (M) of the scanning position when encountering the irregularity is made (16, 17). In the case of a rotating scan line the angular coordinate (θ) is recorded. The ridge count (R) (16) is also recorded. A list of coordinate sets (T, θ, R) specifies the topology of a sector. For fragmentary prints similar coordinate sets are generated. A fourth coordinate can be added corresponding to the radial distance (D) (16). A full set of coordinates (T, θ, R, D) gives a complete topological and spatial description of a fingerprint. Comparison of fingerprints can then be conducted by a computer (20).

Description

FINGERPRINT RECOGNITION AND RETRIEVAL SYSTEM

Background and Brief Description of the Invention

The invention relates to fingerprint coding and recognition and retrieval systems based on generally invarying topological irregularities, characteristics or minutiae (which terms are used interchangeably herein) of fingerprints. The term "fingerprint" or "print" is used in reference to the epidermal ridge lines of the ten fingers of the human hand, palm prints, toe and sole prints of humans wherein such epidermal ridge lines and characteristic features thereof are in patterns unique to a particular individual.

In my paper entitled "Digital Coding of Single Fingerprints: A New Approach for the Computer Age", Journal of Police Science and Administration, Vol. X, No. 2, June 1982, I show that the soft elastic nature of human skin causes substantial variation of the spatial descriptions of successive impressions of the same fingerprint. Consequently, spatially based coding schemes used for forming machine searchable databases have inherent inaccuracies: the spatially based coordinate system typically used for coding purposes cannot take into account the wide variations in spatial distortion, making the match or identification between two rolled prints of the same finger somewhat problematical, particularly where the prints are taken at substantially different times or pressures.

Topological coding schemes provide concise digital codes that offer a more economical and more reliable basis for ten print identification systems. In my above referred to paper, I suggest comparison methods based on topological coding of prints, in which a topology-based coding system for recording and comparing minutiae uses vector arrays generated from topologically based coding of fingerprints.

According to this invention, each fingerprint is scanned by a scanning system which typically includes a scanning 'line' which sweeps in a predetermined manner, such as horizontally, vertically or radially, from a prescribed origin for the scanning system utilized. When the scanning line moves over an irregularity (such as a ridge ending, bifurcation, etc.), the irregularity is recorded by the use of at least three coordinates: a type code (T) to particularly identify the irregularity, a measure (M) of the scanning line position when it hits the irregularity, and a ridge count (R) which is the number of ridges intersecting the scanning line, at that position, between the irregularity and a prescribed point on, or origin for, the scanning line. A collection of coordinate sets (T, M, R) uniquely specifies the topology of a fingerprint or any part thereof.

Thus, the present invention provides a system for recording a complete topological description of a fingerprint subject to the constraint that each characteristic be in a given database, recorded once and only once. To form a library or database of topological coordinate sets for search purposes, rolled or file prints or so-called ten-print cards are utilized and a central point on the fingerprint (such as a core) is selected as a center of a rotating ridge scan line. When the scan line is a rotating ridge scan line, the rotating ridge scan line, which preferably has a center of rotation or origin which is just off of any ridge, is relatively rotated in a predetermined scan direction, clockwise, for example, to different topological characteristics (sometimes called irregularities or minutiae) of the fingerprint for a plurality of ridge lines. A hexadecimal digital code representing the type (T) of irregularity (ridge-ending, bifurcation, etc.) and the angular coordinate (θ) (which corresponds to the measure (M) of scanning line position) of the irregularity is recorded. In this case, the angular coordinate (θ) is sufficient to specify the order in which the irregularities are passed over by the sweeping or rotating ridge scan line. The ridge count (R) between the characteristic or the irregularity and the central observation point specifies the ridge on which the irregularity occurs. Thus, a list of coordinate sets of the form (T, θ, R) specifies the topology of any sector uniquely. A fourth coordinate is added to the coordinate set to correspond to the radial distance (D) measured from the central observation point. D and θ then specify the positions of the characteristics in space and the full coordinate set (T, θ, R, D) gives a complete topological and spatial description of a fingerprint which only requires 4 bytes per irregularity.
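By way of illustration only, one way a coordinate set (T, θ, R, D) might be held in 4 bytes is sketched below. The byte layout is an assumption and is not taken from this specification: one unsigned byte per coordinate, with θ quantized to 256 steps around the full circle and R and D capped at 255. The function names are hypothetical.

```python
import struct


def pack_irregularity(t, theta_deg, r, d):
    """Pack one (T, theta, R, D) coordinate set into 4 bytes.

    Assumed layout (not from the specification): one unsigned byte per
    coordinate, theta quantized to 256 steps over 360 degrees.
    """
    if not 0 <= t <= 0xF:
        raise ValueError("type code T must be a single hexadecimal digit")
    theta_q = int((theta_deg % 360.0) / 360.0 * 256) % 256
    return struct.pack("BBBB", t, theta_q, min(r, 255), min(d, 255))


def unpack_irregularity(blob):
    """Inverse of pack_irregularity; theta is returned in degrees."""
    t, theta_q, r, d = struct.unpack("BBBB", blob)
    return t, theta_q * 360.0 / 256, r, d


# A bifurcation-type code 3 at 90 degrees, on the 7th ridge, distance 12.
record = pack_irregularity(3, 90.0, 7, 12)
```

Sorting such records by their second byte would recover the order in which the rotating scan line passes over the irregularities.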

Prints such as latent prints found at the scene of a crime (SOC) are coded according to the same topological coordinate scheme. Computerized searching of such a latent mark against a large established collection of file prints is then performed through reconstruction of the topology local to each characteristic followed by comparison of such localized topology.

Additionally, fast comparison of rolled prints can be conducted on the basis of extracted vectors.

Topological vector extraction is based in part on the system disclosed in my above referred to paper. The core of the fingerprint is centrally located at a reference point and a horizontal line is projected through the core to intersect ridge lines to each side of the centrally located reference point. In the case of an arch a vertical line is drawn through successive ridge summits. From the points of crossing of the ridges with the projected horizontal or vertical line, the ridges are traced to the first significant irregularity. A type code (T) is assigned to the irregularity, together with the distance (D) from the reference line, and these data are recorded in a predetermined order to constitute a topological vector for the print, which is then recorded in a machine searchable database. Comparison of vectors takes the form of a sequence of array operations. Comparison of good quality rolled prints is performed extremely rapidly on this basis.

Brief Description of the Drawings

The above and other objects, advantages and features of the invention will be more apparent when considered with the following specification and accompanying drawings wherein:

Figure 1(a) is a block diagram of a fingerprint recognition and retrieval system incorporating the invention; Figure 1(b) is a partial block diagram illustrating one form of image retrieval systems which can be used with the invention; Figure 1(c) is a partial block diagram illustrating the detail of the topological coordinate extraction system.

Figure 2 (a) is a schematic block diagram of a second form of topological coordinate extractor (which is semi-automatic) incorporating the invention; Figure 2(b) is a block diagram of an extractor of a second form of topological coordinates, where a different scanning pattern is being used,

Figure 3 is a schematic block diagram of a semi-automatic vector extractor incorporating the invention,

Figure 4 is a further schematic block diagram of a fingerprint recognition and retrieval system incorporating the invention, with a remote enquiry station,

Figure 5a illustrates a device for manually reading topological coordinate sets and Figure 5b illustrates the device of Figure 5a in use with a print therein,

Figure 6 shows the line placing for vector extraction from a plain arch.

Figure 7 illustrates the ridge exploration event codes, for use with vector extraction,

Figure 8 shows the horizontal line placed on the ulnar loop on a ridge tracing for use with vector extraction,

Figure 9 illustrates the 82 digit vector generated or extracted from the ridge tracing of Figure 8,

Figure 10a is a pair of sample latent marks (approximately 5x) and Figure 10b shows an example of a latent mark (left) and its matching file print (right); the numbers refer to corresponding features,

Figure 11 generally illustrates the sweeping or scanning line coding scheme,

Figure 12 (a) shows the radial irregularity centered lines with the "cut" vertically below the observation point, Figure 12(b) shows a horizontal scan line which is moved relatively vertically over the print, Figure 12(c) shows a vertical scan line which is moved relatively horizontally over the print,

Figure 13 illustrates a fingerprint after reconstruction from a topological coordinate set,

Figure 14 is a copy of the original fingerprint tracing corresponding to the print used in Fig. 13,

Figures 15 and 16 illustrate a latent tracing (Figure 15) and its reconstruction (Figure 16),

Figures 17 and 18 illustrate reconstructions without (Figure 17) and with (Figure 18) defaulted edge-topology,

Figure 19 is a flow diagram of the latent mark matcher program LM6 (appendix A),

Figure 20 is a flow diagram of the vector matching algorithm MATCH4 (appendix B), and

Figure 21 is a flow diagram of the image retrieval through topological reconstruction algorithm, PLOT1 (appendix C).

Detailed Description of the Invention

Figure 1(a) is a functional block diagram of a fingerprint recognition and retrieval system. File prints (which are rolled prints from a fingerprint card base, such as national, regional or local fingerprint files) are fed in stacks to a high speed card handler 10 which may incorporate a card identification number reader 11. Reader 11 may be an optical character reading device for reading the card identification number, as well as any number or numbers printed on the card to identify the subject or person whose fingerprints appear on the card. This card number can later be used by the central computer to associate the fingerprint data with the descriptive data (name, address, type of offense, etc., or other data related to the subject) that would be used in limiting the field of search.

The fingerprints on the cards are passed through a scanner 12 which senses each of the ten fingerprints on the card and outputs a gray-scale point matrix. Scanner 12 can be one of many types such as a "flying spot scanner" or a solid-state scanner or camera. It examines a series of small areas (pixels) of the fingerprint in turn and, as it encounters white (uninked), black (inked) or gray (partially inked) areas, it produces a signal representing the blackness of each pixel. Thus, an array of such signals is formed representing a series of discrete samples covering the whole print area. In the art, this array is referred to as the "gray scale" image, and each of the ten prints on a card is scanned in turn. The output from the gray scale scanner 12 is supplied to an image enhancer 14 which, likewise, is conventional in the art. Image enhancer 14 receives the gray scale scanner output and forms it into a binary enhanced (black/white) image. In doing so, it compensates for variations in ink density over various portions of the print. The image enhancement process locates ridges and valleys and forms a binary image. Systems for determining whether each pixel is on a ridge (black) or in a valley (white), by reference to an examination of the apparent ridge flow direction in its vicinity and location of those apparent ridges, are well known in the art. In addition to formation of a binary black/white image, the image enhancement processor 14 also determines which parts of the print cannot be interpreted as ridge/valley structure (i.e. they are "unclear") and which parts display a corrupted (scarred) structure. In addition, it records and outputs the locations and extents of such areas. It can also output ridge direction data which shows the approximate direction of the general ridge flow at each point.

The output from the image enhancer 14 is supplied to a topological coordinate extractor 16 and a vector extractor 17. The topological coordinate extractor 16 determines from the ridge flow data (using existing techniques such as are disclosed in U.S. Patents 3,560,928 and 4,156,230) whether or not a central core exists in each pattern and will locate a position for a central observation point either close to the core (if there is one) or beneath the first upcurving ridge (in the case of an arch, which is the only pattern that does not have at least one core). If there is more than one core, then the one facing upwards will be selected. Having determined such a central observation point, it will then generate a set of coordinates of the form (T, M, R, D) for each irregularity in the binary image output from the image enhancement processor 14. Locations where ridges run into, or out of, "unclear" or "scarred" areas will be similarly recorded. The coordinate sets are as described later hereinafter. Figures 2(a), 2(b), 3, 4, 5, 12(a), 12(b) and 12(c) illustrate manually operated topological coordinate and vector extractors and will be described in detail hereafter.

The vector extractor 17 generates a topological code vector, of a length of 62 to 82 digits (as described more fully hereafter), together with associated distance measures. The imaginary generating lines are placed as shown in Figure 8 if the pattern has a central core and as shown in Figure 6 if it does not. The presence or absence of such a core is determined from the ridge direction data. A manually operated vector extraction system is disclosed in Figure 3 and will be described more fully hereafter. The data from the topological coordinate extractor and/or the vector extractor are supplied to a general purpose digital computer 20 which stores the topological coordinate sets extracted by topological coordinate extractor 16 in a mass storage system such as a disc storage 21. The topological code vectors for each print extracted by vector extractor 17 are likewise supplied to the general purpose computer, which stores this data in a further mass storage medium such as disc storage unit 22. These data storage devices hold the coordinate sets and extracted vectors in association with the card identifying numbers so that when being searched and a match is made, the card identifying number is associated with the topological coordinate set in storage unit 22.

One or more ten print search inquiry terminals 25 are provided so that an operator can access the central computer 20 and instruct it as to the extent and nature of a search required (i.e. restriction by reference to other descriptive data such as offense type, age, sex, race, or geographical data which may be pertinent to the inquiry). The terminal 25 incorporates a fine graphic display facility sufficient to show the operator any fingerprint reconstructions which the operator may request and which will be outputted from the image retrieval processor as described more fully hereafter. Latent marks such as disclosed in Figures 10a and 10b are first enlarged and a manual tracing is made at block 26. This unit provides the manual tracings for the latent inquiry terminal 29 which enables the latent examiner to enter the coordinate sets he has read manually and to initiate a search of the database 22. It incorporates a fine graphic display facility sufficient to show the operator any fingerprint reconstruction, such as shown in Figures 17 and 18, which the operator may request and which will be the output from the image retrieval processor.

The general purpose computer 20 handles the receipt and storage of all incoming data, administration of the databases, and performs searches of the databases either by passing coordinate sets to the latent matcher system 30 or by passing the extracted vectors to the vector matcher 31. In the case of ten-print inquiries, computer 20 will determine how many of the ten prints available on the search card should be used (according to the priority or importance of the search) and combines the separate finger score outputs from the vector matcher to give an overall score for each candidate file print. This computer 20 also displays results in the form of a list of top rank candidates from the database to the inquiry terminals 25 or 29 upon completion of a search. If a request is made for a fingerprint image reconstruction, the appropriate coordinate set is read from the data file and passed to the image-retrieval processor 30. The image-retrieval processor can be a parallel implementation of the image reconstruction program (PLOT1) set forth in appendix C attached hereto. Its input will be a coordinate set which is passed to it by the central computer 20 when the request is made from an inquiry terminal 25 or 29. The output is a line segment picture (Figures 17 and 18) representing a reconstructed image of the fingerprint in question and it is displayed by the graphic facility of the terminal where the request is made. This program contains a sub-routine called "continuity" which effects the reconstruction of the topology of the print; it also contains three sub-routines called "smooth", "untangle" and "gap-fill" which perform linear smoothing operations by adjusting the paths of the ridges between characteristics. Other smoothing algorithms (such as the use of spline interpolation techniques) may be substituted for these three sub-routines.

The latent matcher 30 performs comparisons of two sets of coordinates sent to it by the central computer 20. It is preferably a parallel implementation of the latent matcher program (LM6) which is set forth in appendix A. It returns a score to the central computer 20 which is a percentage in the range of 0 to 100 reflecting the similarity of the two prints represented by the two coordinate sets. The vector matching unit 31 compares vectors as directed by central computer 20 when a ten print search is underway and it uses the program attached hereto and identified as appendix B. The score returned is a percentage (a real number in the range 0 to 100) which reflects the similarity of the two prints represented by the two vectors.

IMAGE RETRIEVAL

In Fig. 1b, the gray scale point matrix from scanner 13 is fed via one side of selection switch SSW to compression processor 20CP and then stored as a computer compressed image in optical disk storage unit ODS to form a library of compressed print images. When it is desired to display a particular image, the image for any given print is retrieved from storage unit ODS for display at remote terminal 25. Instead of compressing a gray scale image, selection switch SSW can connect a skeletonized image to compression processor 20CP, which is then stored in optical disk storage unit ODS. Image retrieval requires a decompression by decompression processor 20DP and the resulting image is displayed on the display at remote or local terminal 25. Finally, as will be described in detail later herein, topological reconstruction of the image for display can be performed and this requires significantly less storage space to produce fingerprint images at remote terminals with relatively small amounts of data transmission.

There follow detailed descriptions of the major sections of the invention, namely:

(1) Details of the method for deriving topological code vectors from rolled prints.

(2) Details of the algorithm MATCH4 (appendix B) which is a series of array operations for comparing such vectors.

(3) Details of the topological coordinate scheme as it relates to file print coding.

(4) Details of the process of topological reconstruction from such coordinate sets, and the latent matching algorithm LM6.

The evolution of these systems is traced in great detail in the accompanying document entitled "Topological Coding of Single Fingerprints."

EXTRACTING THE TOPOLOGICAL CODE VECTORS

The coding is in two parts: 1. Coding the topology. 2. Measuring the associated distance.

1. CODING THE TOPOLOGY.

Rules are established, dependent on the pattern type, for the superposition of a line on each print.

(a) LOOPS: By looking at the whole available print, and with particular reference to the first flexion crease and the directions of ridges which run close to it, estimate a 'horizontal' orientation for a straight line. ('Horizontal' means parallel to the apparent direction of the flexion crease.) Place a horizontal line through the loop core-centre, using the conventional rules for precise location of the core-point. (See figure 8.)

(b) ARCHES: Orient the print, again so that the flexion crease appears horizontal. Draw a flexible line vertically through successive summits of the ridges as shown in figure 6. The line starts at the lowest visible ridge above the flexion crease and follows the 'summit' route to the top of the available picture.

(c) WHORLS & OTHER TYPES: Locate a 'core' using an adaptation of the rules for loops, and place a horizontal line as for a loop.

The placing of lines forms an ordered set of intersection points (where the line crosses a ridge), each one located on one of the ridges of the print.

Each point of intersection gives two 'directions' for topological exploration of that ridge: imagining oneself (just for a moment) to be a tiny insect capable of walking along a ridge, one could walk each ridge in each of two directions from the point of intersection. We stipulate that the walking (or exploration) will cease as soon as one of a number of specific 'events' is found.

Assignment of digital codes to the different possible ridge-exploration events leads to formation of a pair of digits for each point of intersection. Writing them down in order generates a digital vector of length equal to twice the number of points of intersection.

Figure 7 shows the digital codes selected to correspond to possible ridge-exploration events. In each case the ridge being explored is marked with an arrow to show the direction of the exploration. The digital codes take the form of hexadecimal integers, and are always processed as such. Storage space required for each one is therefore only 4 bits, making it possible to compress one pair of digits into one byte. Not all 16 hex-digits are used; 1, 9, D and E being 'spare'. 'F' is used for padding the vectors up to a certain length for storage in a standardized data format.

Codes 6 and 8 record events that do not actually occur on the ridge being explored. They record the start of a new ridge either on the immediate left or the immediate right of it. The main reason for their inclusion in the scheme is that they record the presence of ridge-endings which would otherwise be ignored by the coding process. (This is because the ridge-ending belongs to a ridge that does not have a point of intersection with the generating line.)

The allocation of particular digits to particular events is not quite arbitrary. The tendency of inking and pressure differences between successive impressions of a print to cause topological change is well known. Bifurcations will mutate to ridge-endings, and vice versa. In anticipation of this phenomenon the digital codes are selected in order that some sense of closeness is carried over to them. The extent of that closeness is only that event 3 is liable to change to or from either of events 2 or 4; likewise event 7 is liable to change to or from events 6 or 8.

TOPOLOGICAL VECTOR EXTRACTOR (FIGS. 3, 6, 8 AND 9)

The topological vector extractor shown in Fig. 3 includes terminal 200 having a display 201 and a control console or 'mouse' 202 connected to terminal 200 by cable 202C. A fingerprint (or portion thereof) 203 is shown having a horizontal line 205 positionable by a thumb wheel control 206 to pass through a central portion CP of the print 203 and cross ridges in the print to the left and right of the central portion CP. At each point where the line 205 intersects or crosses a ridge, the ridge is tracked in both directions to where an irregularity is encountered. A track ball 207 is used to locate a cursor on the origin CP, and the cursor is moved from the line 205 along each ridge by the track ball 207. A start distance measure switch 208S is operated to indicate the start of distance measurement and a stop distance measure switch 208SP is operated to denote the positioning of the cursor on the irregularity. Where an irregularity is encountered, the measure of distance is entered as the distance of movement of the cursor from line 205 to the irregularity. Pushbutton array 209 is used to enter a code for the type (T) of irregularity encountered. The vector data can be outputted from terminal 200 by data coupler 200C to general purpose computer 20. Figure 8 shows the tracing of an ulnar loop generated by exploration from a horizontal line through the core. Points of intersection are shown numbered outwards from the core, and characteristics accessed are highlighted with a small 'blob'.

A standard length for digital vectors was set at 82 digits (that is, 41 pairs), of which 20 pairs represent up to 20 ridges on the left hand side of the core, one pair represents the ridge on which the core itself is located, and the other twenty pairs represent up to 20 ridges intersected on the right of the core. Whenever fewer than twenty ridges are intersected on the left or the right hand side of the core (which is usually the case) the 82 digit code is padded with 'F's, as mentioned above, to bring it up to the standard length. The padding is done at the extreme ends of the vector in such a way that the digit pair representing the core-ridge remains in the central position (i.e. the 21st digit pair).

The convention is established that the digit representing exploration along a ridge upwards from the line is to be written first (of the pair), and the digit representing exploration downwards along the same ridge is written second. Adhering to that convention, the 82 digit vector generated from the tracing referred to above (figure 8) is shown in figure 9. To facilitate interpretation, the intersection point numbers (from figure 8) are shown also, with their corresponding digit pairs. (These intersection point numbers are not normally recorded, and they form no part of the topological code.) Digit pairs are juxtaposed, and each pair separated from the next. It is important to remember that each digit pair is just that, a pair of digits; they should never be interpreted together as being one number.
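The padding and ordering conventions just described might be sketched as follows. This is an illustration under stated assumptions rather than the extraction program itself: the function name is hypothetical, each pair is given as (upward digit, downward digit), and the left and right lists are assumed to be supplied already in left-to-right order.

```python
PAD = 0xF                            # padding digit 'F'
SIDE_PAIRS = 20                      # up to 20 ridges on each side of the core
VECTOR_PAIRS = 2 * SIDE_PAIRS + 1    # 41 pairs = 82 digits


def build_vector(left, core, right):
    """Assemble an 82-digit vector from (up, down) event-code pairs.

    left  : pairs for ridges left of the core, in left-to-right order
    core  : the pair for the core ridge
    right : pairs for ridges right of the core, in left-to-right order
    Padding goes at the extreme ends so the core stays the 21st pair.
    """
    if len(left) > SIDE_PAIRS or len(right) > SIDE_PAIRS:
        raise ValueError("at most 20 ridges are coded on each side")
    pad_left = [(PAD, PAD)] * (SIDE_PAIRS - len(left))
    pad_right = [(PAD, PAD)] * (SIDE_PAIRS - len(right))
    pairs = pad_left + list(left) + [core] + list(right) + pad_right
    return [digit for pair in pairs for digit in pair]


vector = build_vector(left=[(2, 3)], core=(4, 6), right=[(7, 8), (2, 2)])
text = "".join(f"{digit:X}" for digit in vector)
```

With one ridge coded on the left and two on the right, the core pair still occupies digits 41 and 42 of the vector.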

2. MEASURING THE DISTANCE.

The measuring scheme adopted is quick and simple. It gives one hexadecimal integer as a 'distance measure' for each hexadecimal event-code.

The distance is measured from each 'ridge event' to the generating line. The measuring is not "as the crow flies" but rather "as the insect walks" (assuming that insects walk along ridges). Distances are measured along the relevant ridge from the intersection point on the generating line to the ridge-event.

The distance is measured (on a 10x enlargement) in centimeters, and is then rounded down to the nearest integer, and an upper bound of 15 imposed. On the actual print, therefore, the distance measures represent the distance, measured along ridges, from generating line to ridge-event, rounded down to the nearest millimetre. Thus the only possible distance measures are the integers 0, 1, 2, ..., 15.

If the ridge-event codes are any of the set 0, A or B then the corresponding distance measures are set to a default value of 15. These codes 0 ('out of sight'), A ('scarred tissue') and B ('unclear') cannot really have meaningful distance measures associated with them: all the other event codes can.
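The measuring rule can be illustrated with a short sketch. The function name is hypothetical, and the ridge-traced length is assumed to be supplied in millimetres on the actual print (equivalently, centimetres on the 10x enlargement).

```python
import math

NO_DISTANCE_CODES = {0x0, 0xA, 0xB}   # out of sight, scarred tissue, unclear
MAX_DISTANCE = 15                     # imposed upper bound (one hex digit)


def distance_measure(event_code, ridge_length_mm):
    """Return the hexadecimal distance measure for one ridge event."""
    if event_code in NO_DISTANCE_CODES:
        return MAX_DISTANCE           # default value for codes 0, A and B
    return min(MAX_DISTANCE, math.floor(ridge_length_mm))


examples = [
    distance_measure(3, 4.9),     # rounded down to 4
    distance_measure(3, 22.3),    # capped at 15
    distance_measure(0xA, 2.0),   # defaulted to 15
]
```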

Restriction to hexadecimal distance measures does mean that an event code, together with its distance measure, can be stored in 1 byte of memory. The storage requirement for each print code is therefore 82 bytes.

Each print has one hexadecimal distance measure for every event code. A single print is thus represented by an array (size 82 x 2).
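A sketch of the one-byte-per-digit storage follows. The nibble order is an assumption (event code in the high four bits, distance measure in the low four bits), since the description fixes only the total of one byte; the function names are hypothetical.

```python
def pack_print(events, distances):
    """Pack parallel lists of event codes and distance measures into bytes."""
    if len(events) != len(distances):
        raise ValueError("one distance measure is needed per event code")
    if not all(0 <= x <= 0xF for x in list(events) + list(distances)):
        raise ValueError("codes and distances must be hexadecimal digits")
    # Assumed layout: event code in the high nibble, distance in the low one.
    return bytes((e << 4) | d for e, d in zip(events, distances))


def unpack_print(blob):
    """Recover the two parallel lists from the packed bytes."""
    return [b >> 4 for b in blob], [b & 0xF for b in blob]


packed = pack_print([0x2, 0x3, 0xF], [4, 15, 0])
full_print = pack_print([0xF] * 82, [0xF] * 82)
```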

Although this description has specified that distances should be 'ridge-traced', that is desirable but not absolutely necessary. The linear distance from each ridge-intersection point to each ridge-exploration event could be used instead. Similarly, the perpendicular distance from each ridge-exploration event to the generating line could be used instead.

The above description also suggests an array length of 82. System design constraints may determine some other length, but greatest accuracy is achieved when the length exceeds 62.

THE VECTOR COMPARISON ALGORITHM MATCH4.

There are ten distinct phases to this algorithm; two are preliminary and eight form the actual comparison process. Each will be described in turn.

Preliminary stage 1 --- fileset analysis

Fileset analysis is the first preliminary operation conducted by MATCH4 before any individual vector comparisons are made. It analyses the topological codes only (disregarding distance measures). The vectors stored within the fileset are of length 82 digits, representing up to 41 ridges. The 41 ridges (in order, from left to right) are divided up into ridge bands.

The ridge-band width for this analysis is a parameter of the programme. Let us suppose that this parameter (which will be called 'BANDWIDTH') is set at 5. Then, with vectors of length 82 digits, derived from 41 ridge intersection points, there will be 9 ridge bands. (These cover ridges 1-5, 6-10, 11-15, ..., 36-40, and 41-45 respectively. Ridges 42-45 do not 'exist', and so the ninth ridge band only contains the last (41st) pair of digits in each vector.) Each ridge band is analysed separately, as are the two directions (upwards and downwards from the horizontal line).
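The band arithmetic in this example can be checked with a few lines. The function names are hypothetical; ridges and bands are numbered from 1, as in the text.

```python
def ridge_band(ridge_number, bandwidth=5):
    """Map a 1-based ridge number to its 1-based ridge-band number."""
    return (ridge_number - 1) // bandwidth + 1


def band_count(n_ridges=41, bandwidth=5):
    """Number of ridge bands needed to cover n_ridges (ceiling division)."""
    return -(-n_ridges // bandwidth)


bands = [ridge_band(r) for r in (1, 5, 6, 40, 41)]
```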

Simple code frequency analysis conducted on all the vectors stored in the fileset ultimately yields a real matrix P, of three dimensions thus:

P(j, k, l):

j = 0, ..., 15: j represents one of the hexadecimal 'event' codes.

k = 1, ..., 9: k is the ridge-band number (numbered from left to right).

l = 1, 2: l shows one of two 'directions' (l = 1 for 'upwards', i.e. the first digit of a pair; l = 2 for 'downwards', i.e. the second digit of a pair).

The combination of any value of k with a value of l specifies one of 18 possible 'ridge areas'. P(j, k, l) is the proportion of codes in the (k, l) ridge area that had the value j.

Clearly 0 ≤ P(j, k, l) ≤ 1 for all (j, k, l). Also ∑_j P(j, k, l) = 1.0 for any fixed pair (k, l).
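A minimal sketch of such a code frequency analysis is given below. It is not the MATCH4 program of appendix B. It uses 0-based indices in place of the 1-based k and l above, and it assumes that every digit, including the padding digit F, is counted; the function name is hypothetical.

```python
def code_frequencies(vectors, bandwidth=5):
    """Return P[j][k][l], the proportion of code j in ridge area (k, l).

    vectors : list of equal-length digit lists (82 digits each here)
    Indices are 0-based: k = 0..8 for nine bands, l = 0 (up) or 1 (down).
    Assumption: padding digits are counted like any other code.
    """
    n_pairs = len(vectors[0]) // 2
    n_bands = -(-n_pairs // bandwidth)          # ceiling division
    counts = [[[0, 0] for _ in range(n_bands)] for _ in range(16)]
    totals = [[0, 0] for _ in range(n_bands)]
    for vec in vectors:
        for pos, code in enumerate(vec):
            k = (pos // 2) // bandwidth         # ridge band of this pair
            l = pos % 2                         # first or second digit
            counts[code][k][l] += 1
            totals[k][l] += 1
    return [[[counts[j][k][l] / totals[k][l] if totals[k][l] else 0.0
              for l in range(2)]
             for k in range(n_bands)]
            for j in range(16)]


P = code_frequencies([[2] * 82, [3] * 82])
```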

PRELIMINARY STAGE 2 --- SETTING UP THE SCORE-REFERENCE MATRIX.

From the three dimensional frequency matrix P, a four dimensional score-reference matrix, S, is constructed. S is to be regarded as a 'look-up table' of initial scores to be awarded during the vector comparison process.

A score S(i, j, k, l) will be awarded initially when code i appears in the search vector opposite code j in the file vector, in corresponding (digit) positions which fall in the (k, l) ridge area. Again, this stage is not concerned with distance measures. That score S(i, j, k, l) is an indication of the value of such a coincidence in indicating that the search and file vectors under comparison are matched. It could also be regarded as a measure of the unlikelihood of that coincidence occurring by chance had the file vector been selected completely at random from the population of 'all fingerprints'.

The calculation of the matrix S is done according to these rules:---

(a) For each i, j, k, l such that i = j and i, j ∈ {0, 2, 3, 4, 6, 7, 8, C}, then

S(i, j, k, l) = minimum(BOUND, INT[-10 × log₁₀ P(j, k, l)])

where INT[...] means the integer part of [...] and BOUND is another parameter: an imposed upper bound on the values taken by elements of the matrix S. The factor 10 appears in order to avoid all the exact match scores being either 0 or 1. The inclusion of this factor gives a reasonable spread of exact match scores, based on code frequencies, despite the integer rounding. Typically these scores range from 1 to 15 or so.

These elements of S are the ' exact match' scores.

(b) For all i, j, k, l such that at least one of i and j is either 10, 11 or 12 (i.e. hexadecimal A, B or C), except for the case i = j = 12, then S(i, j, k, l) = 0.0

These elements of S represent all the appearances (either in the file vector or in the search vector) of the codes for scarred or unclear areas, and for compounds.

(c) The phenomenon of topological mutation (the changing of ridge endings into bifurcations and vice versa) relates to the selection of event codes. The pairs of codes {(2, 3), (3, 4), (6, 7), (7, 8)} can be regarded as 'close matches' as they could be observed in corresponding positions within mated vectors as a result of topological mutations.

Consequently if the comparison algorithm is to recognise close matches as indications of a possible match (albeit not as strong an indication of this as exact matches would be) that policy can be effected by allocating positive values to the subset of S defined:

{S(i, j, k, l) such that the unordered pair (i, j) belongs to the set of unordered pairs {(2, 3), (3, 4), (6, 7), (7, 8)}}.

This set of elements within S are hereafter called the close match scores. For any particular (k, l) they will appear as entries in the (i,j) table which are just off the leading diagonal. The entries of the leading diagonal itself are the exact match scores.

(d) For all i, j, k, l not covered by one of the rules (a), (b) or (c) above:

S(i, j, k, l) = -1
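Rules (a) to (d) can be sketched as a single table-building routine. This is an illustration under stated assumptions rather than the programme itself: the function name `score_matrix`, the zero-based band and direction indices, and the treatment of a code never seen in a ridge area (its score is simply set to BOUND, since the logarithm is undefined) are all assumptions. The default values for the bound (15) and the close match score (2) are the illustrative values quoted in the text.

```python
import math

EXACT_CODES = {0, 2, 3, 4, 6, 7, 8, 12}      # rule (a); 12 is hexadecimal C
NEUTRAL_CODES = {10, 11, 12}                 # rule (b); hexadecimal A, B, C
CLOSE_PAIRS = {frozenset(p) for p in [(2, 3), (3, 4), (6, 7), (7, 8)]}  # rule (c)


def score_matrix(P, bound=15, close_score=2):
    """Build S[i][j][k][l] from the frequency matrix P[j][k][l]."""
    n_bands = len(P[0])
    # Rule (d): everything not covered below scores -1.
    S = [[[[-1, -1] for _ in range(n_bands)] for _ in range(16)]
         for _ in range(16)]
    for i in range(16):
        for j in range(16):
            for k in range(n_bands):
                for l in range(2):
                    if i == j and i in EXACT_CODES:
                        p = P[j][k][l]
                        if p <= 0:
                            # Assumption: an unseen code gets the capped score.
                            S[i][j][k][l] = bound
                        else:
                            # int() truncates, giving the integer part.
                            S[i][j][k][l] = min(bound, int(-10 * math.log10(p)))
                    elif i in NEUTRAL_CODES or j in NEUTRAL_CODES:
                        S[i][j][k][l] = 0
                    elif frozenset((i, j)) in CLOSE_PAIRS:
                        S[i][j][k][l] = close_score
    return S
```

Rule (a) is tested first so that the case i = j = 12 receives an exact match score, which is the exception named in rule (b).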

The matrix S (when there are 9 ridge bands) could be regarded as 18 different comparison tables each one of which might typically appear as shown below. (Here the close match scores have been set to 2 and an upper bound of 15 applied. Also the exact match scores have been rounded to the nearest integer for ease of presentation.)

COMPARISON STAGE 1 --- FORMATION OF FILE AND SEARCH MATRICES.

The vector comparison process itself begins with a file array (B(i): i = 1, 82), a search array (A(i): i = 1, 82) and the established score reference matrix S.

An important parameter not yet introduced is "MAXSHIFT". MAXSHIFT is the maximum number of ridge shifts (either to left or right) that is to be anticipated by the comparison algorithm. Such shifts are likely to have occurred as a result of distortion caused by core misplacement, appearance or disappearance of subsidiary ridges and line placement errors.

Let us suppose that up to 5 ridge shifts should be anticipated (i.e. MAXSHIFT = 5). Then comparison of array A with array B will need to allow for relative shifting by up to five digit-pairs. This is accomplished by use of standard array processing techniques as follows:

(a) The topological vector portion of the search array A is used to construct the search matrix "C". C will have 82 columns and the number of rows will be given by [(2 × MAXSHIFT) + 1]. Each row will be a copy of the topological part of the array A, but the copy will be progressively shifted to the left or right by from 0 to MAXSHIFT digit pairs. The central row will be an exact copy of A. The top (first) row will show A shifted 5 digit pairs to the left; the second row, 4 digit pairs to the left; and so on down to the bottom row, 5 digit pairs to the right. Some digits of A may be 'lost' off the ends of some of the rows, and gaps caused by the shifting are padded with pairs of 'F's.

(b) The file array B is used to create a file matrix, D, of identical dimensions to C. It is formed by faithful duplication of the topological part of the array B, without shifting, the appropriate number of times. Every row of D is an exact copy of the vector from B. No padding is needed, and no digits are lost from row ends.
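The construction of the two matrices can be sketched as follows, assuming the topological part of each array is held as a string of hexadecimal digits. The function names `search_matrix` and `file_matrix` are hypothetical and are used only for this illustration.

```python
def search_matrix(A, maxshift=5, pad="F"):
    """Return the rows of C: copies of A shifted by -maxshift..+maxshift pairs.

    The first row is A shifted maxshift digit pairs to the left, the central
    row is A itself, and the last row is A shifted maxshift pairs to the right.
    Digits pushed off either end are lost; gaps are padded with 'F'.
    """
    n = len(A)
    rows = []
    for shift in range(-maxshift, maxshift + 1):
        d = 2 * shift                        # shift measured in digits
        if d < 0:
            rows.append(A[-d:] + pad * (-d))  # shift left, pad on the right
        else:
            rows.append(pad * d + A[:n - d])  # shift right, pad on the left
    return rows


def file_matrix(B, maxshift=5):
    """Return the rows of D: (2 * maxshift + 1) unshifted copies of B."""
    return [B] * (2 * maxshift + 1)
```

With a five-pair vector and MAXSHIFT = 1, `search_matrix("0123456789", 1)` gives the three rows `23456789FF`, `0123456789` and `FF01234567`.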

COMPARISON STAGE 2 --- COMPARISON OF FILE AND SEARCH MATRICES.

The search and file matrices, C and D, are then compared element by element, and the initial score matrix is formed as the result. The initial score matrix will be called E. E has the same dimensions as C and D.

For each value of r and s the element E(r, s) depends only on C(r, s) and D(r, s) . Each element E(r, s) is evaluated by 'looking up' C(r, s) and D(r, s) in the score reference matrix S:

E(r, s) = S(i, j, k, l), where i = C(r, s), j = D(r, s), and (k, l) are determined by s. k and l are picked, for each s, to represent the 'ridge-area' to which the 's'th element of a vector would belong.

Thus k will increase from 1 to 9 as s varies from 1 to 82, and l will be 1 if s is odd, 2 if s is even. In other words C(r, s) and D(r, s) are 'looked up' in the 'book' of comparison tables called 'S'. The values (k, l) are evaluated (from s) just to make sure that the appropriate table is 'looked up'.
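The element-by-element look-up can be sketched as below. The sketch assumes that C and D are lists of hexadecimal strings of equal shape and that S is a nested list indexed as S[i][j][k][l] with zero-based band and direction indices; the function name `initial_scores` is hypothetical.

```python
def initial_scores(C, D, S, bandwidth=5):
    """Return E, where E[r][s] = S[i][j][k][l] for i = C[r][s], j = D[r][s].

    Columns are zero-based here, so an even column is the first digit of a
    pair (l = 0) and an odd column is the second digit (l = 1).
    """
    E = []
    for c_row, d_row in zip(C, D):
        e_row = []
        for s, (c, d) in enumerate(zip(c_row, d_row)):
            k = (s // 2) // bandwidth   # ridge band of column s
            l = s % 2                   # direction of column s
            e_row.append(S[int(c, 16)][int(d, 16)][k][l])
        E.append(e_row)
    return E
```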

PROPERTIES OF THE INITIAL SCORE MATRIX.

The feature of the initial score matrix E that begins to indicate whether A and B are mates is the presence (or absence) of horizontal strings of non-negative scores. Such a string within one row of E represents similarly placed rows within matrices C and D that were similar, or identical. Such strings, in turn, represent parts of the vectors A and B that were similar or identical. Where a high scoring continuously non-negative string occurs in the central row of E then vectors A and B are probably mates, and are correctly aligned. If such a high scoring string appears in one of the other rows of E, then A and B were probably mates, but incorrectly aligned (i.e. there had been some shifting error).

If, on the other hand, the matrix E appears to be a random scattering of scores with no discernible concentrations of non-negative scores, then it is likely that A and B were not mates. The task facing the remainder of the algorithm is to calculate a single score which will show whether significant strings are present in the matrix E, or not — and thus provide an indication of whether A and B are mated vectors.

The methods used to do this are based on the idea of adding together all the scores of each continuously non-negative horizontal string within E. The scores allocated (E(r, s)) for each exact match (when C(r, s) = D(r, s)) are logarithms of measures of the unlikelihood of such coincidence occurring by chance. Consequently the sum of a continuous series is a measure of the unlikelihood of that whole series occurring by chance. Typically non-matches are unlikely to display any continuously non-negative series of length greater than 6 digits. Matches can produce such series of lengths up to 50 or 60 digits.
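The logarithmic argument can be made concrete with a short worked equation. As an idealisation, suppose a string consists of n exact matches, ignore the integer truncation and the BOUND cap, and let p_t denote the file-set proportion P(j, k, l) of the t-th matched code (the symbols n and p_t are introduced here for illustration only). The string sum is then

$$\sum_{t=1}^{n} \left(-10 \log_{10} p_t\right) \;=\; -10 \log_{10} \prod_{t=1}^{n} p_t .$$

Under the further assumption that the n coincidences are independent, the product is the probability of the whole series arising by chance, so a string sum of 100 would correspond to a chance probability of about 10^-10.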

Anticipation of this 'adding together' was the origin of the rules used in setting up the score matrix S. The significance of scores of 0.0 (rule (b) in preliminary stage 2) is that their appearances within the initial score matrix E do nothing to the sum of a series, but they do preserve its continuity. Thus, appearance of scars, or inability to determine what does happen first during ridge exploration, is not given any significance in indicating a match, but it is not allowed to break up an otherwise continuous non-negative sequence that would be indicative of a match. Hence the 0.0 allocation to any comparison involving codes 'A' or 'B'. Comparisons involving code 'C' were also allocated scores of 0.0, because true compounds are very rare and what normally appears as a compound is usually an ambiguous characteristic of some other sort.

COMPARISON STAGE 3 --- APPLYING THE DISTANCE TESTS.

During a print comparison the distance measures are used in the application of three different tests. All three tests are applied to the initial score matrix in such a way as to reduce (to -1) any positive initial scores that the distance measure tests indicate ought to be so reduced. This will occur if the distance measure tests show that the matched event codes (which gave that positive value) are from 'events' that are not roughly in the same area (spatially) of their respective prints.

ABSOLUTE DISTANCE TEST.

Before the matching algorithm accepts an event code in a file print array as possibly being correctly matched with an event code in the search print array, it now has to ask not only 'are the event codes the same?', but also a number of questions relating to their distance measures. The first is called the absolute distance test:

'Is the distance between the generating line and the ridge-event adequately preserved?' (i.e. is it preserved within a given tolerance). The tolerance allowed becomes a parameter of the programme and is called the absolute distance tolerance (ADT).

DIFFERENTIAL DISTANCE TEST.

If two events from adjacent ridges on the file print seem to match two events on adjacent ridges on the search print (where, in each case, both events lie on the same side of the generating line) then we should ask the question:

'Is the difference in their distance measures adequately preserved?'

The tolerance allowed in this test is another parameter, called the differential distance tolerance (DDT).

The difference between distance measures on adjacent ridges, looking in the same direction (i.e. the same side of the generating line) is a measure of the distance between the two events seen on those ridges, and is independent (except for rounding errors) of the exact position of the generating line. If this differential distance is not preserved then one, or other, of the two events cannot be correctly matched; they cannot both be right.

SUMMED DISTANCE TEST.

If two events on the same ridge (i.e. both halves of a digit pair) seem to be matched from search to file print, then the sum of their distance measures should be preserved (within certain tolerance). That sum represents the total distance, along the relevant ridge, from one event to the other. The measures are added because the events are appearing on opposite sides of the generating line. Again, if this sum is not preserved then one event, or the other, is not correctly matched; they cannot both be.

The tolerance allowed in this case is called the summed distance tolerance (SDT).

HOW THE DISTANCE TESTS ARE APPLIED.

The distance tests are applied as the first filtration step for the initial score matrix E. The manner of their application (briefly) is as follows:

(a) Absolute distance test: every positive element, E(r, s), of the initial score matrix E is derived by comparison of C(r, s) and D(r, s), elements of the search and file matrices. We call the related distance measures C'(r, s) and D'(r, s) respectively. The rule for the absolute distance test is:

If |C'(r, s) - D'(r, s)| > ADT then change E(r, s) to -1.

(b) Differential distance test: whenever E(r,s) and E(r,s + 2) are positive elements within E then

If |(C'(r, s) - C'(r, s + 2)) - (D'(r, s) - D'(r, s + 2))| > DDT then change one of E(r, s) and E(r, s + 2) to -1. (Which of the two is reduced depends on other neighbouring elements within E.)

(c) Summed distance test: whenever E(r, 2s) and E(r, 2s-1) are both positive elements within E, then

If |(C'(r, 2s) + C'(r, 2s - 1)) - (D'(r, 2s) + D'(r, 2s - 1))| > SDT then one of E(r, 2s) and E(r, 2s - 1) is reduced to -1. (In this case the larger of the two is reduced.)
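The absolute and summed distance tests can be sketched as below. The differential test is deliberately left out of this illustration, because the rule for choosing which of the two elements to reduce is not spelled out in the text. The function name `apply_distance_tests`, the argument names, the order of application (absolute test first) and the tie-break when the two scores are equal (the second digit of the pair is reduced) are assumptions.

```python
def apply_distance_tests(E, c_dist, d_dist, adt, sdt):
    """Return a copy of E with scores failing a distance test reduced to -1.

    E      -- initial score matrix (list of rows of integers)
    c_dist -- distance measures C'(r, s) matching the search matrix
    d_dist -- distance measures D'(r, s) matching the file matrix
    adt    -- absolute distance tolerance
    sdt    -- summed distance tolerance
    """
    out = [row[:] for row in E]
    for r, row in enumerate(out):
        # Absolute distance test on every positive element.
        for s in range(len(row)):
            if row[s] > 0 and abs(c_dist[r][s] - d_dist[r][s]) > adt:
                row[s] = -1
        # Summed distance test on both halves of each digit pair
        # (zero-based columns a and a + 1).
        for a in range(0, len(row) - 1, 2):
            b = a + 1
            if row[a] > 0 and row[b] > 0:
                c_sum = c_dist[r][a] + c_dist[r][b]
                d_sum = d_dist[r][a] + d_dist[r][b]
                if abs(c_sum - d_sum) > sdt:
                    # The larger score is reduced; on a tie, the second.
                    if row[a] > row[b]:
                        row[a] = -1
                    else:
                        row[b] = -1
    return out
```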

COMPARISON STAGE 4 --- FILTERING FOR DEPENDENT PAIRS.

The repetition (from the search vector to the file vector) of a dependent pair of digits is less significant in indicating a possible match than independent repetitions of those two codes would have been. (A dependent pair occurs whenever the same characteristic is observed during the exploration of two adjacent ridges.) There may then be scores E(r, s) and E(r, s + 2) within the matrix E that form part of a continuously non-negative series, but whose appearance stems from repetition of a dependent pair of codes. Whenever such scores occur, their sum [E(r, s) + E(r, s + 2)] is more weighty than is appropriate in view of that dependence.

The matrix E is therefore filtered, and the filtered score matrix (F) created. F has exactly the same dimensions as E, D and C. The filtering step involves a reduction of scores stemming from repetitions of dependent code-pairs. It is accomplished by reference to the matrices C and D (to identify exactly where such pairs appeared in both).

The rule for score reduction is: wherever E(r, s) and E(r, s + 2) are exact-match scores derived from a dependent pair, then

F(r, s) = min(E(r, s), E(r, s + 2))

F(r, s + 2) = 0.0

Elsewhere F(r, s) = E(r, s).

This reduction of scores gives a more reasonable weighting to the scores derived from dependent pairs, in the light of the results of the analysis on pair dependency. The step typically reduces about 2 entries per row of the matrix E.

COMPARISON STAGE 5 --- CONDENSING DIGIT PAIRS TO A SINGLE SCORE.

Careful examination of a large number of filtered score matrices derived from mated vector pairs revealed that the fairly long continuously non-negative strings were not the most telling feature of the matrices; as well as revealing these completely non-negative strings they also exhibited much longer mostly non-negative strings. These longer strings, even though they were interrupted by isolated -1's, seemed to be a better indication of match or mismatch by their presence or absence.

Often one digit of a pair (e.g. the 2nd digit) would be non-negative for several successive digit pairs, while the other digit of each pair scored -1. This will happen whenever the ridge pattern on one side of the generating line is well preserved, whilst being corrupted on the other side.

Prior to product evaluation the matrix F is therefore condensed into a matrix G (which has the same number of rows, but only half as many columns) in a manner which moves the emphasis onto the much longer mostly non-negative strings.

The condensing rule applied in MATCH4 is:

G(r, s) = -1, if F(r, 2s - 1) and F(r, 2s) are both -1;

G(r, s) = maximum(F(r, 2s - 1), F(r, 2s)), if one, and only one, is non-negative;

G(r, s) = F(r, 2s - 1) + F(r, 2s), if both are non-negative.

Thus isolated -1's cease to break up the long series that result from mated vectors. The sums of these long series from matches are expected to far outweigh the sums of any continuously non-negative series which occur by chance (i.e. from a vector mismatch).
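The condensing rule can be sketched in a few lines. The function name `condense` is hypothetical, and the filtered matrix is assumed to be a list of rows of integers with an even number of columns.

```python
def condense(F):
    """Condense each digit pair of F into one score, halving the column count."""
    G = []
    for row in F:
        g_row = []
        for first, second in zip(row[0::2], row[1::2]):
            if first < 0 and second < 0:
                g_row.append(-1)                  # both halves negative
            elif first >= 0 and second >= 0:
                g_row.append(first + second)      # both non-negative: add
            else:
                g_row.append(max(first, second))  # keep the non-negative one
        G.append(g_row)
    return G
```

For example, the row `[-1, -1, 3, -1, 2, 5, -1, 0]` condenses to `[-1, 3, 7, 0]`.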

COMPARISON STAGE 6 --- 'HOPPING' IN THE CONDENSED MATRIX.

Final score evaluation in MATCH4 depends on the single highest-scoring series found within the condensed matrix. One possible effect of this is that some matches may have produced very long strings which were broken up by isolated negative entries or ridge-shifts.

These string breaks may have occurred as a result of two topological mutations (one on either side of the generating line) that just happened to affect the same ridge; that would cause an isolated negative entry in an otherwise continuously non-negative string in the condensed matrix. Alternatively ridge-shifting (with its variety of causes) may have occurred; this will break the string as a result of inclusion or deletion of a digit pair from one of the vectors under comparison. The result will be that part of the string in the condensed matrix is displaced either to the row above, or to the row below (as shown here).

...  -1   1  -1  -1  -1   1   2  -1  -1   0   0  -1  -1  -1   0 ...
...   0   2   5   6  -1  -1   2  -1   0   0  -1   0  -1  -1  -1 ...
...  75  30  50  10   5   6  45  30   5   3  -1   0  -1  -1  -1 ...
...  -1  10  -1  -1  -1   3   0  -1   2  -1  -1  26  75  30  11 ...
...   1   0   3   4  -1  -1   2   1   0  -1  -1  -1   0   0  -1 ...

Part of a condensed matrix showing a suitable 'hopping' place.

The algorithm is designed to recognise this phenomenon, and to put these broken strings back together again (i.e. to evaluate their sums as if they had not been broken). A parameter "HOPS" is used, which indicates the maximum number of breaks which can be overlooked in evaluation of any one series score.

The score evaluation then finds the highest scoring string that can be found in the condensed matrix if up to HOPS number of breaks (of specified kind) can be ignored in each string.

The parameter is called "HOPS" because, in effect, the programme is allowed to hop from the right hand end of a series onto another point where that string is thought to be continuing. The permissible hops in the condensed matrix G are from any point g(r, s) to any one of these three points: —

(a) g(r,s + 2): this simply bypasses an isolated negative element in an otherwise continuously non-negative series.

(b) g(r + 1, s + 2) or g(r - 1, s + 1): these are the hops required to repair a string break caused by insertion or deletion of one digit pair from the search or file vector. (To see why these particular hops are appropriate one must study the effect of ridge shifting on the staggered search matrix C.)

These three particular hops are not the only ones that could have been allowed; hopping from g (r, s) to either of g (r + 1, s + 3) or g (r - 1,s + 2) can be useful in repairing breaks caused when the generating line passes the wrong side of a bifurcation. The selection of the three described above, however, has been found to be the most effective selection in aiding match scores without unnecessarily aiding mismatch scores.

These three different types of hop can be combined in any one string — although compounding hops simultaneously to make longer hops is not allowed! If, for example, HOPS = 5, then the final score should represent the sum of the highest scoring string that can be found in the condensed matrix G, allowing up to five different hops per string, any one of which can be of any one of the three types described.
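One way to evaluate such a score is a simple dynamic programme over the condensed matrix. The sketch below is an illustration only and is not the array-operation method used in MATCH4; the function name `best_string_score` and the zero-based indices are assumptions. For each cell it keeps the best string sum ending there using at most h hops, so the work grows linearly with HOPS.

```python
NEG_INF = float("-inf")


def best_string_score(G, hops):
    """Highest string sum in G when up to `hops` hops may be used per string.

    A string runs left to right through non-negative cells of one row.
    Permitted hops from g(r, s) are to g(r, s + 2), g(r + 1, s + 2) and
    g(r - 1, s + 1).
    """
    rows, cols = len(G), len(G[0])
    # best[h][r][s]: best sum of a string ending at (r, s) with at most h hops.
    best = [[[NEG_INF] * cols for _ in range(rows)] for _ in range(hops + 1)]
    top = 0
    for s in range(cols):
        for h in range(hops + 1):
            for r in range(rows):
                if G[r][s] < 0:
                    continue                  # negative cells end a string
                options = [0]                 # start a new string here
                if s >= 1:
                    options.append(best[h][r][s - 1])       # plain continuation
                if h >= 1:
                    if s >= 2:
                        options.append(best[h - 1][r][s - 2])          # from g(r, s-2)
                        if r >= 1:
                            options.append(best[h - 1][r - 1][s - 2])  # from g(r-1, s-2)
                    if s >= 1 and r + 1 < rows:
                        options.append(best[h - 1][r + 1][s - 1])      # from g(r+1, s-1)
                best[h][r][s] = G[r][s] + max(options)
                top = max(top, best[h][r][s])
    return top
```

Applied to the five-row excerpt of a condensed matrix given in the text, this returns 259 with HOPS = 0 (the unbroken part of the third row) and 401 with HOPS = 1 (the same string continued in the fourth row after one hop).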

The calculation of such scores is accomplished by a further series of simple array operations. They are not described here. It is worth pointing out that the number of operations required for this step increases linearly with the value of HOPS, and not exponentially as might have been expected. In the algorithm for MATCH4 the hopping section is one single iterative loop, which is repeated HOPS times. It is bypassed whenever HOPS is set at zero.

COMPARISON STAGE 7 --- PRODUCT CALCULATION AND SCORE FORMULATION.

Formulating a score from the condensed matrix G provides a further variety of options. Score evaluation is made dependent on the single highest-scoring series in the condensed matrix rather than on a combination of all the different string sums. The best series invariably scored so much higher than all the others that it rendered them almost insignificant. Ignoring strings other than the best one obviates the need to take antilogs, add, and then reconvert to logs.

The score thus obtained is logarithmic in nature.

COMPARISON STAGE 8 --- SCORE NORMALIZATION PROCEDURE.

Examination of the lower match scores from MATCH4 showed that they were often produced when the search prints had been of relatively low quality: some were badly scarred (producing many 'A's in their vectors) and others were not clear in parts (producing many 'B's). With high proportions of 'A's and 'B's present, and perhaps with a high proportion of ridges running 'out of sight', large scores were just not possible, even if that vector had been faithfully reproduced within the file set.

The intention of score-normalization was to adjust scores from each comparison according to the amount of, or lack of, good information in both the search and file prints. The justification for such a procedure lies in this argument: if a search vector contains little information and a large part of it is found in a file vector, then this may be just as significant (in indicating a possible match) as had the search vector had plenty of information, only a little of which had appeared in the file vector. A mediocre score from a poor print is better than a mediocre score from a good print.

The method used in MATCH4 was to compare the search and file vectors each with itself (using the matching algorithm) and see what scores were obtained. Those scores are a very meaningful indication of the quality (i.e. rarity) and quantity of information in the search and file vectors. They represent the sum of one continuous string in the condensed matrix which covers the whole length of the vectors. They are, for each, the perfect score. They are the maximum that could possibly be achieved by any vector compared to them. This would not need to be done for the file vector every time a search was conducted; each file vector would have its self-mate score calculated just once when it was introduced to the collection; the self-mate score would then be stored along with the file vector, and it would be referenced each time that file vector was used in comparison. A file vector's self-mate score would have to be recalculated only when the scoring system, for that file, was reappraised by a new fileset analysis.

Suppose there were n vectors in the file, called B_1, ..., B_n. Suppose perfect scores obtained for each by self-matching were called R_i, i = 1, ..., n. Suppose, further, that a particular search vector A_j gave a perfect self-match score of Q_j and that A_j compared with B_i gave a raw score (i.e. not normalized in any way) of T_ij.

Then the normalization used gives a final score of:

This formula gives final percentage scores. Scores thus normalized appear as real numbers in the range 0 to 100. Real numbers are only used at this very last stage of the comparison process. The raw score (before normalization) was an integer.

TOPOLOGICAL COORDINATE SYSTEMS

For the purpose of being able to search a large file for a fingerprint resembling a given fragmentary (or 'latent') scene-of-crime mark, a system for recording a complete topological description of a fingerprint, or of any part thereof, is needed.

In its most general form the system designed records the topological irregularities as a series of small changes in what is otherwise assumed to be a smooth laminar ridge-flow pattern. A 'sweeping line' or 'scanning' system, shown generally in Fig. 11, is used, whereby a scanning line SL moves across the fingerprint in a predetermined manner. Whenever it passes over an irregularity (which may either be a 'characteristic' or some other type of irregularity such as a ridge coming into sight, or going out of sight), the irregularity is recorded by the use of 3 coordinates:

(1) a type code (T) to show which type of irregularity it is.

(2) a measure (M) of the scanning line position when it hits the irregularity.

(3) a ridge count (R), which is the number of ridges intersecting the scanning line, at that position, between the irregularity and some predetermined point on the scanning line.

A collection of coordinate sets of the form (T, M, R) specifies the topology of a fingerprint (or any part thereof) uniquely. A fourth coordinate (D) may be added, which will record the actual linear distance between the irregularity and a predetermined point on the scanning line (which may or may not be the same point as that used in determining the ridge count). Then it will be seen that (D) and (M) together are sufficient to specify the spatial position of each irregularity. Thus coordinate sets of the form (T, M, R, D) give a complete topological and spatial description of the fingerprint scanned.

Such scanning systems could well take the form of a vertical line scanning horizontally across the fingerprint as shown in Fig. 12(c). (In which case (M) would measure the horizontal distance moved by a scanning line, and ridge counts could be made using a point on the scanning line which was always below the entire visible fingerprint. (D) is then the vertical height, making the pair (M, D) analogous to cartesian coordinate pairs.) Similarly, it could take the form of a horizontal line scanning vertically as shown in Fig. 12(b). The system only requires that a scanning pattern be established, and that the coordinate (M) be used as a measure of how far the scan has progressed.

The particular scanning system selected as most suitable for use with fingerprints is the radial scanning line system, where the scanning line is pivoted at a selected (fixed) point on the print, and (M) takes the form of an angular measure (θ), where (θ) is the angle between the position of the scanning line and some fixed predetermined direction. The 'pivot' point, in this case, is used as the predetermined point for measuring ridge counts (R), and for recording the distances (D) in the four coordinate system. The pair (D, θ) therefore becomes analogous to polar coordinates. The scanning pattern selected is simply a clockwise sweep of the pivoted radial scanning line.

TOPOLOGICAL COORDINATE SET EXTRACTION (FIGURES 2, 2(a), 2(b), 5, 11, 12(a), 12(b) and 12(c))

The topological coordinate set extractor shown in Fig. 2(a) has a split display screen 100, the upper screen 100U displaying the output from scanner 13 of Fig. 1, and a lower screen portion 100L which displays the enhanced or thinned image from image enhancement unit 14 of Fig. 1. The gray scale image in display 100U is used to allow the operator to locate voids, scars, discontinuities, and a count is made of the ridges running into and out of voids, scars, discontinuities, etc.

In this embodiment, scanning of the enhanced displayed fingerprint 101E is by a rotating ridge scan line 102 which begins its scan of print 101E from a predetermined scan start position or 'cut' 102S and rotates in a predetermined scan direction such as clockwise (see Fig. 12). The origin or observation point 104 is the center of rotation of scan line 102 and is shiftable by track ball 105 in controller 180; as noted above, the observation point 104 is preferably located just off of any ridge. The right side 100R of display 100 may be used together with the thumbwheel cursor controls 105 and 106 to cause a relative rotation of the scan line 102, and provide a measure of the angular displacement (θ). Movement of a cursor point to the characteristic being coded will give a measure of the radial distance which can be calculated when a key on the key pad or pushbutton array 107 is depressed.

Thus, the origin of ridge scan line 102 is located at a central observation point 104 located just off of a ridge (see Fig. 12 and Fig. 5(b)) by rotating track ball 105, and thumbwheel 105 is turned to cause a relative rotation of the scan line 102 in a predetermined direction (clockwise, for example) to the first irregularity (which may be the ridge bifurcation indicated as I in Fig. 12(a)). The movement of thumbwheel 105 is a measure of the angle (θ) swept by the scan line 102 to the irregularity, and the number of ridges crossed by the scan line from the central observation point to the irregularity is the ridge count which, in Fig. 12, is five (5). The count can be done by the operator or automatically. The automatic ridge count or the manual ridge count may be selected by actuation of the override ridge count switch 108. Pushbutton array 107 is used to enter 1) the radial distance (D), 2) ridge count (R), 3) angle (θ) of the scanning ridge line 102 relative to the start or cut position 102S and 4) the type code (T). These parameters are displayed in the right hand display portion 100R.

The horizontal and vertical line scan system shown in Figures 12(b) and 12(c) may be implemented using a topological coordinate extractor shown in Figure 2(b) which, except for the use of horizontal and/or vertical scanning lines, is similar to Figure 2(a). As in Figure 2(a), a split screen 100' uses the upper screen 100U' to display the gray scale image to allow the operator to locate voids, scars, discontinuities, and obtain a count of ridges running into and out of the voids, scars, etc. The lower portion 100L' of the display 100' displays the thinned or enhanced image of the print from image enhancement unit 14 of Figure 1. The apparatus of Figure 2(b) provides the user with the choice of two scan lines, horizontal or vertical (or more, since the radial scan feature of Fig. 2(a) can easily be accommodated in this unit), which may be implemented automatically under computer control to perform the scanning functions illustrated in Figures 12(b) and 12(c).

The horizontal scan line HSL is activated by switch HC and the vertical scan line VSL is activated by switch VC. In each case, the scan line is moved across the print image (horizontally for vertical scan line VSL, or vertically for horizontal scan line HSL) by scan thumbwheel STW. Points of tangency are 'recurves' (labeled RCH or RCV in Figures 12(b) and 12(c), respectively) which are found wherever ridges are tangential with the scanning line. They are treated as irregularities and used for topological reconstruction purposes, either as preliminary to matching or pictorial reconstruction. Movement of a cursor from a prescribed origin "V" or "H" to a particular irregularity (or recurve) can be performed by cursor thumbwheel controller CTW. A code type is entered on pushbutton keyboard CKB. As the cursor is moved from the origin to the irregularity (IH and RCH of Figure 12(b) or IV and RCV of Figure 12(c)), the number of ridges intersected by the scanning line between the irregularity and the predetermined point ("V" or "H") is recorded automatically along with a measure (M) of the scanning line position when it meets the irregularity. As noted earlier, the actual linear distance between the irregularity and the point on the scanning line may be used to provide a fourth coordinate (D) which may be added to the information recorded concerning the irregularity. (When (M) is measured on the X-axis, (D) can be measured on the Y-axis.)

In Figure 1c, there is illustrated a generalized block diagram of a topological coordinate set extractor system as an expansion on the block diagram of Fig. 1a. After the image of the print has been enhanced by image enhancement unit 14, the enhanced image is supplied to topological coordinate extractor 16 which includes minutiae detector 16MD, which is standard. Each detected minutia is assigned a type code (T) by type code generator 16G, which type code is supplied to coordinate set processor 20CS in general purpose computer 20. When the horizontal scan line HSL (Fig. 12b) is moved vertically, the scan locating data (M) of each successive characteristic is forwarded by detector 16MD to coordinate set processor 20CS. In regard to Fig. 12a, the angle (θ) constitutes the scan location data (M). In Fig. 12b and Fig. 12c, the distance from the Y-axis or the X-coordinate constitutes (M). Any blurs such as scars or voids are found by scar/void detector 16V (by virtue of the absence of discernible ridge lines in a given area). When any scan line passes through a detected void or scar area, no ridges are counted within that region. The points where ridges run into, or out of, such a region are recorded as irregularities. As noted in Fig. 1c, ridge counter 16R counts ridges from the central observation point 104 in the radial scan system (Fig. 2a) or from selected observation points V or H (Figs. 2b, 12b and 12c).

Distance calculator 16DC provides the distance measurement (D). In regard to Fig. 12a, the distance from the central observation point 104 to the irregularity is the distance (D) along the radial scan line to the irregularity as computed by a simple arithmetic operation on the spatial coordinates X, Y; and in Fig. 12b and Fig. 12c, this is simply constituted by the X or Y coordinate for the minutiae position. In some cases it may be desirable to use more than one scan line, such as both horizontal and vertical scan lines to obtain more data, and obviously, the scan line need not be horizontal or vertical.
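For the radial scan, the arithmetic on the spatial coordinates X, Y can be sketched as below. Several conventions are assumed for the illustration and are not stated in the text: the Y-axis points upwards, the cut direction is given in degrees measured counterclockwise from the positive X-axis (defaulting to straight up), and (θ) is returned in degrees clockwise from the cut. The names `TopoCoord`, `polar_from_xy` and `make_coord` are hypothetical.

```python
import math
from typing import NamedTuple


class TopoCoord(NamedTuple):
    """One coordinate set (T, theta, R, D) for a single irregularity."""
    T: int        # type code, a hexadecimal digit 0..15
    theta: float  # clockwise angle from the cut, in degrees
    R: int        # ridge count from the observation point
    D: float      # radial distance from the observation point


def polar_from_xy(x, y, cx, cy, cut_deg=90.0):
    """Return (theta, D) of point (x, y) about the observation point (cx, cy)."""
    dx, dy = x - cx, y - cy
    ccw = math.degrees(math.atan2(dy, dx))  # counterclockwise from +X
    theta = (cut_deg - ccw) % 360.0         # clockwise sweep from the cut
    return theta, math.hypot(dx, dy)


def make_coord(type_code, ridge_count, x, y, cx, cy, cut_deg=90.0):
    """Assemble a full (T, theta, R, D) coordinate set."""
    theta, distance = polar_from_xy(x, y, cx, cy, cut_deg)
    return TopoCoord(type_code, theta, ridge_count, distance)
```

With the cut pointing straight up, a point directly to the right of the observation point has θ = 90 degrees and a point directly below it has θ = 180 degrees.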

The methods for file print coding, latent mark coding, topological reconstruction and print comparison based on the particular selected scanning line system (radial scanning) will now be explained in turn.

COORDINATE SCHEME: FILE PRINT CODING - RADIAL SCANNING.

A central observation point is selected to be adjacent to the core in the case of loops and whorls, and at the base of the up-curve (the point at which a "summit line" can begin to be seen) on arches. It is preferred that the observation point be placed in a valley rather than on a ridge so as to give unambiguous ridge counts in every direction.

All of the irregularities are then recorded by sets of topological coordinates of the form (T, θ, R, D). The type of irregularity is shown by a single hexadecimal digit and the allocation of digits is closely related to the allocation already in use for ridge-exploration events. The list of possible irregularities, with their hexadecimal codes is given here. The descriptions can best be understood clearly if you think of these irregularities as being passed over by a pivoted radial line which is sweeping in a clockwise direction.

TABLE OF IRREGULARITY TYPES.

Code 0: ridge runs out of sight.

Code 1: ridge comes into sight.

Code 2: bifurcation facing anticlockwise.

Code 3: ridge ending.

Code 4: ridge recurves with the effect of losing two ridges.

Code 5: ridge recurves with the effect of gaining two ridges.

Code 6: facing ridge ending (i.e. facing in the opposite direction to a '3').

Code 7: bifurcation ahead (i.e. a '2' reversed).

Code A: ridge runs into scarred tissue.

Code B: ridge runs into an unclear area.

Code C: compound characteristic (2 ridges in, and 2 ridges out).

Code D: ridge emerges from scarred tissue ('A' reversed).

Code E: ridge emerges from unclear area ('B' reversed).

LATENT SEARCHING: TOPOLOGICAL COORDINATE SYSTEMS.

The most desirable latent data form is a complete and objective description of the latent tracing. The tracing process itself still is, and always will be, substantially subjective — but it ought to be the last stage requiring subjective judgement. A set of topological coordinates of the form (T, θ, R, D), (showing type, angular orientation, ridge-count and distance) provides a complete topological and spatial description, and it therefore becomes the basis for latent data entry. The latent mark data can then be presented in much the same form as the file print data.

The manual latent data preparation process is fairly simple. First the mark is traced (enlarged to 10x magnification). Then the position of the central observation point is guessed by the fingerprint expert, and its position marked on the tracing. The assumed core point position may be some way away from the 'visible' part of the latent. Then the correct orientation of the mark is estimated by the expert, and the coordinates of the characteristics and other irregularities can then be written down. Fig. 5a discloses an extremely useful tool for this operation. A large board (50) with a pin hole (51) at its centre has angular divisions marked around the circumference of a 7 or 8 inch circle (i.e. much like an oversized 360° protractor). A transparent ruler (52) is then pivoted at the pinhole (51) in the centre. When the tracing has been made it is placed over the board (50), the pivot pin (51) pressed through the central observation point (51). The tracing falls entirely inside the protractor markings, and the ruler is long enough to reach those markings. Radial movement of the transparent ruler (which has one central line (53) on it) over the tracing makes it very easy to count the ridge-counts for each irregularity, to measure radial distances (these are marked on the ruler in the appropriate units), and to read off the angular orientations from the circumference of the inscribed circle.

We shall record radial distances in units of 0.5 mm (or 0.5 cm on the 10x enlargement) and round to the nearest integer. No greater accuracy is either required or useful. These distances then appear as integers in the range 0 to 50. The type code (T) is a hexadecimal integer, the angular orientation (θ) an integer in the range 0 - 360, and the ridge count (R) an integer in the range 0 to 50. The total storage space required for all four coordinates is therefore close to 3 bytes; to be precise, it is 25 bits (4 for T, 9 for θ, and 6 each for R and D).
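The 25-bit figure follows from the stated ranges: 4 bits for a hexadecimal digit, 9 bits for 0 to 360, and 6 bits each for two values in 0 to 50. A minimal sketch of one possible packing is given below; the field order and the function names `pack_coordinate` and `unpack_coordinate` are illustrative assumptions, since the specification does not prescribe a storage layout.

```python
def pack_coordinate(t, theta, r, d):
    """Pack one (T, theta, R, D) set into a 25-bit integer.

    Assumed layout (hypothetical): T in bits 21-24, theta in bits 12-20,
    R in bits 6-11, D in bits 0-5.
    """
    if not (0 <= t <= 15 and 0 <= theta <= 360 and 0 <= r <= 50 and 0 <= d <= 50):
        raise ValueError("coordinate out of the stated range")
    return (t << 21) | (theta << 12) | (r << 6) | d


def unpack_coordinate(word):
    """Recover (T, theta, R, D) from a word built by pack_coordinate."""
    return (word >> 21) & 0xF, (word >> 12) & 0x1FF, (word >> 6) & 0x3F, word & 0x3F
```

For example, `unpack_coordinate(pack_coordinate(3, 270, 12, 37))` gives back `(3, 270, 12, 37)`.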

'WRAP AROUND' 360° SECTOR.

The sector to be recorded may be a small part of a fingerprint, in which case the area to be coded should be a sector as enclosed by two radial lines. However the normal assumption is that the whole of the visible fingerprint pattern will be coded. Such a sector can be enlarged at will by moving the radial boundary lines, until such time as the internal angle reaches 360°. At that stage the two boundary lines coincide, and the line on which they coincide will be called the cut. Provided our topological reconstruction algorithm can cope with the fact that, at the cut, some ridges effectively leave one end of the sector and reappear at the opposite end, then we can forget about the existence of the boundary lines altogether.

The reconstruction algorithm will need to be told how many ridges need to be connected up in this way, and that number (which is the number of ridges that cross the cut) will be recorded as a part of the fingerprint data. It is convenient to specify that the cut will be vertically below the central observation point, and that the ridges which cross it be called moles (as they pass underneath the observation point).

The coordinate system can now be used to describe the complete topology of a whole fingerprint.

TOPOLOGICAL RECONSTRUCTION FROM COORDINATE SETS.

The method to be described here is certainly not the only way it could be done, but this one does work very well, is probably as fast as any could be, and leads directly to the point at which no further work is required to be done in order to extract characteristic-centred vectors from the reconstruction. In fact all the characteristic-centred code vectors can be simply lifted out of the array formed by this method.

It will be noticed that the fourth coordinate (D) is ignored throughout this section as it plays no part in the reconstruction process. It is used in the comparison algorithms only after the topology has been restored.

Let us suppose that the print to be reconstructed has m moles and n topological irregularities, whose coordinates are the set (T_i, θ_i, R_i, D_i): i = 1,...n.

THE 'CONTINUITY' ARRAY.

This reconstruction method involves the systematic development of a large 3-dimensional array, which will be called the 'continuity' array (C), comprising elements c(i,j,k). To understand the function of this array it is necessary, first, to examine figure 12: it shows a (simplified) fingerprint pattern with a selected central observation point and the radial cut vertically downwards. A radial line from the central observation point is drawn marginally to the clockwise side of every topological irregularity in the picture (whether it be a true characteristic or not). If there are n irregularities (which we will call I₁ ... I_n), there are n + 1 radial lines in total (this includes the cut). Calling the cut line l₀ and numbering the lines consecutively in a clockwise direction gives the set of lines {l₀, l₁, ... l_n}.

Now re-order the topological coordinate set by reference to the second coordinate (θ), so that the coordinate set satisfies the condition θ_i ≤ θ_{i+1} for all i ∈ {1, 2, ... n - 1}.

There are then simple 1-1 mappings between the lines {l₁, ... l_n}, the irregularities {I₁, ... I_n} and their coordinates {(T_i, θ_i, R_i, D_i) : i = 1, ... n}.

Each of the lines intersects a certain number of ridges, giving an ordered sequence of ridge intersection points. Let the number of ridges crossed by line l_i be called r_i. Further, let the ridge intersection points on the line l_i be called points {p(i,j) : j = 1, ... r_i}, point p(i,1) being the closest to the central observation point and p(i,r_i) being the closest to the edge of the visible print.

The continuity array C is then set up with a direct correspondence between the ridge intersection points p(i,j) and the elements of C, namely c(i,j,k). k takes the values 1 to 4, and thus there is a 4 to 1 mapping of the elements:

{c(i,j,k) : i = 0, ... n : j = 1, ... r_i : k = 1, 2, 3, 4}

onto the set of ridge intersection points:

{p(i,j) : i = 0, ... n : j = 1, ... r_i}

The array C can therefore be used to record four separate pieces of information about each of the ridge intersection points.* The meanings assigned to each element of C are as follows:

c(i,j,1): "what is the first event that topological exploration from the point p(i,j) in an anticlockwise direction will discover?"

c(i,j,2): "which of the irregularities I₁ ... I_n is it that such anticlockwise exploration will discover first?"

c(i,j,3): "what is the first event that topological exploration from the point p(i,j) in a clockwise direction will discover?"

c(i,j,4): "which of the irregularities I₁ ... I_n is it that such clockwise exploration will discover first?"

* The part of the matrix C which will be used for any one print is therefore irregular in its 2nd (j) dimension.

c(i,j,1) and c(i,j,3) should, therefore, be ridge-tracing event codes in the normal hexadecimal integer format (not to be confused with the different set of hexadecimal codes currently being used for the irregularity type (T_i)). c(i,j,2) and c(i,j,4) are integers in the range 1 to n which serve as pointers to one of the coordinate sets. They are a kind of substitute for distance measures (being associated with c(i,j,1) and c(i,j,3) respectively) but they act by referring to the coordinates of the irregularity found, rather than by giving an actual distance. They will be called irregularity indicators in the following few sections.

OPENING THE CONTINUITY ARRAY.

To begin with, the whole of the continuity array is empty (and, in practice, all the elements are set to -1). It will be filled out successively starting from the left hand edge (i = 0) and working across to the right hand edge (i = n).

Starting with i = 0 (at the cut, FIG. 12a) we know only that r₀ = m (the number of ridges crossing the cut is the number of moles recorded in the data). Nothing is known (yet) about any of these ridges. The first set of entries in the continuity array is made by assigning a dummy number to every possible ridge exploration from the line l₀.

The dummy numbers are integers in a range which cannot be confused with real event-codes.* Each dummy number assigned is different, and the reconstruction algorithm views them thus:

"I do not yet know what happens along this ridge — I will find out later — meanwhile I need to be able to follow the path of this ridge segment, even before I find out where it ends."

This first step in filling in the continuity matrix is therefore to assign dummy numbers to each of the elements {c(0,j,k) : j = 1, ... r₀ : k = 1 or 3}.

The elements {c(0,j,k) : j = 1, ... r₀ : k = 2 or 4} are left untouched for now.
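The opening step can be sketched as follows. The representation is an assumption made for illustration: each column of C is a list of four-element rows `[c(i,j,1), c(i,j,2), c(i,j,3), c(i,j,4)]`, and the function name `open_continuity_array` is hypothetical. The values -1 for empty elements and 100 for the first dummy number are taken from the text.

```python
from itertools import count

EMPTY = -1          # value of an unfilled element, as stated in the text
FIRST_DUMMY = 100   # dummy numbers start at 100, as stated in the text


def open_continuity_array(moles):
    """Build the i = 0 column for a print with `moles` ridges crossing the cut.

    Returns (column, next_free_dummy). Row j-1 of the column holds
    [c(0,j,1), c(0,j,2), c(0,j,3), c(0,j,4)].
    """
    dummies = count(FIRST_DUMMY)
    column = []
    for _ in range(moles):
        anticlockwise = next(dummies)   # unknown exploration, k = 1
        clockwise = next(dummies)       # unknown exploration, k = 3
        # The irregularity indicators (k = 2 and k = 4) stay empty for now.
        column.append([anticlockwise, EMPTY, clockwise, EMPTY])
    return column, next(dummies)
```

With three moles this yields rows holding the dummy pairs (100, 101), (102, 103) and (104, 105), and the next free dummy number is 106.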

ASSOCIATION, ENTRIES, AND DISCOVERIES IN THE CONTINUITY ARRAY.

* In practice dummy numbers start at 100 and, whenever another one is needed, the next free integer above 100 is used. Obviously a record is kept of how many different dummy numbers have been assigned.

The next stage is to consider each of the coordinate sets (T_i, θ_i, R_i, D_i) in turn, starting with i = 1. We know that the irregularity I₁ is the only change in the laminar flow between lines l₀ and l₁. We also know its type (T₁) and its ridge-count (R₁). Depending on the type T₁ there are various associations, entries and discoveries that can be made in the continuity array.

Suppose, for example, that T₁ = 3 (i.e. a ridge ends, according to the table of irregularity types). We can deduce that

r₁ = r₀ - 1

(i.e. line l₁ crosses one less ridge than line l₀), and we can make the following associations in the second column (i = 1) of the continuity array. (Associations occur when one element of the array is set equal to another.)

c(1,j,1) = c(0,j,1) for all 1 ≤ j ≤ R₁ - 1, c(1,j,3) = c(0,j,3) for all 1 ≤ j ≤ R₁.

(i.e. ridges below the irregularity pass on unchanged) also:

c(1,j,1) = c(0,j + 1,1) for all R₁ + 2 ≤ j ≤ r₁, c(1,j,3) = c(0,j + 1,3) for all R₁ + 1 ≤ j ≤ r₁.

(i.e. ridges above the irregularity pass on unchanged, but are displaced downwards by one ridge, due to the R₁ + 1'th ridge coming to an end.)

Thus many of the dummy numbers from the (i = 0) column are copied into the (i = 1) column and their successive positions show which ridge intersection points lie on the same ridges.

Further information is gained from the immediate vicinity of the irregularity and this allows us to make entries in the array. (Entries result directly from the coordinate set being processed, rather than by copying from another part of the array.)

c(1,R₁, 1) = 8, c(1,R₁,2) = 1, c(1,R₁ + 1,1) = 6, c(1,R₁ + 1,2) = 1.

(i.e. the line l₁ is drawn marginally past the ridge-ending I₁, and so that ridge-ending appears as a facing ridge ending in anticlockwise exploration from ridge intersection points p(1,R₁) and p(1,R₁ + 1). The event seen, in each case, is I₁ itself.) We also have discovered what happened to the ridge that passed through the point p(0,R₁ + 1): it ended (code 3) at irregularity I₁. That discovery enables us to note the fact that the ridge exploration clockwise through point p(0,R₁ + 1) ended here. The existing entry in c(0,R₁ + 1,3) is a dummy number, and the new found meaning for that number is recorded in the dummy number index. Suppose the dummy entry had been the number 107: then we store its meaning thus:

index (107)= (3,1)

Eventually all the appearances of the number 107 in the array will be replaced by '3', and, at the same time, all the associated irregularity indicators will be set to '1'.

Knowledge of T₁ and R₁ has therefore enabled us to make a particular set of associations, entries and discoveries, from which it has been possible to place something (either entries or dummy numbers) in all of the elements of the set:

{c(1,j,k) : j = 1, 2, ... r₁ : k = 1 or 3}
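The ridge-ending case (T = 3) described above can be sketched as a single function that builds the next column from the previous one. This is a minimal illustration rather than the LM6 code: the list-of-rows representation, the function name `process_ridge_ending`, and the generalisation from columns 0 and 1 to any pair of adjacent columns are assumptions. The event codes 8 and 6 and the index entry (3, irregularity) are the values given in the text.

```python
FIRST_DUMMY = 100  # dummy numbers start at 100, as stated in the text


def process_ridge_ending(prev, ridge_count, irregularity, dummy_index):
    """Fill the next column of C when the irregularity is a ridge ending.

    prev          rows [c(i-1,j,1..4)] of the previous column, j = 1..r
    ridge_count   R, so that ridge number R + 1 is the one that ends
    irregularity  the number of the irregularity being processed
    dummy_index   dict updated in place with the discovery
    """
    r_prev = len(prev)
    if not 0 <= ridge_count < r_prev:
        raise ValueError("ridge count does not leave a ridge to end")
    new = []
    for j in range(1, r_prev):          # the new line crosses one ridge fewer
        below = prev[j - 1]             # ridge j on the previous line
        above = prev[j]                 # ridge j + 1 on the previous line
        # Anticlockwise exploration (k = 1 and its indicator k = 2).
        if j <= ridge_count - 1:
            anti = below[0:2]           # association: passes on unchanged
        elif j == ridge_count:
            anti = [8, irregularity]    # entry given in the text
        elif j == ridge_count + 1:
            anti = [6, irregularity]    # entry given in the text
        else:
            anti = above[0:2]           # association: displaced down by one
        # Clockwise exploration (k = 3 and its indicator k = 4).
        clock = below[2:4] if j <= ridge_count else above[2:4]
        new.append(list(anti) + list(clock))
    # Discovery: the clockwise exploration of the ended ridge stops here.
    dummy = prev[ridge_count][2]
    if dummy >= FIRST_DUMMY:
        dummy_index[dummy] = (3, irregularity)
    return new
```

Starting from a three-mole opening column with dummy numbers 100 to 105 and processing a ridge ending with R = 1 as irregularity 1, the new column is `[[8, 1, 101, -1], [6, 1, 105, -1]]` and the index records `103 -> (3, 1)`.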

The process now begins again, with examination of irregularity I₂ followed by I₃ ... I_n. Each different possible type code T_i generates its own individual set of associations, entries and discoveries. Each set allows the next column of C to be filled in.* It should be pointed out that whenever association is made of event codes (as distinct from dummy numbers) then association is also made of their respective irregularity identifiers.

After all the n coordinate sets have been processed (and entries thereby made in the whole of the continuity array) a few last associations need to be made in order to account for the fact that ridges cross the cut. These associations are that:

c(0,j,1) is equivalent to c(n,j,1) for all 1 ≤ j ≤ r₀, and c(n,j,3) is equivalent to c(0,j,3) for all 1 ≤ j ≤ r₀. (Of course r₀ = r_n = m.)

which effectively 'wrap around' the ends of the continuity array by sewing up the cut. As each of these elements of C already has some sort of entry in it, the mechanics of making these associations are more akin to the normal mechanics of discovery, in that they involve making entries in the dummy number index. They may, in fact, enter dummy numbers in the dummy number index thus indicating that two different dummy numbers are equivalent (i.e. they represent the same ridge exploration).

* Some of the entries may well be new (unassigned) dummy numbers. This occurs wherever new ridge segments start at the irregularity. It did not happen in the case of the ridge ending.

PROPERTIES OF THE COMPLETED CONTINUITY ARRAY.

Once this process is complete the continuity array will have acquired some very important properties:

(a) all the elements {c(i,j,k) : 0 ≤ i ≤ n : 1 ≤ j ≤ r_i : k = 1 or 3} contain either ridge exploration event codes (hexadecimal) or dummy numbers (integers over 100).

(b) wherever c(i,j,1) or c(i,j,3) is an event code, then the corresponding entries, c(i,j,2) or c(i,j,4) respectively, will contain an irregularity identifying number that shows where that ridge event occurs.

(c) all the different appearances of a particular dummy number in the continuity array reveal all the intersection points through which one continuous ridge exploration has passed. (Hence the name for the array.)

(d) a discovery has been made in respect of every dummy number that has been allocated, and there is, in the dummy number index, an equivalent event code and associated irregularity identifier waiting to be substituted for all the appearances of that dummy number. The dummy number index is therefore complete. This simply must be the case as a discovery has been recorded every time that a ridge ran into an irregularity. There can be no ridge explorations that do not end at one, or other, of the n irregularities; consequently there can be no outstanding 'unsolved' ridge explorations by the time all n irregularities have been dealt with.

FINAL STAGE OF TOPOLOGICAL RECONSTRUCTION.

The final stage of the reconstruction process is to sweep right through the continuity matrix replacing all the dummy numbers with their corresponding event codes from the index. The related irregularity identifiers are filled in at the same time, also from information held in the index. This second (and final) sweep through the elements of the continuity array leaves every element in the set:

{c(i,j,k) : i = 1, ... n : j = 1, ... r_i : k = 1 or 3}

as an event code, and every element of the set:

{c(i,j,k) : i = 1, ... n : j = 1, ... r_i : k = 2 or 4}

as an irregularity identifier.
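The final sweep can be sketched as below. The representation is assumed for illustration: the dummy number index is a dict whose values are either an `(event_code, irregularity)` pair or another dummy number (the equivalence case that arises when the cut is sewn up), and the names `resolve` and `final_sweep` are hypothetical.

```python
FIRST_DUMMY = 100  # dummy numbers start at 100, as stated in the text


def resolve(dummy, dummy_index):
    """Follow the index until an (event code, irregularity) pair is reached."""
    seen = set()
    entry = dummy
    while not isinstance(entry, tuple):
        if entry in seen or entry not in dummy_index:
            raise ValueError(f"dummy number {entry} has no recorded discovery")
        seen.add(entry)
        entry = dummy_index[entry]
    return entry


def final_sweep(columns, dummy_index):
    """Replace every dummy number by its event code and irregularity indicator.

    columns is a list of columns; each column is a list of rows
    [c(i,j,1), c(i,j,2), c(i,j,3), c(i,j,4)], modified in place.
    """
    for column in columns:
        for row in column:
            for k in (0, 2):            # positions of c(i,j,1) and c(i,j,3)
                if row[k] >= FIRST_DUMMY:
                    row[k], row[k + 1] = resolve(row[k], dummy_index)
    return columns
```

An index such as `{100: 102, 102: (3, 2), 101: (3, 1)}` therefore turns the row `[100, -1, 101, -1]` into `[3, 2, 3, 1]`.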

For any particular line l_i the entries of C in the ith column correspond exactly to the elements of a topological code vector generated by that line. The only difference in appearance is that we have irregularity identifiers rather than distance measures to go with each exploration event code. The later vector comparison stages of the matching algorithm are adapted with that slight change in mind.

This completes a somewhat simplified account of a rather complex process. There are other complications which have not been explained in full, such as how the algorithm deals with sequences of dummy numbers that are all found to be equivalent, the special treatment that ridge recurves have to receive, and how the algorithm copes with multiple irregularities showing the same angular orientation. Nevertheless this explanation serves well to demonstrate the methodical and progressive nature of this particular reconstruction process. It also makes clear that only two sweeps through the matrix are required, which is surprisingly economical considering the complexity of the operation.

THE MATCHING ALGORITHM LM6. (APPENDIX A, FIGURE 19).

The algorithm LM6 (Appendix A) accepts latent data in coordinate form, rather than by prepared vectors. Topological reconstruction was performed both on the latent mark (once only per search) and on each file print to be compared with it. The continuity matrix generated from the latent coordinate set will be called the search continuity array, and the continuity array generated from the file set will be the file continuity array.

There are two distinct phases of print comparison which take place after these topological reconstructions are complete. Firstly, the appropriate vector comparisons are performed and their scores recorded; secondly, the resulting scores are combined to give an overall total comparison score. The vector comparisons are essentially a way of comparing the topological neighborhoods of each of the characteristics seen on the fingerprints under comparison. The vectors correspond exactly to the topological code vectors (of the type described in connection with the vector matching algorithm MATCH4) that would be generated by each of the radial lines as shown in figure 15. There is one radial line per characteristic, therefore one extracted vector per characteristic.

It is most important to realise that, according to the invention, the observation points selected on the two prints under comparison do not need to have been in the same positions. The reconstructed topology will be the same no matter where it was viewed from, just as two photographs of a house taken from different places look quite different although the house is the same. The final comparison scores will be hardly affected by misplacement of the central observation point provided they lie in roughly the right region of the print. The reason for approximately correct placement being necessary is that the orientation of the imaginary radial lines, which effectively generate the vectors after reconstruction, will depend on the position of the central observation point. The effect of misplacing that point (in a comparison of mates) is to rotate each generating line about the characteristic on which it is based. Such rotation is not important provided it does not exceed 20 or 30 degrees. Slight misplacement of the observation point is not going to materially affect the orientation of these imaginary generating lines, except those based on characteristics which are very close to it. Specifying that the central observation point should be adjacent to the core (in the case of whorls or loops) and at the base of the 'upcurve' (in the case of plain arches) is a sufficiently accurate placement rule.

THE VECTOR COMPARISON STAGE.

From the search continuity array a vector is extracted for each true characteristic on the latent mark. Vectors are not extracted for the other irregularities ('ridges going out of sight', 'ridge recurves', etc.). If the latent mark shows 13 characteristics we then have 13 vectors, each vector based on an imaginary line drawn from the central observation point to one of those 13 characteristics, and passing marginally to the clockwise side of it. Let us now forget about all the other topological irregularities in the coordinate list and number the characteristics 1, 2, 3, ... k. If the number of coordinate sets, in total, was n then certainly k ≤ n. The extracted search vectors can now be called S₁ ... S_k. In a similar fashion the extracted file vectors, each based on true characteristics, can be called F₁ ... F_m.

For each search vector a subset of the file vectors is chosen for comparison. The selection is made on these bases:—

(a) that the characteristic on which the file vector is based must be of similar type (either an 'exact' match or a 'close' match) to the one on which the search vector is based.

(b) that the angular coordinates of the characteristic on which it is based must be within a permissible angular tolerance of the angular coordinate of the characteristic on which the search vector is based. The permissible angular tolerance is a parameter of the algorithm.

This selection essentially looks for file print characteristics that are potential mates for the search print characteristics. The vector comparison that follows serves to compare their neighborhoods. Allowing a wide angular tolerance significantly increases the number of vector comparisons that have to be performed. If only a small angular tolerance is permitted then a badly misoriented latent mark may not have the mated vectors compared at all.
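The selection rules (a) and (b) can be sketched as a simple filter. The function names are hypothetical, and the rule deciding which types count as an 'exact' or 'close' match is passed in as a callable because it is not defined in this section. Treating angular differences as wrapping around at 360° is also an assumption.

```python
def angular_difference(a, b):
    """Smallest difference between two angles in degrees, allowing wrap-around."""
    diff = abs(a - b) % 360
    return min(diff, 360 - diff)


def select_candidates(search, file_minutiae, tolerance, types_match):
    """Return the 1-based numbers of file characteristics worth comparing.

    search         (T, theta) of the search characteristic
    file_minutiae  list of (T, theta) for the file print characteristics
    tolerance      permissible angular tolerance in degrees
    types_match    callable (search_T, file_T) -> bool; placeholder for
                   the 'exact or close' type rule, which is not given here
    """
    search_type, search_theta = search
    return [
        number
        for number, (file_type, file_theta) in enumerate(file_minutiae, start=1)
        if types_match(search_type, file_type)
        and angular_difference(search_theta, file_theta) <= tolerance
    ]
```

With a tolerance of 30° and exact type matching only, a search characteristic `(3, 350)` tested against `[(3, 10), (3, 40), (2, 355), (3, 330)]` selects file characteristics 1 and 4.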

The vector comparison itself is much the same as used hitherto except that the vectors contain irregularity identifiers rather than distance measures. At the appropriate stages of the vector comparison subroutine the actual linear distance ('as the crow flies') from the central characteristic to the ridge-event is calculated by reference to the appropriate coordinate sets. Thus ordinary spatial distances can be used, and a great degree of reliability can therefore be attached to them.

For each search vector S_i and candidate file vector F_j, a vector comparison score q_{ij} is obtained. For each search vector S_i a list of candidate file vectors, with their scores, can be recorded in the form of a list of pairs (j, q_{ij}). There are typically between 5 and 15 such candidates for each search vector when the angular tolerance is set at 30°. These lists of candidates can then be collected together to form a table, which will be called the candidate minutia table. An example of such is shown below.

Each column is a list of candidates for the search vector labelled at the head of the column. In each case the first of a pair of numbers in parentheses shows which file minutia was a candidate, and the second number is the score obtained by its vector comparison.

FINAL SCORE FORMULATION.

We are now left with the problem of intelligently combining these individual candidate scores to give one overall score for the print. If the file print and latent mark are mates it would be nice to think that the highest candidate score in each column of the candidate minutia table indicated the correct matching characteristic on the file print. If that were the case then simply picking out the highest in each column, and adding them together, might serve well as a method of formulating an overall score. However that is not the case. Roughly 50% of true mated characteristics manage to come top (in score) of their column — the others usually come somewhere in the top five places.

THE NOTION OF ' COMPATIBILITY'.

We learnt from earlier experiments with latent entry by vectors that combination of scores was best done subject to conditions and, in that case, the condition was correct relative angular orientation. It will make sense, therefore, to combine the individual candidate scores when, and only when, they are compatible.

If (j, q_{1j}) is a candidate in the S₁ column, and (i, q_{2i}) is a candidate in the S₂ column, then there are various reasonable conditions that can be set in respect of these two candidates before we accept that they could both be correct. We will say that these two candidates are compatible if, and only if, these three conditions hold true:

(a) i is not equal to j. (Obviously one file print characteristic cannot simultaneously be correctly matched to two different search print characteristics.)

(b) The distance (linear) between file print characteristics numbered i and j should be the same, within certain tolerance, as the distance between the two search print characteristics that they purport to match. That tolerance is an important programme parameter.

(c) The relative angular orientation of the file print characteristics should be roughly the same as the relative angular orientation of the two search print minutiae that they purport to match. The tolerance allowed, in this instance, is the same angular tolerance that was used earlier to limit the initial field of candidate minutiae.
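The three conditions can be sketched as a predicate on two candidates. Several details are assumptions made for illustration: each characteristic is given as a (θ, D) pair, "relative angular orientation" is read as the bearing of the line joining the two characteristics, the distance tolerance is taken as a fraction of the search-print distance with a minimum value, and all names are hypothetical.

```python
import math


def to_xy(theta_deg, d):
    """Convert a (theta, D) pair to Cartesian form (any fixed convention works)."""
    a = math.radians(theta_deg)
    return d * math.sin(a), d * math.cos(a)


def separation_and_bearing(p, q):
    """Linear distance from p to q and the bearing of the line joining them."""
    (px, py), (qx, qy) = to_xy(*p), to_xy(*q)
    dx, dy = qx - px, qy - py
    return math.hypot(dx, dy), math.degrees(math.atan2(dx, dy)) % 360


def angular_difference(a, b):
    diff = abs(a - b) % 360
    return min(diff, 360 - diff)


def compatible(i, j, file_i, file_j, search_a, search_b,
               angle_tol=30.0, dist_fraction=0.10, min_dist_tol=1.0):
    """Check conditions (a), (b) and (c) for two candidates.

    i, j                file characteristic numbers of the two candidates
    file_i, file_j      their (theta, D) coordinates on the file print
    search_a, search_b  (theta, D) of the search characteristics they match
    """
    if i == j:                                        # condition (a)
        return False
    s_dist, s_bearing = separation_and_bearing(search_a, search_b)
    f_dist, f_bearing = separation_and_bearing(file_i, file_j)
    dist_tol = max(min_dist_tol, dist_fraction * s_dist)
    if abs(f_dist - s_dist) > dist_tol:               # condition (b)
        return False
    return angular_difference(f_bearing, s_bearing) <= angle_tol  # condition (c)
```

Under this reading, a file pair that is simply the search pair rotated by 10° about the observation point is compatible, while one rotated by 60° is not.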

SCORE COMBINATION BASED ON COMPATIBILITY.

The application of the notion of compatibility in formulating a total score was originally planned as follows:—

Step 1: Reorder the candidates in each column by reference to their scores, putting the highest score in each column in top place.

Step 2: In each column, discard all the candidates that do not come in the top five places.

Step 3: For each remaining candidate check to see which candidates in the other columns are compatible with it.

Step 4: Taking at most one candidate from each column, pick out the highest scoring mutually compatible set that can be found. A mutually compatible set is a set of candidates each pair of which are compatible.

Thus a set of file print characteristics is found, each of which has similar topological neighborhood to one of the latent mark characteristics (as shown by their high vector comparison scores) and whose spatial distribution is very similar to that of the latent mark characteristics (as shown by their compatibility). Spatial considerations are therefore being used in the combination of topological scores, as is already the case at a lower level, when distance measures are used in the vector comparison process.

The algorithm LM5 was originally written to perform the steps described above. Unfortunately it overloaded when it tried to do the comparison of a very good latent with its mate. The reason for this is that the algorithm will examine every possible mutually compatible set in turn. Certainly non-mates have very few mutually compatible sets of any size. However, if a good quality latent gives a largest compatible set of size N (i.e. N characteristics match up well with the file print) then there are 2^N - 1 subsets of that largest set, each of which will be a mutually compatible set. The total number of such sets is therefore at least 2^N, and probably much greater. In some cases N can be so large that the computer could not finish the job.

CANDIDATE PROMOTION SCHEMES.

The following method accomplishes much the same sort of candidate selection, but very much faster, and without requiring complete mutual compatibility in the selected set. The first three steps are the same as before:

1. Reorder the candidates in each column, by their scores.

2. Discard all candidates not ranked in the top 5 places in their column.

3. Check the compatibility of all remaining candidates with the remaining candidates in each other column.

The fourth step is calculation of what will be called a compatible score for each of the remaining candidates. Here are two possible alternative methods for doing this:

(a) For each individual candidate add together all the scores of top-ranked candidates in other columns with which that candidate is compatible. Finally add the candidate's own score to the total.

(b) For each individual candidate find, in each other column, the highest scoring compatible candidate. Add together those scores (one from each column), and then add the target candidate's own score to the total.

On the basis of these compatible scores, rather than on the original vector comparison scores, reorder the remaining candidates in each column.

This 4th step can be regarded as a promotion system based on compatibility with other high-ranking candidates. The difference between options (a) and (b) is this: in rule (a) promotion depends on a candidate's compatibility with those already in top place (and could be called a 'bureaucratic' promotion system). With rule (b) a whole group of candidates in different columns, none of whom are in top place can all be promoted to the top at once by virtue of their strong compatibility with each other (a 'revolutionary' promotion system). Both were tried and the 'revolutionary' system was found to be the most effective. It is the 'revolutionary' rule (b) above that is used in the algorithm LM6.

The promotion stage could be repeated several times if it was considered desirable (to give the top set time to 'settle') — in practice it was found that one application was sufficient. Mate scores improved very little, if at all, when second and third stages of promotion were introduced.

After the promotion stage is complete all but the top ranked candidates in each column are discarded, and the compatible score for the remaining candidate in each column is then recalculated on the basis of only the other remaining candidates.

The final score is then evaluated by adding together all of these new compatible scores that exceed a given threshold. That threshold is a programme parameter, and is expressed as a percentage of the 'perfect' latent self-mated score. The use of these compatible scores, rather than the original vector comparison scores, in evaluating the final score has the effect of multiplying each original vector score by the number of other selected (i.e. now top-ranked) candidates with which it is compatible. The more dense the compatibilities of the final candidate selection, the higher the score will be.
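The promotion scheme with rule (b) and the final score can be sketched as follows. This is an illustration under stated assumptions, not the LM6 listing: the candidate table is a dict mapping each search column to a list of `(file_number, score)` pairs, compatibility is supplied as a callable, the threshold is passed as an absolute score rather than a percentage, and the name `promote_and_score` is hypothetical.

```python
def promote_and_score(table, is_compatible, depth=5, threshold=0.0):
    """One round of 'revolutionary' promotion followed by the final score.

    table          {column: [(file_number, score), ...]}
    is_compatible  callable (col_a, file_a, col_b, file_b) -> bool
    Returns (final_score, {column: selected file_number}).
    """
    # Steps 1 and 2: order each column by score and keep the top `depth`.
    kept = {
        col: sorted(cands, key=lambda c: c[1], reverse=True)[:depth]
        for col, cands in table.items()
    }

    def compatible_score(col, cand, pool):
        # Rule (b): own score plus, from each other column, the highest
        # score among candidates compatible with this one.
        total = cand[1]
        for other_col, others in pool.items():
            if other_col == col:
                continue
            scores = [o[1] for o in others
                      if is_compatible(col, cand[0], other_col, o[0])]
            if scores:
                total += max(scores)
        return total

    # Promotion: keep only the candidate with the best compatible score.
    top = {
        col: [max(cands, key=lambda c: compatible_score(col, c, kept))]
        for col, cands in kept.items() if cands
    }

    # Recalculate against the survivors only and apply the threshold.
    final = 0
    for col, cands in top.items():
        score = compatible_score(col, cands[0], top)
        if score > threshold:
            final += score
    return final, {col: cands[0][0] for col, cands in top.items()}
```

In a three-column table where the second-ranked candidates 11, 21 and 31 are mutually compatible and the top-ranked ones are not, rule (b) promotes all three at once, which is the 'revolutionary' behaviour described above.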

PERFORMANCE OF LM6.

A series of tests was then performed using the algorithm LM6. The best test results obtained gave the following rankings:

Mates ranked in 1st place: 80.36%

Mates ranked 1st-3rd: 82.14%

Mates ranked 1st-10th: 85.71%

These indicate a vast improvement over the performance of traditional prior art spatial methods. Some of the parameter values that gave the above results:

(a) Exact match scores were set to be 5, with close match scores (CMS) set to be 3. Thus close match scores were given a higher relative weighting than previously used in the comparison of rolled impressions (where the optimum ratio had been 5:1). The higher weighting can be attributed to a higher incidence of topological mutation in the interpretation of latent marks.

(b) The distance tolerances were set at 10% (of the distance being checked) with a minimum of 1. The same distance tolerances were used in the vector comparison stage of the algorithm and in the score combination stages (where correct relative distance was one of the three conditions that needed to be satisfied for two file print minutiae to be compatible.)
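The distance tolerance rule in (b) can be illustrated with a short sketch. The function names are hypothetical, and it is an assumption that the tolerance is computed from the search-side distance and that the bound is inclusive; the text only states 10% of the distance being checked, with a minimum of 1.

```python
def distance_tolerance(distance, fraction=0.10, minimum=1.0):
    """Tolerance is a fraction of the distance being checked, never below the minimum."""
    return max(minimum, fraction * abs(distance))


def distances_agree(search_distance, file_distance, fraction=0.10, minimum=1.0):
    """True when the two distances differ by no more than the tolerance.

    Assumption: the tolerance is derived from the search distance.
    """
    tolerance = distance_tolerance(search_distance, fraction, minimum)
    return abs(search_distance - file_distance) <= tolerance


print(distances_agree(40, 43))  # tolerance is 4, difference is 3
print(distances_agree(40, 45))  # difference of 5 is outside the tolerance
print(distances_agree(5, 6))    # the minimum tolerance of 1 applies
```

The minimum keeps very short distances from being held to a tolerance smaller than the measurement resolution.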

(c) The ridge span used in vector comparison was 10 ridges; this means that vectors of a standard length of 40 digits, with 40 associated irregularity indicators, were used whenever vector comparisons were performed. The results were no worse with longer vectors, but the smaller value for SPAN gave faster comparison times on a serial machine.

(d) The minimum angular tolerance (MAT) was 20°. This is almost inconsequential as the true angular misorientation limits were set individually for each latent mark (by subjective judgement) and written as a part of the latent search data.

(e) The candidate minutia selection depth ('DEPTH') was 5 throughout. This means that, for each search minutia, only the top 5 candidate file print minutiae would be considered. This parameter was set to 5 as a result of observation, rather than experiment.

(f) The compatible score cutoff point ('CUTOFF') is the percentage of the latent mark's perfect self-mated score that must be attained by the final compatible score of a candidate file print minutia before it will be allowed to contribute to the final total score. The best value for this parameter was found to be 15%, which is surprisingly high. The effect of this setting was to ensure that the vast majority of file print minutiae that were not true mates for search minutiae contributed nothing to the score; the net effect of this was to make most of the mismatch comparison scores zero. In fact, for 28.6% of the latents used, the true mate was the only file print to score at all, the other 99 file prints all scoring zero. Of course such a stringent setting also made things tough for the mates, as shown by the fact that 7% of the mate scores were zero also. However, these 7% were mates that had not made the top ten places in any of the tests, and were therefore most unlikely to be identified anyway. It is also worth pointing out that on each occasion when one file print alone scored more than zero (i.e. exactly 99 out of the 100 in the file collection scored zero) that one was the true mate. (These are the 28.6% mentioned above.) This represents a surprisingly high level of what might reasonably be termed 'doubt-free identifications'.

COMPUTATION TIMES.

The foregoing description of the algorithm LM5 will have made it quite clear that this is not, in its present form, a particularly fast comparison algorithm, but using the principles set forth herein it can be significantly improved. The CPU time taken on a VAX 11/780 for the above test (5600 comparisons) was 12 hours and 11 minutes. That means an average CPU time per comparison of 7.8 seconds, which is a somewhat disconcerting figure when the acceptable matching speeds for large collections are in the order of 500 comparisons per second.
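The arithmetic behind the 7.8 second figure can be checked directly from the numbers quoted above. The variable names are illustrative only; the resulting speed-up factor (roughly 3,900 times) is derived here and is not stated in the text.

```python
# Figures quoted in the text.
cpu_seconds = 12 * 3600 + 11 * 60   # 12 hours 11 minutes of CPU time
comparisons = 5600                  # comparisons performed in the test
target_rate = 500                   # acceptable comparisons per second

seconds_per_comparison = cpu_seconds / comparisons
# Factor by which the serial implementation falls short of the target rate.
speedup_needed = seconds_per_comparison * target_rate

print(f"{seconds_per_comparison:.2f} s per comparison")
print(f"speed-up needed: about {speedup_needed:.0f}x")
```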

However, 7.8 seconds per comparison is not quite so alarming when one considers the extensive and multi-layered parallelism of the algorithm. At the lowest level, the vector comparisons themselves are sequences of array operations. At the next level, many vector comparisons are done per print comparison. In the score combination stages, calculations of compatibility and compatible scores are all simple operations repeated many, many times. There is, in this algorithm, enormous scope for beneficial employment of modern parallel processing techniques. It is hardly appropriate to take too much notice of the CPU time in any serial computer where each operation is done element by element.

Moreover, in the area of latent searching, the primary area of concern for law enforcement agencies is shifting from the issue of speed onto the issue of accuracy. It is quite reasonable to obtain the necessary speed through 'hardwiring' (with its associated cost) for the sake of matching algorithms that will actually make a substantial number of identifications from latent marks.

FILE STORAGE SPACE DEFAULTING THE 'EDGE TOPOLOGY'.

It is noticeable that the need to include all topological irregularities, rather than just the true characteristics, significantly enlarges the volume of the file print data. In the 100 file cards in the experimental database the average number of irregularities recorded per print was 101.35. The majority of irregularities that were not true characteristics fell at the edge of the print; they recorded all those places where ridges 'came into sight' or 'went out of sight'. Thus a significant proportion of the file data storage requirement is spent in describing the edge of the file print.

In practice the edge of the file print is not very important, as the latent mark invariably shows an area completely within the area of the rolled file print. The edge consequently plays little or no part in the print comparison process, and the edge description serves only to help the topological reconstruction process make sense of the ridge pattern.

For the sake of economy in file size, therefore, the algorithm LM6 was prepared by adapting the reconstruction stage of LM5 slightly. It is adapted in such a way that the reconstruction will invent its own edge topology in the absence of an edge description. The default topology selected is not important; it is only important that the algorithm does something to tie up all the loose ridges around the edge.

The file collection was then pruned substantially by elimination of all of the edge descriptions, and this reduced the average number of coordinate sets per print from 101.35 to 71.35.

The test reported above was then rerun using the algorithm LM6 and the condensed file set. The rankings obtained were exactly the same as before, so a saving of 30% in file data storage was achieved with absolutely no loss of resolution.*
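The quoted 30% saving follows from the two averages given above. The short calculation below also converts them to bytes, assuming the 4 bytes per coordinate set of the (T, θ, R, D) format; the variable names are illustrative.

```python
BYTES_PER_SET = 4  # assumption: one (T, theta, R, D) coordinate set occupies 4 bytes

sets_full = 101.35    # average coordinate sets per print, with edge description
sets_pruned = 71.35   # average after the edge descriptions are removed

saving = (sets_full - sets_pruned) / sets_full
bytes_full = sets_full * BYTES_PER_SET
bytes_pruned = sets_pruned * BYTES_PER_SET

print(f"saving: {saving:.1%}")
print(f"average bytes per print: {bytes_full:.1f} -> {bytes_pruned:.1f}")
```

The exact ratio is a little under 30% (about 29.6%), which the text rounds.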

* The pruning operation was not performed on the latent mark data file for two reasons. Firstly, latent mark databases (where these are kept) are tiny in comparison to rolled file print collections, and so storage requirements are not a major concern. Secondly, the edge of a latent mark does play an important part in the comparison process.

OPTIONAL USE OF FIFTH COORDINATE.

Many existing spatial matching algorithms use local ridge direction data as well as X and Y coordinates for each characteristic located. Thus spatial matching algorithms normally use coordinates of the form (X, Y, θ) where θ shows the direction of ridge flow local to each characteristic.

Such data was not used in the algorithm LM6, but could well be incorporated into the topological coordinate data as a fifth coordinate. The use of that fifth coordinate with the matching algorithm could then be:

a) as a further means of restricting the selection of characteristics on a file print that could be considered as candidates for matching a particular search print characteristic,

b) as a further means for establishing "compatibility" in the score combination stages of the algorithm, and

c) as a corrective measure for rotational misorientation.
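Use (a) in the list above, restricting candidates by local ridge direction, might look like the following sketch. The fifth coordinate was not used in LM6, so everything here is hypothetical: the record layout (a dict with a `direction` key in degrees), the function names, and the choice to compare directions modulo 360°.

```python
def angular_difference(a_deg, b_deg):
    """Smallest absolute difference between two directions, in degrees."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


def restrict_candidates(search_direction, candidates, tolerance_deg):
    """Keep only file print characteristics whose local ridge direction
    lies within the tolerance of the search characteristic's direction."""
    return [
        c for c in candidates
        if angular_difference(search_direction, c["direction"]) <= tolerance_deg
    ]


# Hypothetical file print characteristics.
file_minutiae = [
    {"id": 1, "direction": 10.0},
    {"id": 2, "direction": 95.0},
    {"id": 3, "direction": 355.0},
]
kept = restrict_candidates(350.0, file_minutiae, tolerance_deg=25.0)
print([c["id"] for c in kept])  # ids 1 and 3 survive
```

In practice the tolerance would have to allow for the rotational misorientation limits recorded with each latent mark.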

The benefits of including such a fifth coordinate may not justify the 25% increase in storage space that it would necessarily entail.

DERIVATION OF VECTORS FOR ROLLED PRINT COMPARISON.

The ability to perform topological reconstruction from a set of coordinates has some rather interesting 'by-products'. The first of these relates to the fast comparison of rolled prints on the basis of a single vector.

As the data format for a latent mark and a rolled impression is now identical, it would be possible to use the latent matching algorithm (LM6) to compare one rolled print with another. (One of the rolled prints would be acting as a very high quality latent.) However, to use the algorithm LM6 in this way on rolled prints would be 'taking a sledge hammer to crack a nut'. One single vector comparison deals with comparison of two rolled prints perfectly adequately, so it would not be useful to use this latent matching algorithm, with its hundreds of vector comparisons, in this application.

Nevertheless there is a significant benefit to be gained from the topological reconstruction section of the latent matching algorithm. The data-gathering requirements included the need to track along ridges, in order to find the first event that happened. Although that, in itself, is not a particularly demanding programming task, the ability to reconstruct topologies from coordinates renders it unnecessary. A topological code vector representing a horizontal line passing through the core of a loop can be lifted out of the continuity matrix after reconstruction. The left half of it (i.e. the part that falls to the left of the core) and the right half will be extracted separately. Each half is extracted by selecting the column of the continuity matrix that corresponds with an imaginary line just to the anticlockwise side of horizontal (i.e. just below for the left side, and just above for the right side). Amalgamating these two halves, reversing the 'up' and 'down' pairs from the right half, gives a single long vector of the required format.

There will be two minor differences between these extracted vectors and the design originals:—

(a) the core point, which was to be on a ridge, is replaced by the central observation point which is in a valley. The central observation point will, however, be only fractionally removed from the core in the case of loops and whorls.

(b) the vector has irregularity identifiers rather than ridge-traced distance measures. Consequently the vector comparison algorithm has to be adapted to refer to the appropriate coordinate sets when the time comes to apply the various distance tests.

In an operational system the maximum speed would be obtained by performing topological reconstruction, and vector extraction, at the time each print is introduced to the collection. The extracted 'long' vectors could be stored in a separate file so that they could be used for fast vector comparison without the need to perform topological reconstruction each time. That would obviously increase the data storage requirement per print by the 60 bytes required for such 'long' vectors. The coordinate sets, and topological reconstruction, would then only be used when a latent search was being conducted.

If the derived long vectors were to be made completely independent of the coordinate sets, it would be necessary to replace the irregularity identifiers with calculated linear distances at the time of vector extraction.*

* The performance of MATCH4 on such derived vectors has not been tested. This is because of the incredibly time consuming nature of manual encoding according to the latent scheme (up to 1 hour per print for clear rolled impressions). The time for such tests will be after the development of automatic data extraction techniques, when large numbers of prints can be encoded automatically according to the latent scheme, and then have derived vectors extracted after topological reconstruction.

IMAGE RETRIEVAL SYSTEM.

There is a significant demand for automated identification systems to be linked with an image-retrieval facility for all the prints in the file collection. The system operator obtains a list of the highest scoring candidates each time an automated search is conducted — these candidates have then to be checked visually by the fingerprint expert to determine which of them, if any, is the true mate. This visual checking can be done much more easily if the fingerprints can be displayed on a screen, rather than having to be fetched from a cupboard. Much research is currently underway with the aim of finding economical methods for storing the two dimensional pictures (fingerprints) in computer memory so that they can be called up and displayed on the terminal screen.

There are two distinct paths for such research. The first aims to record the original grey-scale data which is output from automatic scanners, with no interpretative algorithms ever being applied to the print (although data compaction techniques will, of course, be used). The second uses interpretative algorithms to identify the ridges and valleys within the grey-scale image, to resolve the picture into a binary (black and white) image, and then finally to reduce the thickness of each ridge to one pixel by a variety of ridge-thinning techniques. What is then stored is sufficient data to enable each thinned ridge segment to be redrawn (i.e. start position, end position, curvature etc.). The data requirements per print are in the order of 2,000 to 4,000 bytes for compressed grey-scale images, and between 1,000 and 2,000 bytes for a thinned image.

With the 4-coordinate system used in the latent scheme records, a complete topological and spatial description of the characteristics can be stored in between 300 and 400 bytes. It should therefore be possible to redraw the fingerprint, in the style of a thinned image, from that data. Firstly topological reconstruction has to be performed, and then the elastic (topological) image has to be 'pinned down' at each characteristic, by reference to their polar coordinate positions contained in the coordinate sets.
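The 'pinning down' step amounts to converting each characteristic's polar position (θ, D) into a drawing position. A minimal sketch follows; the angle convention (degrees, measured anticlockwise from the positive x axis) and the function name are assumptions, since the plotting convention is not stated here.

```python
import math


def pin_position(theta_deg, distance, origin=(0.0, 0.0)):
    """Convert a polar position relative to the central observation point
    into Cartesian drawing coordinates.

    Assumption: theta is in degrees, anticlockwise from the positive x axis.
    """
    theta = math.radians(theta_deg)
    x = origin[0] + distance * math.cos(theta)
    y = origin[1] + distance * math.sin(theta)
    return x, y


x, y = pin_position(90.0, 10.0)
print(round(x, 6), round(y, 6))
```

A clockwise sweep or a different zero direction would only change the sign or offset applied to `theta_deg`.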

The substantial problem in such a process is the business of generating a smooth ridge pattern that accommodates all the pinned points. The problems raised are similar to those in cartography, when a smooth contour map has to be drawn from a finite grid of discrete height (or depth) samplings. One fairly crude reconstruction algorithm was written simply because generation of a picture from topological coordinate sets provides a most satisfying demonstration of the sufficiency of such coordinate descriptions. The algorithm PLOT1 (Appendix C) was written as a Fortran programme; its input was the set of coordinates representing a specified print, and its output was a file of 'QMS-QUIC' instructions for the graphics display facility of a QMS LASERGRAFIX 1200 printer. The algorithm first performed topological reconstruction in the normal manner, and then assigned polar coordinates to every ridge intersection point in such a manner that all the topological irregularities were assigned their own (real) polar coordinates. A series of simple linear smoothing operations is then applied, coupled with untangling and gap-filling procedures that make successive small adjustments to the radial distances of all the intersection points that are not irregularities. These processes continue until a certain standard of smoothness is attained. Finally the picture is output as a collection of straight line segments between connected ridge intersection points.
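The idea of smoothing radial distances while leaving the irregularities fixed can be shown with a much simplified sketch. This is not PLOT1: it treats a single ridge as a list of radial distances at successive scan angles, uses a three-point average as the linear smoothing operation, and omits the untangling and gap-filling procedures entirely. All names are hypothetical.

```python
def smooth_radii(radii, pinned, passes=1):
    """Smooth the radial distances along one ridge.

    radii  -- radial distance of the ridge at successive intersection points
    pinned -- set of indices that are irregularities and must not move
    passes -- number of smoothing passes to apply
    """
    current = list(radii)
    for _ in range(passes):
        updated = current[:]
        # End points are left alone; interior points move toward the
        # average of themselves and their two neighbours.
        for i in range(1, len(current) - 1):
            if i not in pinned:
                updated[i] = (current[i - 1] + current[i] + current[i + 1]) / 3.0
        current = updated
    return current


print(smooth_radii([10, 16, 10], pinned=set()))  # the middle point is pulled to 12.0
print(smooth_radii([10, 16, 10], pinned={1}))    # a pinned irregularity stays at 16
```

A real implementation would repeat such passes until a smoothness criterion is met and would also keep neighbouring ridges from crossing.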

A sample reconstructed fingerprint image is shown in figure 13, together with its descriptive data. The picture is made up of 4,404 straight line segments. The topology is correct, and each irregularity is properly located; however, the intervening ridge paths have suffered some unfortunate spatial distortions. For the sake of comparison, the original print tracing from which the coordinate sets were derived is shown in figure 14 (it has been reduced from 10x to 5x magnification). Detailed comparison of figures 13 and 14 will reveal a few places where the topology appears to have been altered. In fact it has not been altered, but, at this magnification, some ridges appear to have touched when they should not. This tends to occur where the ridge flow direction is close to radial. In such places the untangling subroutine, which moves ridges apart when they get too close together, has not been forceful enough in separating them.

Figures 15 and 16 show the tracing of a latent mark, together with its reconstructed picture. In this case the latent data comprised 32 coordinate sets (filling approximately 100 bytes), of which 21 make up the edge-description. There are ten genuine characteristics shown, and the remaining topological irregularity is the ridge recurve close to the core. The reconstructed image is made up from 780 straight line segments.

The facility for reconstruction also affords the opportunity to actually see a 'default edge-topology'. Figures 17 and 18 show two further reconstructed images of the print in figure 14. The upper picture is the same as figure 13, except for a reduction in magnification (to 2.5x). The lower picture is a reconstruction from the condensed data set for the same print, after all the coordinate sets relating to ridges going 'out of sight' have been deleted. All the loose ends have been tied up by the reconstruction algorithm in a fairly arbitrary, but interesting, way. The lower picture does, of course, show some false ridge structure in areas that were 'out of sight'. However the data storage requirement for the corresponding coordinate sets was only 354 bytes for the edge-free description, as opposed to 526 bytes for the original description.

From these figures it is fairly clear that more sophisticated smoothing techniques need to be applied before really reliable images can be retrieved. These pictures are quite sufficient nevertheless to demonstrate the potential for such a scheme. They are also a fine demonstration of the effectiveness and accuracy of the topological reconstruction algorithms.*

A topological approach to fingerprint coding according to the invention offers a great deal in terms of improved accuracy and cost-effectiveness. It is also clear that topology based matching algorithms are greatly improved by utilizing some spatial information. The power of resolution between mates and non-mates given by the combination of topological and spatial information is vastly superior to that which can be obtained by use of spatial information alone.

* It should be remembered that the path of the ridges plays no part in the comparison algorithms LM5 and LM6; only the topology, and the positions of the characteristics are used. The defects in these pictures are not, therefore, a reflection of defects in the latent searching algorithms.

The greatest benefit that has been obtained is accuracy. With rolled impressions there is also a clear increase in speed and a massive reduction in storage requirement. With the latent searching scheme the question of speed has to be left open until the benefits of LM6's extensive parallelism have been realized.

What is claimed is:—

INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT)

(51) International Patent Classification 4: G06K 9/00

(11) International Publication Number: WO 87/012 A1

(43) International Publication Date: 26 February 1987

(21) International Application Number: PCT/US86/01653

(22) International Filing Date: 13 August 1986 (13.08.86)

(31) Priority Application Numbers: 766,331; 771,188; 875,023

(32) Priority Dates: 16 August 1985 (16.08.85); 3 September 1985 (03.09.85); 16 June 1986 (16.06.86)

(33) Priority Country: US

(71) Applicant: ZEGEER, Jim [US/US]; 801 North Pitt Street, Alexandria, VA 22314 (US).

(71)(72) Applicant and Inventor: SPARROW, Malcolm, K. [GB/GB]; 127 Linton Road, Loose, Maidstone, Kent ME15 0AL (GB).

(74) Common Representative: ZEGEER, Jim; Suite 108, 8 North Pitt Street, Alexandria, VA 22314 (US).

(81) Designated States: AT (European patent), AU, BE (European patent), BR, CH (European patent), DE (European patent), FR (European patent), GB (European patent), IT (European patent), JP, KR, LU (European patent), NL (European patent), SE (European patent).

Published: With international search report.

(54) Title: FINGERPRINT RECOGNITION AND RETRIEVAL SYSTEM



FINGERPRINT RECOGNITION AND RETRIEVAL SYSTEM

Background and Brief Description of the Invention

The invention relates to fingerprint coding and recognition and retrieval systems based on generally invarying topological irregularities, characteristics or minutiae (which terms are used interchangeably herein) of fingerprints. The term "fingerprint" or "print" is used in reference to the epidermal ridge lines of the ten fingers of the human hand, palm prints, toe and sole prints of humans wherein such epidermal ridge lines and characteristic features thereof are in patterns unique to a particular individual.

In my paper entitled "Digital Coding of Single Fingerprints — A New Approach for the Computer Age", Journal of Police Science and Administration, Vol.X. No.2, June 1982, I show that the soft elastic nature of human skin causes substantial variation of the spatial descriptions of successive impressions of the same fingerprint. Consequently, spatially based coding schemes used for forming machine searchable databases have inherent inaccuracies due to the fact that the spatial based coordinate system typically used for coding purposes could not take into account the wide variations in spatial distortions, making the match or identification between two rolled prints of the same finger somewhat problematical, particularly where the prints are taken at substantially different times or pressures.

Topological coding schemes provide concise digital codes that provide a more economical and more reliable basis for ten print identification systems. In my above referred to paper, I suggest comparison methods based on topological coding of prints in which a topology base coding system for recording and comparing minutiae used vector arrays generated from topologically based coding of fingerprints.

According to this invention, each fingerprint is scanned by a scanning system which typically includes a scanning 'line' which sweeps in a predetermined manner, such as horizontally, vertically or radially, from a prescribed origin for the scanning system utilized. When the scanning line moves over an irregularity (such as a ridge ending, bifurcation, etc.), the irregularity is recorded by the use of at least three coordinates: a type code (T) to particularly identify the irregularity, a measure (M) of the scanning line position when it hits the irregularity, and a ridge count (R) which is the number of ridges intersecting the scanning line, at that position, between the irregularity and a prescribed point on, or origin for, the scanning line. A collection of coordinate sets (T, M, R) uniquely specifies the topology of a fingerprint or any part thereof.

Thus, the present invention provides a system for recording a complete topological description of a fingerprint subject to the constraint that each characteristic be, in a given database, recorded once and only once. To form a library or database of topological coordinate sets for search purposes, rolled or file prints, or so-called ten-print cards, are utilized and a central point on the fingerprint (such as a core) is selected as a center of a rotating ridge scan line. When the scan line is a rotating ridge scan line, the rotating ridge scan line, which preferably has a center of rotation or origin which is just off of any ridge, is relatively rotated in a predetermined scan direction, clockwise, for example, to different topological characteristics (sometimes called irregularities or minutiae) of the fingerprint for a plurality of ridge lines. A hexadecimal digital code representing the type (T) of irregularity (ridge-ending, bifurcation, etc.) and the angular coordinate (θ) (which corresponds to the measure (M) of scanning line position) of the irregularity is recorded. In this case, the angular coordinate (θ) is sufficient to specify the order in which the irregularities are passed over by the sweeping or rotating ridge scan line. The ridge count (R) between the characteristic or the irregularity and the central observation point specifies the ridge on which the irregularity occurs. Thus, a list of coordinate sets of the form (T, θ, R) specifies the topology of any sector uniquely. A fourth coordinate is added to the coordinate set to correspond to the radial distance (D) measured from the central observation point. D and θ then specify the positions of the characteristics in space and the full coordinate set (T, θ, R, D) gives a complete topological and spatial description of a fingerprint which only requires 4 bytes per irregularity.
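The coordinate set (T, θ, R, D) can be pictured as a small record. The sketch below is illustrative only: the class and function names, the integer units, and the sample type-code values are assumptions, and the actual byte layout of the 4-byte record is not specified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoordinateSet:
    type_code: int    # T: hexadecimal type code, 0x0 to 0xF
    theta: int        # theta: angular position of the scan line (units assumed)
    ridge_count: int  # R: ridges between the observation point and the irregularity
    distance: int     # D: radial distance from the central observation point


def sweep_order(coordinate_sets):
    """Order irregularities as the rotating scan line would pass over them."""
    return sorted(coordinate_sets, key=lambda c: c.theta)


def storage_bytes(coordinate_sets, bytes_per_set=4):
    """Storage estimate using the 4 bytes per irregularity quoted in the text."""
    return bytes_per_set * len(coordinate_sets)


# Hypothetical print fragment; the type-code values are arbitrary.
fragment = [
    CoordinateSet(0x3, 210, 4, 37),
    CoordinateSet(0x1, 45, 2, 18),
    CoordinateSet(0x3, 120, 7, 52),
]
print([c.theta for c in sweep_order(fragment)])
print(storage_bytes(fragment))
```

Sorting on θ alone reflects the statement that the angular coordinate is sufficient to fix the order in which the scan line meets the irregularities.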

Prints such as latent prints found at the scene of a crime (SOC) are coded according to the same topological coordinate scheme. Computerised searching of such a latent mark against a large established collection of file prints is then performed through reconstruction of the topology local to each characteristic followed by comparison of such localized topology.

Topological vector extraction is based in part on the system disclosed in my above referred to paper. The core of the fingerprint is centrally located at a reference point and a horizontal line is projected through the core to intersect ridge lines to each side of the centrally located reference point. In the case of an arch a vertical line is drawn through successive ridge summits. From the points of crossing of the ridges with the projected horizontal or vertical line, the ridges are traced to the first significant irregularity and a type code (T) is assigned to the irregularity together with the distance (D) from the reference line, and these data are recorded in a predetermined order to constitute a topological vector for the print which then is recorded in a machine searchable database. Comparison of vectors takes the form of a sequence of array operations. Comparison of good quality rolled prints is performed extremely rapidly on this basis.

Brief Description of the Drawings

The above and other objects, advantages and features of the invention will be more apparent when considered with the following specification and accompanying drawings wherein:

Figure 2(a) is a schematic block diagram of a second form of topological coordinate extractor (which is semi-automatic) incorporating the invention; Figure 2(b) is a block diagram of an extractor of a second form of topological coordinates, where a different scanning pattern is being used,

Figure 4 is a further schematic block diagram of a fingerprint recognition and retrieval system incorporating the invention, with a remote enquiry station.

Figure 5a illustrates a device for manually reading topological coordinate sets and Figure 5b illustrates the device of Figure 5a in use with a print therein,

Figure 6 shows the line placing for vector extraction from a plain arch.

Figure 7 illustrates the ridge exploration event codes for use with vector extraction,

Figure 8 shows the horizontal line placed on the ulnar loop on a ridge tracing for use with vector extraction.

Figure 9 illustrates the 82 digit vector generated or extracted from the ridge tracing of Figure 8.

Figure 10a is a pair of sample latent marks (approximately 5x) and Figure 10b is an example of a latent mark (left) and its matching file print (right); the numbers refer to corresponding features.

Figure 11 generally illustrates the sweeping or scanning line coding scheme,

Figure 12(a) shows the radial irregularity centered lines with the "cut" vertically below the observation point, Figure 12(b) shows a horizontal scan line which is moved relatively vertically over the print, Figure 12(c) shows a vertical scan line which is moved relatively horizontally over the print,

Figure 14 is a copy of the original fingerprint tracing corresponding to the print used in Fig. 13,

Detailed Description of the Invention

Figure 1(a) is a functional block diagram of a fingerprint recognition and retrieval system. File prints (which are rolled prints from a fingerprint card base, such as national, regional or local fingerprint files) are fed in stacks to a high speed card handler 10 which may incorporate a card identification number reader 11, which may be an optical character reading device for reading the card identification number as well as reading such number or numbers printed on the card used to identify the subject or person whose fingerprints appear on the card. This card number can later be used by the central computer to associate the fingerprint data with the descriptive data (name, address, type of offense, etc. or other data related to the subject) that would be used in limiting the field of search.

The fingerprints on the cards are passed through a scanner 12 which senses each of the ten fingerprints on the card and outputs a gray-scale point matrix. Scanner 12 can be one of many types such as a "flying spot scanner" or a solid-state scanner or camera. It examines a series of small areas (pixels) of the fingerprint in turn and, as it encounters white (uninked), black (inked) or gray (partially inked) areas, it produces a signal representing the blackness of each pixel. Thus, an array of such signals is formed representing a series of discrete samples covering the whole print area. In the art, this array is referred to as the "gray scale" image and each of the ten prints on a card is scanned in turn. The output from the gray scale scanner 12 is supplied to an image enhancer 14 which, likewise, is conventional in the art. Image enhancer 14 receives gray scale scanner output and forms it into a binary enhanced (black/white) image. In doing so, it compensates for variations in ink density over various portions of the print. The image enhancement process locates ridges and valleys and forms a binary image. Systems for determining whether each pixel is on a ridge (black) or in a valley (white), by reference to an examination of the apparent ridge flow direction in its vicinity and location of those apparent ridges, are well known in the art. In addition to formation of a binary black/white image, the image enhancement processor 14 also determines which parts of the print cannot be interpreted as ridge/valley structure (i.e. they are "unclear") and which parts display a corrupted (scarred) structure. In addition, it records and outputs the locations and extents of such areas. It can also output ridge direction data which shows the approximate direction of the general ridge flow at each point.
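As a rough illustration of turning a gray-scale array into a binary ridge/valley image while compensating for uneven inking, the sketch below thresholds each pixel against the mean of its local neighbourhood. This is a deliberately simple stand-in, not the enhancement method referred to above (which relies on ridge flow direction); the function name, the window size and the convention that higher values mean more ink are assumptions.

```python
def binarize(gray, window=1):
    """Return a 0/1 image: 1 where a pixel is darker (more inked) than the
    mean of its local neighbourhood, 0 otherwise.

    gray   -- 2D list of blackness values (assumed: higher means more ink)
    window -- neighbourhood half-width in pixels
    """
    rows, cols = len(gray), len(gray[0])
    binary = []
    for y in range(rows):
        out_row = []
        for x in range(cols):
            # Clip the neighbourhood at the image border.
            ys = range(max(0, y - window), min(rows, y + window + 1))
            xs = range(max(0, x - window), min(cols, x + window + 1))
            values = [gray[j][i] for j in ys for i in xs]
            local_mean = sum(values) / len(values)
            out_row.append(1 if gray[y][x] > local_mean else 0)
        binary.append(out_row)
    return binary


# Two vertical ridges separated by a valley, with slightly uneven inking.
sample = [
    [200, 50, 200],
    [220, 60, 210],
    [210, 40, 190],
]
print(binarize(sample))
```

Because each pixel is compared with its own surroundings rather than a global level, a lightly inked region is still separated into ridges and valleys.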

The output from the image enhancer 14 is supplied to a topological coordinate extractor 16 and a vector extractor 17. The topological coordinate extractor 16 determines from the ridge flow data (using existing techniques such as are disclosed in U.S. Patents 3,560,928 and 4,156,230) whether or not a central core exists in each pattern and will locate a position for a central observation point either close to the core (if there is one) or beneath the first upcurving ridge (in the case of an arch, which is the only pattern that does not have at least one core). If there is more than one core, then the one facing upwards will be selected. Having determined such a central observation point, it will then generate a set of coordinates of the form (T, M, R, D) for each irregularity in the binary image output from the image enhancement processor 14. Locations where ridges run into, or out of, "unclear" or "scarred" areas will be similarly recorded. The coordinate sets are as described later hereinafter. Figures 2(a), 2(b), 3, 4, 5, 12(a), 12(b) and 12(c) illustrate manually operated topological coordinate and vector extractors and will be described in detail hereafter.

The vector extractor 17 generates a topological code vector, of a length 62 to 82 digits (as described more fully hereafter), together with associated distance measures. The imaginary generating lines are placed as shown in Figure 8 if the pattern has a central core and as shown in Figure 6 if it does not. The presence or absence of such a core is determined from the ridge direction data. A manually operated vector extraction system is disclosed in Figure 3 and will be described more fully hereafter. The data from the topological coordinate extractor and/or the vector extractor are supplied to a general purpose digital computer 20 which stores the topological coordinate sets extracted by topological coordinate extractor 16 in a mass storage system such as a disc storage 21. The topological code vectors for each print extracted by vector extractor 17 are likewise supplied to the general purpose computer which stores this data in a further mass storage medium such as disc storage unit 22. These data storage devices hold the coordinate sets and extracted vectors in association with the card identifying numbers so that when being searched and a match is made, the card identifying number is associated with the topological coordinate set in storage unit 22.

One or more ten print search inquiry terminals 25 are provided so that an operator can access the central computer 20 and instruct it as to the extent and nature of a search required (i.e. restriction by reference to other descriptive data such as offense type, age, sex, race, or geographical data which may be pertinent to the inquiry). The terminal 25 incorporates a fine graphic display facility sufficient to show the operator any fingerprint reconstructions which the operator may request and which will be outputted from the image retrieval processor as described more fully hereafter.

Latent marks such as disclosed in Figures 10a and 10b are first enlarged and a manual tracing is made at block 26. This unit provides the manual tracings for the latent inquiry terminal 29 which enables the latent examiner to enter the coordinate sets he has read manually and to initiate a search of the database 22. It incorporates a fine graphic display facility sufficient to show the operator any fingerprint reconstruction, such as shown in Figures 17 and 18, which the operator may request and which will be the output from the image retrieval processor.

The general purpose computer 20 handles the receipt and storage of all incoming data, administration of the databases, and performs searches of the databases either by passing coordinate sets to the latent matcher system 30 or by passing the extracted vectors to the vector matcher 31. In the case of ten-print inquiries, computer 20 will determine how many of the ten-prints available on the search card should be used (according to the priority or importance of the search) and combines the separate finger scores outputs from the vector matcher to give an overall score for each candidate file print. This computer 20 also displays results in the form of a list of top rank candidates from the database to the inquiry terminals 25 or 28 upon completion of a search. If a request is made for a fingerprint image reconstruction, the appropriate coordinate set is read from the data file and passed to the image-retrieval processor 30. The image-retrieval processor can be a parallel implementation of the image reconstruction program (PLOT1) set forth in appendix C attached hereto. Its input will be a coordinate set which is passed to it by the central computer 20 when the request is made from an inquiry terminal 25 or 29. The output is a line segment picture (figures 17 and 18) representing a reconstructed image of the fingerprint in question and it is displayed by the graphic facility of the terminal where the request is made. This program contains a sub-routine called "continuity" which effects the reconstruction of the topology of the print; it also contains three sub-routines called "smooth", "untangle" and "gap-fill" which perform linear smoothing operations by adjusting the paths of the ridges between characteristics. Other smoothing algorithms (such as the use of spline interpolation techniques) may be substituted for these three sub-routines.

The latent matcher 30 performs comparisons of two sets of coordinates sent to it by the central computer 20. It is preferably a parallel implementation of the latent matcher 30 which is set forth in appendix A. It returns a score to the central computer 20 which is a percentage in the range of 0 to 100 reflecting the similarity of the two prints represented by the two coordinate sets. The vector matching unit 31 compares vectors as directed by central computer 20 when a ten print search is underway and it uses the program attached hereto and identified as appendix B. The score returned is a percentage (a real number in the range 0 to 100) which reflects the similarity of the two prints represented by the two vectors.

IMAGE RETRIEVAL

In Fig. 1b, the gray scale point matrix from scanner 13 is fed via one side of selection switch SSW to compression processor 20CP and then stored as a computer compressed image in optical disk storage unit ODS to form a library of compressed print images. When it is desired to display a particular image, the image for any given print is retrieved from storage unit ODS for display at remote terminal 25. Instead of compressing a gray scale image, selection switch SSW can connect a skeletonized image to compression processor 20CP which is then stored in optical disk storage unit ODS. Image retrieval requires a decompression by decompression processor 20DP and the resulting image is displayed on the display at remote or local terminal 25. Finally, as will be described in detail later herein, topological reconstruction of the image for display can be performed and this requires significantly less storage space to produce fingerprint images at remote terminals with relatively small amounts of data transmission.

There follow detailed descriptions of the major sections of the invention, namely:

(2) Details of the algorithm MATCH4 (Appendix B) which is a series of array operations for comparing such vectors.

(4) Details of the process of topological reconstruction from such coordinate sets, and the latent matching algorithm LM6.

The coding is in two parts: 1. Coding the topology. 2. Measuring the associated distance.

1. CODING THE TOPOLOGY.

(a) LOOPS: By looking at the whole available print, and with particular reference to the first flexion crease and the directions of ridges which run close to it, estimate a 'horizontal' orientation for a straight line. ('Horizontal' means parallel to the apparent direction of the flexion crease.) Place a horizontal line through the loop core-centre, using the conventional rules for precise location of the core-point. (See figure 8.)

6. The line starts at the lowest visible ridge above the flexion crease and follows the 'summit' route to the top of the available picture.

(c) WHORLS & OTHER TYPES: Locate a 'core' using an adaptation of the rules for loops, and place a horizontal line as for a loop.

Each point of intersection gives two 'directions' for topological exploration of that ridge: imagining oneself (just for a moment) to be a tiny insect capable of walking along a ridge, one could walk each ridge in each of two directions from the point of intersection. We stipulate that the walking (or exploration) will cease as soon as one of a number of specific 'events' is found.

Figure 7 shows the digital codes selected to correspond to possible ridge-exploration events. In each case the ridge being explored is marked with an arrow to show the direction of the exploration. The digital codes take the form of hexadecimal integers, and are always processed as such. Storage space required for each one is therefore only 4 bits, making it possible to compress one pair of digits into one byte. Not all 16 hex-digits are used; 1, 9, D and E being 'spare'. 'F' is used for padding the vectors up to a certain length for storage in a standardized data format.
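
By way of illustration only, the packing of one digit pair into one byte might be sketched as follows. The function names and the sample digit string are hypothetical and are not taken from the appendices; the sketch assumes the vector is held as a string of hexadecimal characters.

```python
def pack_pairs(digits: str) -> bytes:
    """Pack hexadecimal event codes two to a byte (4 bits per code)."""
    if len(digits) % 2:
        raise ValueError("expected an even number of hex digits")
    return bytes.fromhex(digits)


def unpack_pairs(packed: bytes) -> str:
    """Recover the original string of hexadecimal event codes."""
    return packed.hex().upper()


# Hypothetical six-digit fragment (three digit pairs), not a real print.
fragment = "23067C"
packed = pack_pairs(fragment)
restored = unpack_pairs(packed)
```

Under this layout a full 82 digit vector would occupy 41 bytes.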

TOPOLOGICAL VECTOR EXTRACTOR (FIGS. 3, 6, 8 AND 9)

The topological vector extractor shown in Fig. 3 includes terminal 200 having a display 201 and a control console or 'mouse' 202 connected to terminal 200 by cable 202C. A fingerprint (or portion thereof) 203 is shown having a horizontal line 205 positionable by a thumb wheel control 206 to pass through a central portion CP of the print 203 and cross ridges in the print to the left and right of the central portion CP. At each point where the line 205 intersects or crosses a ridge, the ridge is tracked in both directions to where an irregularity is encountered. A track ball 207 is used to locate a cursor on the origin CP, and the cursor is moved from the line 205 along each ridge by the track ball 202. A start distance measure switch 208S is operated to indicate the start of distance measurement and a stop distance measure switch 208SP is operated to denote the positioning of the cursor on the irregularity. Where an irregularity is encountered, the measure of distance is entered as the distance of movement of the cursor from line 205 to the irregularity. Pushbutton array 209 is used to enter a code for the type (T) of irregularity encountered. The vector data can be outputted from terminal 200 by data coupler 200C to general purpose computer 20. Figure 8 shows the tracing of an ulnar loop generated by exploration from a horizontal line through the core. Points of intersection are shown numbered outwards from the core, and characteristics accessed are highlighted with a small 'blob'.

A standard length for digital vectors was set at 82 digits (that is, 41 pairs), of which 20 pairs represent up to 20 ridges on the left hand side of the core, one pair represents the ridge on which the core itself is located, and the other twenty pairs represent up to 20 ridges intersected on the right of the core. Whenever fewer than twenty ridges are intersected on the left or the right hand side of the core (which is usually the case) the 82 digit code is padded with 'F's, as mentioned above, to bring it up to the standard length. The padding is done at the extreme ends of the vector in such a way that the digit pair representing the core-ridge remains in the central position (i.e. the 21st digit pair).
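
The padding convention can be sketched as below, purely as an illustration. The function name and the sample digit pairs are hypothetical, and it is assumed that the left-hand pairs are supplied in left-to-right order (outermost ridge first).

```python
PAD_PAIR = "FF"      # padding pair, as described above
PAIRS_PER_SIDE = 20  # up to 20 ridges on each side of the core


def pad_vector(left_pairs, core_pair, right_pairs):
    """Build an 82 digit vector with the core pair as the 21st pair."""
    if len(left_pairs) > PAIRS_PER_SIDE or len(right_pairs) > PAIRS_PER_SIDE:
        raise ValueError("at most 20 ridges may be coded on either side")
    pairs = (
        [PAD_PAIR] * (PAIRS_PER_SIDE - len(left_pairs))   # pad extreme left
        + list(left_pairs)
        + [core_pair]
        + list(right_pairs)
        + [PAD_PAIR] * (PAIRS_PER_SIDE - len(right_pairs))  # pad extreme right
    )
    return "".join(pairs)


# Hypothetical print with two ridges left of the core and one to the right.
vector = pad_vector(["23", "60"], "72", ["34"])
```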

The convention is established that the digit representing exploration along a ridge upwards from the line is to be written first (of the pair), and the digit representing exploration downwards along the same ridge is written second. Adhering to that convention, the 82 digit vector generated from the tracing referred to above (figure 8) is shown in figure 9. To facilitate interpretation, the intersection point numbers (from figure 8) are shown also, with their corresponding digit pairs. (These intersection point numbers are not normally recorded, and they form no part of the topological code.) Digit pairs are juxtaposed, and each pair separated from the next. It is important to remember that each digit pair is just that: a pair of digits; they should never be interpreted together as being one number.

2. MEASURING THE DISTANCE.

The distance is measured (on a 10x enlargement) in centimeters, and is then rounded down to the nearest integer, and an upper bound of 15 imposed. On the actual print, therefore, the distance measures represent the distance, measured along ridges, from generating line to ridge-event, rounded down to the nearest millimetre. Thus the only possible distance measures are the integers 0, 1, 2, ..., 15. If the ridge-event codes are any of the set 0, A or B then the corresponding distance measures are set to a default value of 15. These codes 0 ('out of sight'), A ('scarred tissue') and B ('unclear') cannot really have meaningful distance measures associated with them: all the other event codes can.
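
A minimal sketch of this rounding rule follows, for illustration only. The function and constant names are hypothetical; the input is assumed to be the ridge-traced length in centimetres as read from the 10x enlargement.

```python
import math

NO_DISTANCE_CODES = {"0", "A", "B"}  # out of sight, scarred, unclear
MAX_DISTANCE = 15                    # imposed upper bound


def distance_measure(event_code: str, traced_cm: float) -> int:
    """Return the integer distance measure stored alongside an event code."""
    if event_code in NO_DISTANCE_CODES:
        return MAX_DISTANCE          # default value, no meaningful distance
    return min(MAX_DISTANCE, math.floor(traced_cm))
```

For example, a bifurcation-type event traced 4.7 cm from the generating line on the enlargement would be stored with a distance measure of 4, corresponding to 4 mm on the actual print.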

Although this description has specified that distances should be 'ridge-traced' it is not absolutely necessary, although desirable. The linear distance from each ridge-intersection point to each ridge-exploration event could be used instead. Similarly, the perpendicular distance from each ridge-exploration event to the generating line could be used instead.

The above description also suggests an array length of 82. System design constraints may determine some other length, but greatest accuracy is achieved when the length exceeds 62.

THE VECTOR COMPARISON ALGORITHM MATCH4.

There are ten distinct phases to this algorithm; two are preliminary and eight form the actual comparison process. Each will be described in turn.

PRELIMINARY STAGE 1 --- FILESET ANALYSIS.

P(j, k, l):

j = 0, ..., 15; j represents one of the hexadecimal 'event' codes.

k = 1, ..., 9; k is the ridge-band number (numbered from left to right).

l = 1, 2; l shows one of two 'directions' (l = 1 for 'upwards', i.e. the first digit of a pair; l = 2 for 'downwards', i.e. the second digit of a pair).

The combination of any value of k with a value of l specifies one of 18 possible 'ridge areas'. P(j, k, l) is the proportion of codes in the (k, l) ridge area that had the value j, so that

$$\sum_{j=0}^{15} P(j, k, l) = 1.0 \quad \text{for any fixed } (k, l).$$
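
As an illustration only, a frequency matrix of this kind might be tallied as sketched below. The text does not state how the 41 digit pairs are divided among the ridge bands, so the equal-width split in `ridge_area` is an assumption, as is the choice to count every digit (including padding). All names and the sample vectors are hypothetical.

```python
from collections import Counter


def ridge_area(s, n_digits, n_bands=9):
    """Map a 1-based digit position s to (k, l); equal-width bands assumed."""
    pair_index = (s - 1) // 2
    k = pair_index * n_bands // (n_digits // 2) + 1
    l = 1 if s % 2 == 1 else 2   # first digit of a pair is 'upwards'
    return k, l


def fileset_analysis(vectors, n_bands=9):
    """Return P as a dict keyed by (j, k, l) holding proportions."""
    n_digits = len(vectors[0])
    counts = {}
    for vector in vectors:
        for s, digit in enumerate(vector, start=1):
            area = ridge_area(s, n_digits, n_bands)
            counts.setdefault(area, Counter())[int(digit, 16)] += 1
    P = {}
    for (k, l), counter in counts.items():
        total = sum(counter.values())
        for j in range(16):
            P[(j, k, l)] = counter[j] / total
    return P


# Two hypothetical 4 digit vectors split into 2 bands, for brevity.
P = fileset_analysis(["2360", "2370"], n_bands=2)
```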

PRELIMINARY STAGE 2 --- SETTING UP THE SCORE-REFERENCE MATRIX.

From the three dimensional frequency matrix P, a four dimensional Score-reference matrix, S, is constructed. S is to be regarded as a 'look-up table' of initial scores to be awarded during the vector comparison process.

A score S(i, j, k, l) will be awarded initially when code i appears in the search vector opposite code j in the file vector, in corresponding (digit) positions which fall in the (k, l) ridge area. Again, this stage is not concerned with distance measures. That score S(i, j, k, l) is an indication of the value of such a coincidence in indicating that the search and file vectors under comparison are matched. It could also be regarded as a measure of the unlikelihood of that coincidence occurring by chance had the file vector been selected completely at random from the population of 'all fingerprints'.

The calculation of the matrix S is done according to these rules:

(a) For each i, j, k, l such that i = j and i, j ∈ {0, 2, 3, 4, 6, 7, 8, C}:

$$S(i, j, k, l) = \min\left(\mathrm{BOUND},\ \mathrm{INT}\left[10 \times -\log_{10} P(j, k, l)\right]\right)$$

where INT[...] means the integer part of [...] and BOUND is another parameter: it is an imposed upper bound on the values taken by elements of the matrix S. The factor 10 appears to avoid all the exact match scores being either 0 or 1. The inclusion of this factor gives a reasonable spread of exact match scores, based on code frequencies, despite the integer rounding. Typically these scores range from 1 to 15 or so.

These elements of S are the 'exact match' scores.

(b) For all i, j, k, l such that at least one of i and j is either 10, 11 or 12 (i.e. hexadecimal A, B or C), except for the case i = j = 12, then S(i, j, k, l) = 0.0

(c) For all i, j, k, l such that the unordered pair (i, j) belongs to the set of unordered pairs {(2, 3), (3, 4), (6, 7), (7, 8)}, S(i, j, k, l) is set to the 'close match' score.

(d) For all i,j,k, l not covered by one of the rules (a), (b) or (c) above:

S(i, j, k, l)= -1
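
Rules (a) to (d) can be sketched for a single ridge area as follows. This is an illustration only and is not the program of appendix B. The function name, the default parameter values, and the treatment of a code that never occurred in the fileset (scored at BOUND) are assumptions.

```python
import math

EXACT_CODES = {0, 2, 3, 4, 6, 7, 8, 12}   # rule (a); 12 is hexadecimal C
NEUTRAL_CODES = {10, 11, 12}              # rule (b); hexadecimal A, B, C
CLOSE_PAIRS = {frozenset(p) for p in [(2, 3), (3, 4), (6, 7), (7, 8)]}


def score_table(p, bound=15, close_score=2):
    """Build the 16 x 16 table S(i, j) for one (k, l) ridge area.

    p[j] is the proportion of code j in that ridge area.
    """
    table = [[-1] * 16 for _ in range(16)]        # rule (d) default
    for i in range(16):
        for j in range(16):
            if i == j and i in EXACT_CODES:       # rule (a)
                if p[j] > 0:
                    table[i][j] = min(bound, int(10 * -math.log10(p[j])))
                else:
                    table[i][j] = bound           # assumption: unseen code
            elif i in NEUTRAL_CODES or j in NEUTRAL_CODES:
                table[i][j] = 0                   # rule (b)
            elif frozenset((i, j)) in CLOSE_PAIRS:
                table[i][j] = close_score         # rule (c)
    return table


# Hypothetical frequencies for one ridge area (remaining codes omitted).
p = [0.0] * 16
p[0], p[2], p[3], p[12] = 0.001, 0.5, 0.05, 0.01
table = score_table(p)
```

With these figures a code occurring in 5% of cases scores INT[10 x 1.301] = 13 for an exact match, while one occurring in half of all cases scores only 3.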

The matrix S (when there are 9 ridge bands) could be regarded as 18 different comparison tables each one of which might typically appear as shown below. (Here the close match scores have been set to 2 and an upper bound of 15 applied. Also the exact match scores have been rounded to the nearest integer for ease of presentation.)

Table of S(i, j, k, l) for a fixed (k, l) with upper bound 15 and close match scores 2.

COMPARISON STAGE 1 --- FORMATION OF FILE AND SEARCH MATRICES.

The vector comparison process itself begins with a file array (B(i): i = 1,82), a search array (A(i): i = 1,82) and the established score reference matrix S.

An important parameter not yet introduced is "MAXSHIFT". MAXSHIFT is the maximum number of ridge shifts (either to left or right) that is to be anticipated by the comparison algorithm. Such shifts are likely to have occurred as a result of distortion caused by core misplacement, appearance or disappearance of subsidiary ridges and line placement errors.

Let us suppose that up to 5 ridge shifts should be anticipated (i.e. MAXSHIFT=5) . Then comparison of array A with array B will need to allow for relative shifting by up to five digit-pairs. This is accomplished by use of standard array processing techniques as follows:

(a) The topological vector portion of the search array A is used to construct the search matrix "C". C will have 82 columns and the number of rows will be given by [(2 x MAXSHIFT) + 1]. Each row will be a copy of the topological part of the array A, but the copy will be progressively shifted to the left or right by from 0 to MAXSHIFT digit pairs. The central row will be an exact copy of A. The top (first) row will show A shifted 5 digit pairs to the left; the second row...4 digit pairs to the left; the bottom row...5 digit pairs to the right. Some digits of A may be 'lost' off the ends of some of the rows, and gaps caused by the shifting are padded with pairs of 'F's.

(b) The file array B is used to create a file matrix, D, of identical dimensions to C. It is formed by faithful duplication of the topological part of the array B, without shifting, the appropriate number of times. Every row of D is an exact copy of the vector from B. No padding is needed, and no digits are lost from row ends.
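
The construction of C and D can be sketched as below, for illustration only. Vectors are held here as lists of digit pairs so that a shift by one pair is a shift by one list element; the function names and the short sample vectors are hypothetical.

```python
PAD_PAIR = "FF"


def search_matrix(a_pairs, maxshift):
    """Rows are copies of A shifted from maxshift pairs left to maxshift right."""
    n = len(a_pairs)
    rows = []
    for shift in range(-maxshift, maxshift + 1):   # negative means leftwards
        if shift < 0:
            row = list(a_pairs[-shift:]) + [PAD_PAIR] * (-shift)
        else:
            row = [PAD_PAIR] * shift + list(a_pairs[:n - shift])
        rows.append(row)
    return rows


def file_matrix(b_pairs, maxshift):
    """Every row is an unshifted copy of B."""
    return [list(b_pairs) for _ in range(2 * maxshift + 1)]


# Hypothetical four-pair vectors with MAXSHIFT = 1, for brevity.
C = search_matrix(["23", "60", "72", "34"], maxshift=1)
D = file_matrix(["23", "60", "72", "34"], maxshift=1)
```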

COMPARISON STAGE 2 --- COMPARISON OF FILE AND SEARCH MATRICES.

For each value of r and s the element E(r, s) of the initial score matrix E depends only on C(r, s) and D(r, s). Each element E(r, s) is evaluated by 'looking up' C(r, s) and D(r, s) in the score reference matrix S:

$$E(r, s) = S(C(r, s),\ D(r, s),\ k,\ l)$$

where k and l are picked, for each s, to represent the 'ridge-area' to which the 's'th element of a vector would belong.

Thus k will increase from 1 to 9 as s varies from 1 to 82, and l will be 1 if s is odd, 2 if s is even. In other words C(r, s) and D(r, s) are 'looked up' in the 'book' of comparison tables called 'S'. The values (k, l) are evaluated (from s) just to make sure that the appropriate table is 'looked up'.
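
A sketch of this look-up follows, for illustration only. The exact boundaries of the nine ridge bands are not given in the text, so the equal-width split in `ridge_area` is an assumption; `toy_S` is a hypothetical stand-in for the real score reference matrix, and the rows of C and D are held as strings of hexadecimal digits.

```python
def ridge_area(s, n_digits=82, n_bands=9):
    """Map a 1-based digit position s to (k, l); equal-width bands assumed."""
    pair_index = (s - 1) // 2
    k = pair_index * n_bands // (n_digits // 2) + 1
    l = 1 if s % 2 == 1 else 2
    return k, l


def initial_score_matrix(C, D, S):
    """E(r, s) = S(C(r, s), D(r, s), k, l), with (k, l) derived from s."""
    E = []
    for c_row, d_row in zip(C, D):
        n = len(c_row)
        E.append([
            S(int(c, 16), int(d, 16), *ridge_area(s, n))
            for s, (c, d) in enumerate(zip(c_row, d_row), start=1)
        ])
    return E


def toy_S(i, j, k, l):
    """Hypothetical score function: 5 for identical codes, otherwise -1."""
    return 5 if i == j else -1


E = initial_score_matrix(["F236", "2360"], ["2360", "2360"], toy_S)
```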

PROPERTIES OF THE INITIAL SCORE MATRIX.

The feature of the initial score matrix E that begins to indicate whether the vectors A and B are mates is the presence (or absence) of horizontal strings of non-negative scores. Such a string within one row of E represents similarly placed rows within matrices C and D that were similar, or identical. Such strings, in turn, represent parts of the vectors A and B that were similar or identical. Where a high scoring continuously non-negative string occurs in the central row of E then vectors A and B are probably mates, and are correctly aligned. If such a high scoring string appears in one of the other rows of E, then A and B were probably mates, but incorrectly aligned (i.e. there had been some shifting error).

If, on the other hand, the matrix E appears to be a random scattering of scores with no discernible concentrations of non-negative scores, then it is likely that A and B were not mates. The task facing the remainder of the algorithm is to calculate a single score which will show whether significant strings are present in the matrix E, or not, and thus provide an indication of whether A and B are mated vectors.

Anticipation of this 'adding together' was the origin of the rules used in setting up the score matrix S. The significance of scores of 0.0 (rule (b) in preliminary stage 2) is that their appearances within the initial score matrix E do nothing to the sum of a series, but they do preserve its continuity. Thus, appearance of scars, or inability to determine what does happen first during ridge exploration, is not given any significance in indicating a match, but it is not allowed to break up an otherwise continuous non-negative sequence that would be indicative of a match. Hence the 0.0 allocation to any comparison involving codes 'A' or 'B'. Comparisons involving code 'C' were also allocated scores of 0.0, because true compounds are very rare and what normally appears as a compound is usually an ambiguous characteristic of some other sort.

COMPARISON STAGE 3 --- APPLYING THE DISTANCE TESTS.

During a print comparison the distance measures are used in the application of three different tests. All three tests are applied to the initial score matrix in such a way as to reduce (to -1) any positive initial scores that the distance measure tests indicate ought to be so reduced. This will occur if the distance measure tests show that the matched event codes (which gave that positive value) are from 'events' that are not roughly in the same area (spatially) of their respective prints.

ABSOLUTE DISTANCE TEST.

Before the matching algorithm accepts an event code in a file print array as possibly being correctly matched with an event code in the search print array — it now has to ask not only 'are the event codes the same?', but also a number of questions relating to their distance measures. The first is called the absolute distance test:

'Is the distance between the generating line and the ridge-event adequately preserved?' (i.e. is it preserved within a given tolerance). The tolerance allowed becomes a parameter of the programme and is called the absolute distance tolerance (ADT).

DIFFERENTIAL DISTANCE TEST.

'Is the difference in their distance measures adequately preserved?'

HOW THE DISTANCE TESTS ARE APPLIED.

The distance tests are applied as the first filtration step for the initial score matrix E. The manner of their application (briefly) is as follows:

(a) Absolute distance test: every positive element, E(r, s), of the initial score matrix E is derived by comparison of C(r, s) and D(r, s) — elements of the search and file matrices. We call the related distance measures C'(r, s) and D' (r, s) respectively. The rule for the absolute distance test is:---

If |C'(r, s) - D'(r, s)| > ADT then change E(r, s) to -1.

(b) Differential distance test: whenever E(r, s) and E(r, s + 2) are positive elements within E then

If |(C'(r, s) - C'(r, s + 2)) - (D'(r, s) - D'(r, s + 2))| > DDT then change one of E(r, s) and E(r, s + 2) to -1. (Which of the two is reduced depends on other neighbouring elements within E.)

If |(C'(r, 2s) + C'(r, 2s - 1)) - (D'(r, 2s) + D'(r, 2s - 1))| > SDT then one of E(r, 2s) and E(r, 2s - 1) is reduced to -1. (In this case the larger of the two is reduced.)
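
The absolute distance test of rule (a) can be sketched as follows, for illustration only. The function name and the sample values are hypothetical; the other two tests are omitted because the text leaves the choice of which element to reduce partly unspecified.

```python
def absolute_distance_test(E, C_dist, D_dist, adt):
    """Reduce to -1 every positive E(r, s) whose distances differ by more than ADT."""
    return [
        [-1 if e > 0 and abs(c - d) > adt else e
         for e, c, d in zip(e_row, c_row, d_row)]
        for e_row, c_row, d_row in zip(E, C_dist, D_dist)
    ]


# One hypothetical row of scores with the matching distance measures.
E = [[7, 0, 3, -1]]
C_dist = [[4, 15, 9, 2]]
D_dist = [[5, 15, 2, 8]]
filtered = absolute_distance_test(E, C_dist, D_dist, adt=2)
```

Here the score of 7 survives because the two distance measures differ by only 1, whereas the score of 3 is reduced because they differ by 7.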

COMPARISON STAGE 4 --- FILTERING FOR DEPENDENT PAIRS.

The matrix E is therefore filtered, and the filtered score matrix (F) created. F has exactly the same dimensions as E, D and C. The filtering step involves a reduction of scores stemming from repetitions of dependent code-pairs. It is accomplished by reference to the matrices C and D (to identify exactly where such pairs appeared in both).

The rule for score reduction is: wherever E(r, s) and E(r, s + 2) are exact-match scores derived from a dependent pair then

F(r, s) = min(E(r, s), E(r, s + 2))

F(r, s + 2) = 2.0

Elsewhere F(r, s) = E(r, s).

COMPARISON STAGE 5 --- CONDENSING DIGIT PAIRS TO A SINGLE SCORE.

The condensing rule applied in MATCH4 is:

$$G(r, s) = \begin{cases} -1 & \text{if } F(r, 2s-1) \text{ and } F(r, 2s) \text{ are both } -1; \\ \max\left(F(r, 2s-1),\ F(r, 2s)\right) & \text{if one, and only one, is non-negative;} \\ F(r, 2s-1) + F(r, 2s) & \text{if both are non-negative.} \end{cases}$$
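
The condensing rule can be sketched as below, for illustration only; the function name and the sample row are hypothetical. Each row of F is consumed two columns at a time, so an 82 column row of F yields a 41 column row of G.

```python
def condense(F):
    """Collapse each digit pair of F into a single score of G."""
    G = []
    for row in F:
        condensed = []
        for up, down in zip(row[0::2], row[1::2]):
            if up < 0 and down < 0:
                condensed.append(-1)           # both negative
            elif up >= 0 and down >= 0:
                condensed.append(up + down)    # both non-negative
            else:
                condensed.append(max(up, down))  # exactly one non-negative
        G.append(condensed)
    return G


# One hypothetical row of F holding four digit pairs.
G = condense([[-1, -1, 3, -1, 0, 4, -1, 0]])
```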

Thus isolated -1's cease to break up the long series that result from mated vectors. The sums of these long series from matches are expected to far outweigh the sums of any continuously non-negative series which occur by chance (i.e. from a vector mismatch).

COMPARISON STAGE 6 --- 'HOPPING' IN THE CONDENSED MATRIX.

These string breaks may have occurred as a result of two topological mutations (one on either side of the generating line) that just happened to affect the same ridge; that would cause an isolated negative entry in an otherwise continuously non-negative string in the condensed matrix. Alternatively ridge-shifting (with its variety of causes) may have occurred; this will break the string as a result of inclusion or deletion of a digit pair from one of the vectors under comparison. The result will be that part of the string in the condensed matrix is displaced either to the row above, or to the row below (as shown here).

Part of a condensed matrix showing a suitable 'hopping' place.

The algorithm is designed to recognise this phenomenon, and to put these broken strings back together again (i.e. to evaluate their sums as if they had not been broken). A parameter "HOPS" is used, which indicates the maximum number of breaks which can be overlooked in evaluation of any one series score.

The parameter is called "HOPS" because, in effect, the programme is allowed to hop from the right hand end of a series onto another point where that string is thought to be continuing. The permissible hops in the condensed matrix G are from any point g(r, s) to any one of these three points:

(b) g(r + 1, s + 2) or g(r - 1, s + 1): these are the hops required to repair a string break caused by insertion or deletion of one digit pair from the search or file vector. (To see why these particular hops are appropriate one must study the effect of ridge shifting on the staggered search matrix C.)

These three particular hops are not the only ones that could have been allowed; hopping from g (r, s) to either of g (r + 1,s + 3) or g (r - 1,s + 2) can be useful in repairing breaks caused when the generating line passes the wrong side of a bifurcation. The selection of the three described above, however, has been found to be the most effective selection in aiding match scores without unnecessarily aiding mismatch scores.

The calculation of such scores is accomplished by a further series of simple array operations. They are not described here. It is worth pointing out that the number of operations required for this step increases linearly with the value of HOPS, and not exponentially as might have been expected. In the algorithm for MATCH4 the hopping section is one single iterative loop, which is repeated HOPS times. It is bypassed whenever HOPS is set at zero.

COMPARISON STAGE 7 --- PRODUCT CALCULATION AND SCORE FORMULATION.

Formulating a score from the condensed matrix G provides a further variety of options. Score evaluation is made dependent on the single highest-scoring series in the condensed matrix rather than on a combination of all the different string sums. The best series invariably scored so much higher than all the others that it rendered them almost insignificant. Ignoring strings other than the best one obviates the need to take antilogs, add, and then reconvert to logs.

The score thus obtained is logarithmic in nature.
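
For illustration only, the selection of the single highest-scoring series can be sketched as follows for the simplest case in which HOPS is zero, so that a series is one continuously non-negative run within a single row of G. The function name and sample matrix are hypothetical, and the hopping step is deliberately left out.

```python
def best_series_score(G):
    """Return the largest sum of a continuously non-negative run in any row."""
    best = 0
    for row in G:
        running = 0
        for g in row:
            # A negative entry ends the current series; otherwise extend it.
            running = running + g if g >= 0 else 0
            best = max(best, running)
    return best


# Hypothetical condensed matrix with two rows.
raw_score = best_series_score([[3, -1, 4, 5, -1], [1, 2, -1, 0, 12]])
```

Because each element is derived from a logarithm of a code frequency, summing along a series corresponds to multiplying the underlying probabilities, which is consistent with the score being logarithmic in nature.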

COMPARISON STAGE 8 --- SCORE NORMALIZATION PROCEDURE.

Examination of the lower match scores from MATCH4 showed that they were often produced when the search prints had been of relatively low quality: some were badly scarred (producing many 'A's in their vectors) and others were not clear in parts (producing many 'B's). With high proportions of 'A's and 'B's present, and perhaps with a high proportion of ridges running 'out of sight', large scores were just not possible, even if that vector had been faithfully reproduced within the file set.

The intention of score-normalization was to adjust scores from each comparison according to the amount of, or lack of, good information in both the search and file prints. The justification for such a procedure lies in this argument: if a search vector contains little information, and a large part of it is found in a file vector, then this may be just as significant (in indicating a possible match) as had the search vector had plenty of information, only a little of which had appeared in the file vector. A mediocre score from a poor print is better than a mediocre score from a good print.

The method used in MATCH4 was to compare the search and file vectors each with itself (using the matching algorithm) and see what scores were obtained. Those scores are a very meaningful indication of the quality (i.e. rarity) and quantity of information in the search and file vectors. They represent the sum of one continuous string in the condensed matrix which covers the whole length of the vectors. They are, for each, the perfect score. They are the maximum that could possibly be achieved by any vector compared to them. This would not need to be done for the file vector every time a search was conducted; each file vector would have its self-mate score calculated just once when it was introduced to the collection; the self-mate score would then be stored along with the file vector, and it would be referenced each time that file vector was used in comparison. A file vector's self-mate score would have to be recalculated only when the scoring system, for that file, was reappraised by a new fileset analysis. Suppose there were n vectors in the file — called

Then the normalization used gives a final score of:

This formula gives final percentage scores. Scores thus normalized appear as real numbers in the range 0 to 100. Real numbers are only used at this very last stage of the comparison process. The raw score (before normalization) was an integer.

TOPOLOGICAL COORDINATE SYSTEMS

In its most general form the system designed records the topological irregularities as a series of small changes in what is otherwise assumed to be a smooth laminar ridge-flow pattern. A 'sweeping line' or 'scanning' system, shown generally in Fig. 11, is used, whereby a scanning line SL moves across the fingerprint in a predetermined manner. Whenever it passes over an irregularity (which may either be a 'characteristic' or some other type of irregularity such as a ridge coming into sight, or going out of sight), the irregularity is recorded by the use of 3 coordinates:

(1) a type code (T) to show which type of irregularity it is.

(2) a measure (M) of the scanning line position when it hits the irregularity.

(3) a ridge count (R), being the number of ridges between the irregularity and a predetermined point on the scanning line.

A collection of coordinate sets of the form (T, M, R) specifies the topology of a fingerprint (or any part thereof) uniquely. A fourth coordinate (D) may be added - which will record the actual linear distance between the irregularity and a predetermined point on the scanning line (which may or may not be the same point as that used in determining the ridge count). Then it will be seen that (D) and (M) together are sufficient to specify the spatial position of each irregularity. Thus coordinate sets of the form (T, M, R, D) give a complete topological and spatial description of the fingerprint scanned.

The particular scanning system selected as most suitable for use with fingerprints is the radial scanning line system, where the scanning line is pivoted at a selected (fixed) point on the print, and (M) takes the form of an angular measure (θ), where (θ) is the angle between the position of the scanning line and some fixed predetermined direction. The 'pivot' point, in this case, is used as the predetermined point for measuring ridge counts (R), and for recording the distances (D) in the four coordinate system. The pair (D, θ) therefore becomes analogous to polar coordinates. The scanning pattern selected is simply a clockwise sweep of the pivoted radial scanning line.

TOPOLOGICAL COORDINATE SET EXTRACTION (FIGURES 2, 2(a), 2(b), 5, 11, 12(a), 12(b) and 12(c))

The topological coordinate set extractor shown in Fig. 2(a) has a split display screen 100, the upper screen 100U displaying the output from scanner 13 of Fig. 1, and a lower screen portion 100L which displays the enhanced or thinned image from image enhancement unit 14 of Fig. 1. The gray scale image in display 100U is used to allow the operator to locate voids, scars and discontinuities, and a count is made of the ridges running into and out of voids, scars, discontinuities, etc.

In this embodiment, scanning of the enhanced displayed fingerprint 101E is by a rotating ridge scan line 102 which begins its scan of print 101E from a predetermined scan start position or 'cut' 102S and rotates in a predetermined scan direction such as clockwise (see Fig. 12). The origin or observation point 104 is the center of rotation of scan line 102 and is shiftable by track ball 105 in controller 106; as noted above, the center of rotation or observation point 104 is preferably located just off of any ridge. The right side 100R of display 100 may be used together with the thumbwheel cursor controls 105 and 106 to cause a relative rotation of the scan line 102, and provide a measure of the angular displacement (θ). Movement of a cursor point to the characteristic being coded will give a measure of the radial distance, which can be calculated when a key on the key pad or pushbutton array 107 is depressed.

The horizontal and vertical line scan system shown in Figures 12(b) and 12(c) may be implemented using a topological coordinate extractor shown in Figure 2(b) which, except for the use of horizontal and/or vertical scanning lines, is similar to Figure 2(a). As in Figure 2(a), a split screen 100' uses the upper screen 100U' to display the gray scale image to allow the operator to locate voids, scars and discontinuities, and obtain a count of ridges running into and out of the voids, scars, etc. The lower portion 100L' of the display 100' displays the thinned or enhanced image of the print from image enhancement unit 14 of Figure 1. The apparatus of Figure 2(b) provides the user with the choice of two scan lines, horizontal or vertical (or more, since the radial scan feature of Fig. 2(a) can easily be accommodated in this unit), which may be implemented automatically under computer control to perform the scanning functions illustrated in Figures 12(b) and 12(c).

The horizontal scan line HSL is activated by switch HC and the vertical scan line VSL is activated by switch VC. In each case, the scan line is moved across the print image (horizontally for vertical scan line VSL, or vertically for horizontal scan line HSL) by scan thumbwheel STW. Points of tangency are 'recurves' (labeled RCV or RCH in Figures 12(b) and 12(c), respectively) which are found wherever ridges are tangential with the scanning line. They are treated as irregularities and used for topological reconstruction purposes, either as preliminary to matching or pictorial reconstruction. Movement of a cursor from a prescribed origin "V" or "H" to a particular irregularity (or recurve) can be performed by cursor thumbwheel controller CTW. A code type is entered on pushbutton keyboard CKB. As the cursor is moved from the origin to the irregularity (IH and RCH of Figures 12(b) or IV and RCV of Figure 12(c)), the number of ridges intersected by the scanning line between the irregularity and the predetermined point ("V" or "H") is recorded automatically, along with a measure (M) of the scanning line position when it meets the irregularity. As noted earlier, the actual linear distance between the irregularity and the point on the scanning line may be used to provide a fourth coordinate (D), which may be added to the information recorded concerning the irregularity. (When (M) is measured on the X-axis, (D) can be measured on the Y-axis.)

In Figure 1c, there is illustrated a generalized block diagram of a topological coordinate set extractor system as an expansion on the block diagram of Fig. 1a. After the image of the print has been enhanced by image enhancement unit 14, the enhanced image is supplied to topological coordinate extractor 16 which includes minutiae detector 16MD, which is standard. Each detected minutia is assigned a type code (T) by type code generator 16G, which type code is supplied to coordinate set processor 20CS in general purpose computer 20. When the horizontal scan line HSL (Fig. 12b) is moved vertically, the scan locating data (M) of each successive characteristic is forwarded by detector 16MD to coordinate set processor 20CS. In regard to Fig. 12a, the angle (θ) constitutes the scan location data (M). In Fig. 12b and Fig. 12c, the distance from the Y-axis or the X-coordinate constitutes (M). Any blurs such as scars or voids are found by scar/void detector 16V (by virtue of the absence of discernible ridge lines in a given area). When any scan line passes through a detected void or scar area, no ridges are counted within that region. The points where ridges run into, or out of, such a region are recorded as irregularities. As noted in Fig. 1c, ridge counter 16R counts ridges from the central observation point 104 in the radial scan system (Fig. 2a) or from selected observation points V or H (Figs. 2b, 12b and 12c).

Distance calculator 16DC provides the distance measurement (D). In regard to Fig. 12a, the distance from the central observation point 104 to the irregularity is the distance (D) along the radial scan line to the irregularity, as computed by a simple arithmetic operation on the spatial coordinates X, Y; and in Fig. 12b and Fig. 12c, this is simply constituted by the X or Y coordinate of the minutia position. In some cases it may be desirable to use more than one scan line, such as both horizontal and vertical scan lines, to obtain more data, and obviously the scan line need not be horizontal or vertical.
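By way of illustration only, the 'simple arithmetic operation' for the radial scan might be sketched in Python as follows. The function name polar_from_xy, the Y-axis pointing upwards, and the choice of the downward vertical as the fixed reference direction are all assumptions made for this sketch and are not dictated by the apparatus.

```python
import math


def polar_from_xy(x, y, origin=(0.0, 0.0)):
    """Return (theta, distance) of point (x, y) as seen from the pivot.

    Assumptions of this sketch: the Y-axis points up, theta is measured
    clockwise in degrees, and theta = 0 is the direction vertically below
    the pivot (observation point).
    """
    dx = x - origin[0]
    dy = y - origin[1]
    distance = math.hypot(dx, dy)
    # atan2(-dx, -dy) is zero for a point directly below the pivot and
    # increases as the point moves clockwise (below -> left -> above -> right).
    theta = math.degrees(math.atan2(-dx, -dy)) % 360.0
    return theta, distance
```

A point directly to the left of the pivot therefore lies a quarter turn clockwise from the reference direction, and a point directly above it lies half a turn away.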

COORDINATE SCHEME: FILE PRINT CODING - RADIAL SCANNING.

ridges.

Code 7: bifurcation ahead (i.e. a '2' reversed).

Code A: ridge runs into scarred tissue.

Code B: ridge runs into an unclear area.

Code C: compound characteristic (2 ridges in, and 2 ridges out).

Code D: ridge emerges from scarred tissue ('A' reversed).

Code E: ridge emerges from unclear area ('B' reversed).

LATENT SEARCHING: TOPOLOGICAL COORDINATE SYSTEMS.

The most desirable latent data form is a complete and objective description of the latent tracing. The tracing process itself still is, and always will be, substantially subjective, but it ought to be the last stage requiring subjective judgement. A set of topological coordinates of the form (T, θ, R, D) (showing type, angular orientation, ridge-count and distance) provides a complete topological and spatial description, and it therefore becomes the basis for latent data entry. The latent mark data can then be presented in much the same form as the file print data.

The manual latent data preparation process is fairly simple. First the mark is traced (enlarged to 10x magnification). Then the position of the central observation point is guessed by the fingerprint expert, and its position marked on the tracing. The assumed core point position may be some way away from the 'visible' part of the latent. Then the correct orientation of the mark is estimated by the expert, and the coordinates of the characteristics and other irregularities can then be written down. Fig. 5a discloses an extremely useful tool for this operation. A large board (50) with a pin hole (51) at its centre has angular divisions marked around the circumference of a 7 or 8 inch circle (i.e. much like an oversized 360° protractor). A transparent ruler (52) is then pivoted at the pinhole (51) in the centre. When the tracing has been made it is placed over the board (50), with the pivot pin (51) pressed through the central observation point (51). The tracing falls entirely inside the protractor markings, and the ruler is long enough to reach those markings. Radial movement of the transparent ruler (which has one central line (53) on it) over the tracing makes it very easy to count the ridge-counts for each irregularity, to measure radial distances (these are marked on the ruler in the appropriate units), and to read off the angular orientations from the circumference of the inscribed circle. We shall record radial distances in units of 0.5 mm (or 0.5 cm on the 10x enlargement) and round to the nearest integer. No greater accuracy is either required or useful. These distances then appear as integers in the range 0 to 50. The type code (T) is a hexadecimal integer, the angular orientation (θ) an integer in the range 0 - 360, and the ridge count (R) an integer in the range 0 to 50. The total storage space required for all four coordinates is therefore close to 3 bytes; to be precise, it is 25 bits (4 bits for (T), 9 for (θ), and 6 each for (R) and (D)).
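The 25-bit figure follows from the stated ranges, and one possible packing is sketched below in Python. The field order, the names FIELDS, pack and unpack, and the decision to reject out-of-range values are assumptions of this sketch; no particular storage layout is prescribed here.

```python
# (name, bit width, largest permitted value) for each coordinate.
# Widths follow from the ranges given above: a hexadecimal digit needs
# 4 bits, 0-360 needs 9 bits, and 0-50 needs 6 bits.
FIELDS = (("T", 4, 15), ("theta", 9, 360), ("R", 6, 50), ("D", 6, 50))


def pack(t, theta, r, d):
    """Pack one coordinate set (T, theta, R, D) into a 25-bit integer."""
    word = 0
    for (name, bits, maximum), value in zip(FIELDS, (t, theta, r, d)):
        if not 0 <= value <= maximum:
            raise ValueError(f"{name}={value} is outside 0..{maximum}")
        word = (word << bits) | value
    return word


def unpack(word):
    """Recover (T, theta, R, D) from a word produced by pack()."""
    values = []
    for _name, bits, _maximum in reversed(FIELDS):
        values.append(word & ((1 << bits) - 1))
        word >>= bits
    return tuple(reversed(values))
```

For example, unpack(pack(0xA, 270, 12, 25)) returns the original four values, and no packed word exceeds 25 bits.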

'WRAP AROUND' 360° SECTOR.

The sector to be recorded may be a small part of a fingerprint, in which case the area to be coded should be a sector as enclosed by two radial lines. However the normal assumption is that the whole of the visible fingerprint pattern will be coded. Such a sector can be enlarged at will by moving the radial boundary lines, until such time as the internal angle reaches 360°. At that stage the two boundary lines coincide, and where they coincide will be called the cut. Provided our topological reconstruction algorithm can cope with the fact that, at the cut, some ridges effectively leave one end of the sector and reappear at the opposite end, then we can forget about the existence of the boundary lines altogether.

The reconstruction algorithm will need to be told how many ridges need to be connected up in this way, and that number (which is the number of ridges that cross the cut) will be recorded as a part of the fingerprint data. It is convenient to specify that the cut will be vertically below the central observation point, and that the ridges which cross it be called moles (as they pass underneath the observation point).

The coordinate system can now be used to describe the complete topology of a whole fingerprint.

TOPOLOGICAL RECONSTRUCTION FROM COORDINATE SETS.

Let us suppose that the print to be reconstructed has m moles and n topological irregularities, whose coordinates are the set {(T_i, θ_i, R_i, D_i): i = 1, ..., n}.

THE 'CONTINUITY' ARRAY.

This reconstruction method involves the systematic development of a large 3-dimensional array, which will be called the 'continuity' array (C), comprising elements c(i,j,k). To understand the function of this array it is necessary, first, to examine figure 12: it shows a (simplified) fingerprint pattern with selected central observation point and the radial cut vertically downwards. A radial line from the central observation point is drawn marginally to the clockwise side of every topological irregularity in the picture (whether it be a true characteristic or not). If there are n irregularities (which we will call I_1 ... I_n), there are n + 1 radial lines in total (this includes the cut). Calling the cut line l_0 and numbering the lines consecutively in a clockwise direction gives the set of lines {l_0, l_1, ..., l_n}.

Now re-order the topological coordinate set by reference to the second coordinate (θ), so that the coordinate set satisfies the condition: θ_i ≤ θ_{i+1} for all i ∈ {1, 2, ..., n-1}.

There are then simple 1-1 mappings between the lines {l_1, ..., l_n}, the irregularities {I_1, ..., I_n} and their coordinates {(T_i, θ_i, R_i, D_i): i = 1, ..., n}.

Each of the lines {l_0, ..., l_n} intersects a certain number of ridges, giving an ordered sequence of ridge intersection points. Let the number of ridges crossed by line l_i be called r_i. Further, let the ridge intersection points on the line l_i be called points {p(i,j): j = 1, ..., r_i}, point p(i,1) being the closest to the central observation point and p(i,r_i) being the closest to the edge of the visible print.

The continuity array C is then set up with a direct correspondence between the ridge intersection points p(i,j) and the elements of C, namely c(i,j, k). k takes the values 1 to 4, and thus there is a 4 to 1 mapping of the elements:

{c(i,j,k) : i = 0,...n :j = 1,...r_i : k = 1,2,3,4}

onto the set of ridge intersection points:

{p(i,j) : i = 0,...n : j = 1,...r_i}

The four elements associated with each point p(i,j) answer the following questions:

c(i,j,1): "what is the first event that topological exploration from the point p(i,j) in an anticlockwise direction will discover?"

c(i,j,2): "which of the irregularities I_1 ... I_n is it that such anticlockwise exploration will discover first?"

c(i,j,3): "what is the first event that topological exploration from the point p(i,j) in a clockwise direction will discover?"

c(i,j,4): "which of the irregularities I_1 ... I_n is it that such clockwise exploration will discover first?"

* The part of the matrix C which will be used for any one print is therefore irregular in its 2nd (j) dimension.

c(i,j,1) and c(i,j,3) should, therefore, be ridge-tracing event codes in the normal hexadecimal integer format (not to be confused with the different set of hexadecimal codes currently being used for the irregularity type (T_i)). c(i,j,2) and c(i,j,4) are integers in the range 1 to n which serve as pointers to one of the coordinate sets. They are a kind of substitute for distance measures (being associated with c(i,j,1) and c(i,j,3) respectively) but they act by referring to the coordinates of the irregularity found, rather than by giving an actual distance. They will be called irregularity indicators in the following few sections.

OPENING THE CONTINUITY ARRAY.

To begin with, the whole of the continuity array is empty (and, in practice, all the elements are set to -1). It will be filled out successively starting from the left hand edge (i = 0) and working across to the right hand edge (i = n).

Starting with i = 0 (at the cut, Fig. 12a) we know only that r_0 = m (the number of ridges crossing the cut is the number of moles recorded in the data). Nothing is known (yet) about any of these ridges. The first set of entries in the possible ridge exploration from the line l_0

The dummy numbers are integers in a range which cannot be

The elements {c(0,j,k) : j = 1,...r_0 : k = 2 or 4} are left untouched for now.

ASSOCIATION, ENTRIES, AND DISCOVERIES IN THE CONTINUITY ARRAY.

* In practice dummy numbers start at 100 and, whenever another one is needed, the next free integer above 100 is used. Obviously a record is kept of how many different dummy numbers have been assigned.

The next stage is to consider each of the coordinate sets (T_i, θ_i, R_i, D_i) in turn, starting with i = 1. We know that the irregularity I_1 is the only change in the laminar flow between lines l_0 and l_1. We also know its type (T_1) and its ridge-count (R_1). Depending on the type T_1 there are various associations, entries and discoveries that can be made in the continuity array.

Suppose, for example, that T_1 = 3 (i.e. a ridge ends, according to the table of irregularity types). We can deduce that

r_1 = r_0 - 1

(i.e. line l_1 crosses one less ridge than line l_0) and we can make the following associations in the second column (i = 1) of the continuity array. (Associations occur when one element of the array is set equal to another.)

c(1,j,1) = c(0,j,1) for all 1 ≤ j ≤ R_1 - 1,

c(1,j,3) = c(0,j,3) for all 1 ≤ j ≤ R_1

(i.e. ridges below the irregularity pass on unchanged); also:

c(1,j,1) = c(0,j+1,1) for all R_1 + 2 ≤ j ≤ r_1,

c(1,j,3) = c(0,j+1,3) for all R_1 + 1 ≤ j ≤ r_1

(i.e. ridges above the irregularity pass on unchanged, but are displaced downwards by one ridge, due to the R_1 + 1'th ridge coming to an end).

Further information is gained from the immediate vicinity of the irregularity and this allows us to make entries in the array. (Entries result directly from the coordinate set being processed, rather than by copying from another part of the array).

c(1,R_1,1) = 8, c(1,R_1,2) = 1, c(1,R_1+1,1) = 6, c(1,R_1+1,2) = 1.

(i.e. the line l_1 is drawn marginally past the ridge-ending I_1, and so that ridge-ending appears as a facing ridge ending in anticlockwise exploration from ridge intersection points p(1,R_1) and p(1,R_1+1). The event seen, in each case, is I_1 itself.) We also have discovered what happened to the ridge that passed through the point p(0,R_1+1): it ended (code 3) at irregularity I_1. That discovery enables us to note the fact that the ridge exploration clockwise through point p(0,R_1+1) ended here. The existing entry in c(0,R_1+1,3) is a dummy number, and the new found meaning for that number is recorded in the dummy number index. Suppose the dummy entry had been the number 107: then we store its meaning thus:

index (107)= (3, 1)

Eventually all the appearances of the number 107 in the array will be replaced by '3', and, at the same time, all the associated irregularity indicators will be set to '1'.
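By way of illustration only, the ridge-ending step may be sketched in Python as follows. The list-of-lists layout, the function names open_column and apply_ridge_ending, and the use of two distinct dummy numbers per ridge when opening column 0 are assumptions made for this sketch; the event codes 8 and 6 are simply the values appearing in the entries above, and only irregularity type 3 is handled.

```python
EMPTY = -1        # every element starts at -1
RIDGE_ENDS = 3    # type code / discovery code for a ridge ending
EVENT_AT_R = 8            # entry made at ridge R (value taken from the text)
EVENT_AT_R_PLUS_1 = 6     # entry made at ridge R + 1 (value taken from the text)


def open_column(moles, first_dummy=100):
    """Column 0: one cell [k1, k2, k3, k4] per mole, with a fresh dummy
    number in k = 1 and in k = 3 (an assumption of this sketch)."""
    column = []
    next_dummy = first_dummy
    for _ in range(moles):
        column.append([next_dummy, EMPTY, next_dummy + 1, EMPTY])
        next_dummy += 2
    return column, next_dummy


def apply_ridge_ending(prev, ridge_count, irregularity, index):
    """Build column i from column i-1 for a ridge ending with count R.

    prev[j - 1] holds c(i-1, j, 1..4). The new column has one ridge fewer.
    """
    r_new = len(prev) - 1
    new = [[EMPTY] * 4 for _ in range(r_new)]
    for j in range(1, r_new + 1):
        # Ridges above the ending are displaced downwards by one.
        src = j if j <= ridge_count else j + 1
        # Associations for clockwise exploration (k = 3 and its indicator).
        new[j - 1][2:4] = prev[src - 1][2:4]
        if j == ridge_count:
            new[j - 1][0:2] = [EVENT_AT_R, irregularity]          # entry
        elif j == ridge_count + 1:
            new[j - 1][0:2] = [EVENT_AT_R_PLUS_1, irregularity]   # entry
        else:
            new[j - 1][0:2] = prev[src - 1][0:2]                  # association
    # Discovery: clockwise exploration through p(i-1, R+1) ended here.
    ended_dummy = prev[ridge_count][2]
    index[ended_dummy] = (RIDGE_ENDS, irregularity)
    return new
```

With five moles and R_1 = 3, the dummy number sitting in c(0,4,3) under this numbering happens to be 107, so the sketch reproduces index(107) = (3, 1).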

Knowledge of T_1 and R_1 has therefore enabled us to make a particular set of associations, entries and discoveries, from which it has been possible to place something (either entries or dummy numbers) in all of the elements of the set:

{c(1,j,k) : j = 1, 2, ..., r_1 : k = 1 or 3}

The process now begins again, with examination of irregularity I_2 followed by I_3 ... I_n. Each different possible type code T_i generates its own individual set of associations, entries and discoveries. Each set allows the next column of C to be filled in.

* It should be pointed out that when association is made of event codes (as distinct from dummy numbers) then association is also made of their respective irregularity identifiers.

After all the n coordinate sets have been processed (and entries thereby made in the whole of the continuity array) a few last associations need to be made in order to account for the fact that ridges cross the cut. These associations are that:

c(0,j,1) is equivalent to c(n,j,1) for all 1 ≤ j ≤ r_0, and c(n,j,3) is equivalent to c(0,j,3) for all 1 ≤ j ≤ r_0. (Of course r_0 = r_n = m.)

* Some of the entries may well be new (unassigned) dummy numbers. This occurs wherever new ridge segments start at the irregularity. It did not happen in the case of the ridge ending.

PROPERTIES OF THE COMPLETED CONTINUITY ARRAY.

(a) all the elements {c(i,j,k) : 0 ≤ i ≤ n : 1 ≤ j ≤ r_i : k = 1 or 3} contain either ridge exploration event codes (hexadecimal) or dummy numbers (integers over 100).

(b) wherever c(i,j,1) or c(i,j,3) is an event code, then the corresponding entries, c(i,j,2) or c(i,j,4) respectively, will contain an irregularity identifying number that shows where that ridge event occurs.

(c) all the different appearances of a particular dummy number in the continuity array reveal all the intersection points through which one continuous ridge exploration has passed. (Hence the name for the array.)

FINAL STAGE OF TOPOLOGICAL RECONSTRUCTION.

The final stage of the reconstruction process is to sweep right through the continuity matrix replacing all the dummy numbers with their corresponding event codes from the index. The related irregularity identifiers are filled in at the same time, also from information held in the index. This second (and final) sweep through the elements of the continuity array leaves every element in the set:

{c(i,j,k) : i = 1...n : j = 1...r_i : k = 1 or 3}

as an event code, and every element of the set:

{c(i,j,k) : i = 1...n : j = 1...r_i : k = 2 or 4}

as an irregularity identifier.
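A minimal sketch of this final sweep, in Python, is given below. It assumes that the array is held as a list of columns, each column a list of four-element cells, that the index is a dictionary from dummy number to (event code, irregularity identifier), and that every dummy number has already been resolved; chains of equivalent dummy numbers are not handled. The names resolve_dummies and DUMMY_START are hypothetical.

```python
DUMMY_START = 100  # dummy numbers are the integers from 100 upwards


def resolve_dummies(array, index):
    """Replace every dummy number by its event code and fill in the
    matching irregularity identifier, in one pass over the array."""
    for column in array:
        for cell in column:
            for k in (0, 2):              # positions of k = 1 and k = 3
                value = cell[k]
                if value >= DUMMY_START:
                    event, irregularity = index[value]
                    cell[k] = event
                    cell[k + 1] = irregularity
    return array
```

For instance, with index = {107: (3, 1)} a cell [8, 1, 107, -1] becomes [8, 1, 3, 1], while cells already holding event codes are left unchanged.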

For any particular line l_i the entries of C in the ith column correspond exactly to the elements of a topological code vector generated by that line. The only difference in appearance is that we have irregularity identifiers rather than distance measures to go with each exploration event code. The later vector comparison stages of the matching algorithm are adapted with that slight change in mind.

This completes a somewhat simplified account of a rather complex process. There are other complications which have not been explained in full, such as how the algorithm deals with sequences of dummy numbers that are all found to be equivalent, the special treatment that ridge recurves have to receive, and how the algorithm copes with multiple irregularities showing the same angular orientation. Nevertheless this explanation serves well to demonstrate the methodical and progressive nature of this particular reconstruction process. It also makes clear that only two sweeps through the matrix are required, which is surprisingly economical considering the complexity of the operation.

THE MATCHING ALGORITHM LM6 (APPENDIX A, FIGURE 19).

The algorithm LM6 (Appendix A) accepts latent data in coordinate form, rather than by prepared vectors. Topological reconstruction was performed both on the latent mark (once only per search) and on each file print to be compared with it. The continuity matrix generated from the latent coordinate set will be called the search continuity array, and the continuity array generated from the file set will be the file continuity array.

There are two distinct phases of print comparison which take place after these topological reconstructions are complete. Firstly, the appropriate vector comparisons are performed and their scores recorded — secondly, the resulting scores are combined to give an overall total comparison score. The vector comparisons are essentially a way of comparing the topological neighborhoods of each of the characteristics seen on the fingerprints under comparison. The vectors correspond exactly to the topological code vectors (of the type described in connection with the vector matching algorithm MATCH4) that would be generated by each of the radial lines as shown in figure 12. There is one radial line per characteristic, therefore one extracted vector per characteristic. It is most important to realise that, according to the invention, the observation points selected on the two prints under comparison do not need to have been in the same positions. The reconstructed topology will be the same no matter where it was viewed from. Just as two photographs of a house, from different places, look quite different — nevertheless the house is the same. The final comparison scores will be hardly affected by misplacement of the central observation point provided they lie in roughly the right region of the print. The reason for approximately correct placement being necessary is that the orientation of the imaginary radial lines, which effectively generate the vectors after reconstruction, will depend on the position of the central observation point. The effect of misplacing that point (in a comparison of mates) is to rotate each generating line about the characteristic on which it is based. Such rotation is not important provided it does not exceed 20 or 30 degrees. Slight misplacement of the observation point is not going to materially affect the orientation of these imaginary generating lines, except those based on characteristics which are very close to it. 
Specifying that the central observation point should be adjacent to the core (in the case of whorls or loops) and at the base of the 'upcurve' (in the case of plain arches) is a sufficiently accurate placement rule.

THE VECTOR COMPARISON STAGE.

From the search continuity array a vector is extracted for each true characteristic on the latent mark. Vectors are not extracted for the other irregularities ('ridges going out of sight', 'ridge recurves', etc.). If the latent mark shows 13 characteristics we then have 13 vectors, each vector based on an imaginary line drawn from the central observation point to one of those 13 characteristics, and passing marginally to the clockwise side of it. Let us now forget about all the other topological irregularities in the coordinate list and number the characteristics 1, 2, 3, ..., k. If the number of coordinate sets, in total, was n then certainly k ≤ n. The extracted search vectors can now be called S_1 ... S_k. In a similar fashion the extracted file vectors, each based on true characteristics, can be called F_1 ... F_m.

This selection essentially looks for file print characteristics that are potential mates for the search print characteristics. The vector comparison that follows serves to compare their neighborhoods. It is quite obvious that allowing a wide angular tolerance significantly increases the number of vector comparisons that have to be performed. If a small angular tolerance is permitted then a badly misoriented latent mark may not have the mated vectors compared at all. The vector comparison itself is much the same as used hitherto, except that the vectors contain irregularity identifiers rather than distance measures. At the appropriate stages of the vector comparison subroutine the actual linear distance ('as the crow flies') from the central characteristic to the ridge-event is calculated by reference to the appropriate coordinate sets. Thus ordinary spatial distances can be used, and a great degree of reliability can therefore be attached to them.
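Because each coordinate set carries (θ, D) relative to the same observation point, the 'as the crow flies' distance between two irregularities follows from the cosine rule. The Python sketch below illustrates this under the assumption that each coordinate set is a tuple (T, θ, R, D) with θ in degrees and D in the recorded distance units; the function name crow_flies_distance is hypothetical and is not taken from the LM6 listing.

```python
import math


def crow_flies_distance(a, b):
    """Straight-line distance between two irregularities given as
    coordinate sets (T, theta, R, D) about a common observation point."""
    _, theta_a, _, d_a = a
    _, theta_b, _, d_b = b
    angle = math.radians(theta_a - theta_b)
    # Cosine rule on the triangle formed with the observation point.
    squared = d_a * d_a + d_b * d_b - 2.0 * d_a * d_b * math.cos(angle)
    # Guard against a tiny negative value caused by rounding.
    return math.sqrt(max(squared, 0.0))
```

Two irregularities at distances 3 and 4 whose angular orientations differ by 90 degrees are thus 5 units apart.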

For each search vector S_i, and candidate file vector F_j, a vector comparison score q_{ij} is obtained. For each search vector S_i a list of candidate file vectors, with their scores, can be recorded in the form of a list of pairs (j, q_{ij}). There are typically between 5 and 15 such candidates for each search vector when the angular tolerance is set at 30°. These lists of candidates can then be collected together to form a table, which will be called the candidate minutia table. An example of such is shown below.

FINAL SCORE FORMULATION.

THE NOTION OF 'COMPATIBILITY'.

We learnt from earlier experiments with latent entry by vectors that combination of scores was best done subject to conditions --- and, in that case, the condition was correct relative angular orientation. It will make sense, therefore, to combine the individual candidate scores when, and only when, they are compatible.

If (j, q_{1j}) is a candidate in the S_1 column, and (i, q_{2i}) is a candidate in the S_2 column, then there are various reasonable conditions that can be set in respect of these two candidates before we accept that they could both be correct. We will say that these two candidates are compatible if, and only if, these three conditions hold true:

(a) i is not equal to j. (Obviously one file print characteristic cannot simultaneously be correctly matched to two different search print characteristics.)

SCORE COMBINATION BASED ON COMPATIBILITY.

Step 2: In each column, discard all the candidates that do not come in the top five places.

Step 4: Taking at most one candidate from each column, pick out the highest scoring mutually compatible set that can be found. A mutually compatible set is a set of candidates each pair of which are compatible.

Thus a set of file print characteristics is found, each of which has similar topological neighborhood to one of the latent mark characteristics (as shown by their high vector comparison scores) and whose spatial distribution is very similar to that of the latent mark characteristics (as shown by their compatibility). Spatial considerations are therefore being used in the combination of topological scores, as is already the case at a lower level, when distance measures are used in the vector comparison process. The algorithm LM5 was originally written to perform the steps described above. Unfortunately it overloaded when it tried to do the comparison of a very good latent with its mate! The reason for this is that the algorithm will examine every possible mutually compatible set in turn. Certainly non-mates have very few mutually compatible sets of any size. However, if a good quality latent gives a largest compatible set of size N (i.e. N characteristics match up well with the file print) then there are 2^N - 1 non-empty subsets of that largest set, each of which will be a mutually compatible set. The total number of such sets is therefore at least 2^N - 1, and probably much greater. In some cases N can be so large that the computer could not finish the job.
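The exhaustive search of Step 4, and the reason it overloads, can be illustrated with the following Python sketch. It is not the LM5 code. The candidate table is assumed to be a list of columns, each a list of (file characteristic, score) pairs; the compatibility test is passed in as a function, and the example predicate distinct_file_characteristics checks condition (a) only, any further conditions being left to the caller. All names are hypothetical.

```python
from itertools import combinations, product


def distinct_file_characteristics(a, b):
    """Condition (a) only: two candidates must not name the same file
    print characteristic. a and b are (column, (file_id, score))."""
    return a[1][0] != b[1][0]


def best_compatible_set(columns, compatible):
    """Try every way of taking at most one candidate per column and keep
    the highest scoring mutually compatible selection.

    The number of selections is the product of (len(column) + 1) over all
    columns, which is why this approach overloads on good latents.
    """
    best_score, best_set = 0, []
    options = [[None] + list(column) for column in columns]
    for choice in product(*options):
        picked = [(c, cand) for c, cand in enumerate(choice) if cand is not None]
        if all(compatible(a, b) for a, b in combinations(picked, 2)):
            score = sum(cand[1] for _, cand in picked)
            if score > best_score:
                best_score, best_set = score, picked
    return best_score, best_set
```

With 13 columns of 5 candidates each there are already 6^13 selections to examine, roughly 1.3 x 10^10, which suggests why a cheaper scheme is needed.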

CANDIDATE PROMOTION SCHEMES.

1. Reorder the candidates in each column, by their scores.

The fourth step is calculation of what will be called a compatible score for each of the remaining candidates. Here are two possible alternative methods for doing this:

(b) For each individual candidate find, in each other column, the highest scoring compatible candidate. Add together those scores (one from each column), and then add the target candidate's own score to the total.
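Method (b) may be sketched in Python as follows, under the same assumed table layout of columns holding (file characteristic, score) pairs. The function name compatible_score and the example predicate, which again tests only that two candidates name different file print characteristics, are hypothetical.

```python
def distinct_file_characteristics(a, b):
    """Illustrative compatibility test: candidates (column, (file_id, score))
    are compatible when they name different file print characteristics."""
    return a[1][0] != b[1][0]


def compatible_score(columns, column, candidate, compatible):
    """Own score plus, from each other column, the score of the highest
    scoring candidate that is compatible with the target candidate."""
    total = candidate[1]
    for other, candidates in enumerate(columns):
        if other == column:
            continue
        scores = [
            cand[1]
            for cand in candidates
            if compatible((column, candidate), (other, cand))
        ]
        if scores:                 # a column with no compatible candidate adds nothing
            total += max(scores)
    return total
```

The cost of this calculation grows only with the number of candidate pairs, in contrast to the exhaustive examination of every mutually compatible set.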

After the promotion stage is complete all but the top ranked candidates in each column are discarded, and the compatible score for the remaining candidate in each column is then recalculated on the basis of only the other remaining candidates.

PERFORMANCE OF LM6.

Mates ranked in 1st place: 68.36%

Mates ranked in 1st-3rd: 82.14%

Mates ranked in 1st-10th: 85.71%

(a) Exact match scores were set to be b, with close match scores (CMS) set to be 3. Thus close match scores were given a higher relative weighting than previously used in the comparison of rolled impressions (where the optimum ratio had been 5:1). The higher weighting can be attributed to a higher incidence of topological mutation in the interpretation of latent marks.

(c) The ridge span used in vector comparison was 10 ridges; this means that vectors of a standard length of 40 digits, with 40 associated irregularity indicators, were used whenever vector comparisons were performed. The results were no worse with longer vectors, but the smaller value for SPAN gave faster comparison times on a serial machine.

(d) The minimum angular tolerance (MAT) was 20°. This is almost inconsequential as the true angular misorientation limits were set individually for each latent mark (by subjective judgement) and written as a part of the latent search data.

(e) The candidate minutia selection depth ('DEPTH') was 5 throughout. This means that, for each search minutia, only the top 5 candidate file print minutiae would be considered. This parameter was set to 5 as a result of observation, rather than experiment.

(f) The compatible score cutoff point ('CUTOFF') is the percentage of the latent mark's perfect self-mated score that must be attained by the final compatible score of a candidate file print minutia before it will be allowed to contribute to the final total score. The best value for this parameter was found to be 15%, which is surprisingly high. The effect of this setting was to ensure that the vast majority of file print minutiae that were not true mates for search minutiae contributed nothing to the score; the net effect of this was to make most of the mismatch comparison scores zero. In fact, for 28.6% of the latents used, the true mate was the only file print to score at all, the other 99 file prints all scoring zero. Of course such a stringent setting also made things tough for the mates, as shown by the fact that 7% of the mate scores were zero also. However, these 7% were mates that had not made the top ten places in any of the tests, and were therefore most unlikely to be identified anyway. It is also worth pointing out that on each occasion when one file print alone scored more than zero (i.e. exactly 99 out of the 100 in the file collection scored zero) that one was the true mate. (These are the 28.6% mentioned above.) This represents a surprisingly high level of what might reasonably be termed 'doubt-free identifications'.

COMPUTATION TIMES.

The foregoing description of the algorithm LM5 will have made it quite clear that this is not, in its present form, a particularly fast comparison algorithm but using the principles set forth herein, it can be significantly improved. The CPU time taken on a VAX 11/780 for the above test (5600 comparisons) was 12 hours and 11 minutes. That means an average CPU time per comparison of 7.8 seconds — which is a somewhat disconcerting figure when the acceptable matching speeds for large collections are in the order of 500 comparisons per second.

However, 7.8 seconds per comparison is not quite so alarming when one considers the extensive and multi-layered parallelism of the algorithm. At the lowest level, the vector comparisons themselves are sequences of array operations. At the next level, many vector comparisons are done per print comparison. In the score combination stages, calculations of compatibility and compatible scores are all simple operations repeated many, many times. There is, in this algorithm, enormous scope for beneficial employment of modern parallel processing techniques. It is hardly appropriate to take too much notice of the CPU time in any serial computer where each operation is done element by element.

Moreover, in the area of latent searching, the primary area of concern for law enforcement agencies is shifting from the issue of speed onto the issue of accuracy. It is quite reasonable to obtain the necessary speed through 'hardwiring' (with its associated cost) for the sake of matching algorithms that will actually make a substantial number of identifications from latent marks.
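The timing figures quoted at the start of this section can be checked with a short calculation. The sketch below uses only the numbers given in the text; the variable names are chosen here for illustration.

```python
# Figures quoted in the text for the LM5 test on the VAX 11/780.
total_cpu_seconds = 12 * 3600 + 11 * 60   # 12 hours and 11 minutes
comparisons = 5600                        # comparisons made in the test
target_rate = 500                         # acceptable comparisons per second

seconds_per_comparison = total_cpu_seconds / comparisons
target_seconds = 1 / target_rate
speedup_needed = seconds_per_comparison / target_seconds

print(f"{seconds_per_comparison:.1f} s per comparison")
print(f"speed-up needed to reach the target rate: about {speedup_needed:.0f}x")
```

The result shows that a gain of several thousand times would be needed to reach 500 comparisons per second, which is why the discussion above turns to parallel processing and 'hardwiring'.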

FILE STORAGE SPACE - DEFAULTING THE 'EDGE TOPOLOGY'.

It is noticeable that the need to include all topological irregularities, rather than just the true characteristics, significantly enlarges the volume of the file print data. In the 100 file cards in the experimental database the average number of irregularities recorded per print was 101.35. The majority of irregularities that were not true characteristics fell at the edge of the print; they recorded all those places where ridges 'came into sight' or 'went out of sight'. Thus a significant proportion of the file data storage requirement is spent in describing the edge of the file print.

In practice the edge of the file print is not very important, as the latent mark invariably shows an area completely within the area of the rolled file print. The edge consequently plays little or no part in the print comparison process, and the edge description serves only to help the topological reconstruction process make sense of the ridge pattern.

For the sake of economy in file size, therefore, the algorithm LM6 was prepared by adapting the reconstruction stage of LM5 slightly. It is adapted in such a way that the reconstruction will invent its own edge topology in the absence of an edge description. The default topology selected is not important; it is only important that the algorithm does something to tie up all the loose ridges around the edge.

The file collection was then pruned substantially by elimination of all of the edge descriptions, and this reduced the average number of coordinate sets per print from 101.35 to 71.35.
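The size of this reduction can be checked directly from the two averages just quoted. The following is a minimal calculation using only those figures; the variable names are illustrative.

```python
# Average number of coordinate sets per file print, as quoted in the text.
before_pruning = 101.35
after_pruning = 71.35

removed = before_pruning - after_pruning
fraction_saved = removed / before_pruning

print(f"{removed:.2f} coordinate sets removed per print")
print(f"fraction of file data saved: {fraction_saved:.1%}")
```

The fraction comes out a little under 30%, on the assumption that every coordinate set occupies the same amount of storage.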

The test reported above was then rerun using the algorithm LM6 and the condensed file set. The rankings obtained were exactly the same as before, so a saving of 30% in file data storage was achieved with absolutely no loss of resolution.*

* The pruning operation was not performed on the latent mark data file for two reasons. Firstly, latent mark databases (where these are kept) are tiny in comparison to rolled file print collections, and so storage requirements are not a major concern. Secondly, the edge of a latent mark does play an important part in the comparison process.

OPTIONAL USE OF FIFTH COORDINATE.

Many existing spatial matching algorithms use local ridge direction data as well as X and Y coordinates for each characteristic located. Thus spatial matching algorithms normally use coordinates of the form (X, Y, θ) where θ shows the direction of ridge flow local to each characteristic.

Such data was not used in the algorithm LM6, but could well be incorporated into the topological coordinate data as a fifth coordinate. The use of that fifth coordinate with the matching algorithm could then be:-

c) as a corrective measure for rotational misorientation.

The benefits of including such a fifth coordinate may not justify the 25% increase in storage space that it would necessarily entail.

DERIVATION OF VECTORS FOR ROLLED PRINT COMPARISON.

Nevertheless there is a significant benefit to be gained from the topological reconstruction section of the latent matching algorithm. The data-gathering requirements included the need to track along ridges, in order to find the first event that happened. Although that, in itself, is not a particularly demanding programming task, the ability to reconstruct topologies from coordinates renders it unnecessary. A topological code vector representing a horizontal line passing through the core of a loop can be lifted out of the continuity matrix after reconstruction. The left half of it (i.e. the part that falls to the left of the core) and the right half will be extracted separately. Each half is extracted by selecting the column of the continuity matrix that corresponds with an imaginary line just to the anticlockwise side of horizontal (i.e. just below for the left side, and just above for the right side). Amalgamating these two halves, reversing the 'up' and 'down' pairs from the right half, gives a single long vector of the required format.

(a) the core point, which was to be on a ridge, is replaced by the central observation point which is in a valley. The central observation point will, however, be only fractionally removed from the core in the case of loops and whorls.

(b) the vector has irregularity identifiers rather than ridge-traced distance measures. Consequently the vector comparison algorithm has to be adapted to refer to the appropriate coordinate sets when the time comes to apply the various distance tests.
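The amalgamation step described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each extracted half is a list of ('up', 'down') pairs of irregularity identifiers ordered outward from the central observation point, and that the long vector reads from left to right along the horizontal line. Neither this layout nor the helper name `amalgamate_halves` is taken from the original programs.

```python
from typing import List, Tuple

# Hypothetical ('up', 'down') irregularity identifiers for one ridge crossing.
Pair = Tuple[int, int]


def amalgamate_halves(left: List[Pair], right: List[Pair]) -> List[Pair]:
    """Join two half-vectors into one long vector.

    Assumption: both halves are ordered outward from the central
    observation point, so the left half is reversed to read left to right.
    The 'up' and 'down' entries of the right half are swapped, as the
    text requires.
    """
    left_to_right = list(reversed(left))
    right_swapped = [(down, up) for (up, down) in right]
    return left_to_right + right_swapped


left_half = [(11, 12), (13, 14)]    # nearest the centre first (assumed)
right_half = [(21, 22), (23, 24)]   # nearest the centre first (assumed)
long_vector = amalgamate_halves(left_half, right_half)
print(long_vector)
```

If the halves were stored in a different order, only the reversal of the left half would need to change; the swap of the right-hand pairs is the part stated in the text.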

In an operational system the maximum speed would be obtained by performing topological reconstruction, and vector extraction, at the time each print is introduced to the collection. The extracted 'long' vectors could be stored in a separate file so that they could be used for fast vector comparison without the need to perform topological reconstruction each time. That would obviously increase the data storage requirement per print by the 60 bytes required for such 'long' vectors. The coordinate sets and topological reconstruction would then only be used when a latent search was being conducted.

* The performance of MATCH4 on such derived vectors has not been tested. This is because of the incredibly time consuming nature of manual encoding according to the latent scheme (up to 1 hour per print for clear rolled impressions). The time for such tests will be after the development of automatic data extraction techniques, when large numbers of prints can be encoded automatically according to the latent scheme, and then have derived vectors extracted after topological reconstruction.

IMAGE RETRIEVAL SYSTEM.

There is a significant demand for automated identification systems to be linked with an image-retrieval facility for all the prints in the file collection. The system operator obtains a list of the highest scoring candidates each time an automated search is conducted; these candidates then have to be checked visually by the fingerprint expert to determine which of them, if any, is the true mate. This visual checking can be done much more easily if the fingerprints can be displayed on a screen, rather than having to be fetched from a cupboard. Much research is currently underway with the aim of finding economical methods for storing the two dimensional pictures (fingerprints) in computer memory so that they can be called up and displayed on the terminal screen.

There are two distinct paths for such research. The first aims to record the original grey-scale data which is output from automatic scanners, with no interpretative algorithms ever being applied to the print (although data compaction techniques will, of course, be used). The second uses interpretative algorithms to identify the ridges and valleys within the grey-scale image, to resolve the picture into a binary (black and white) image, and then finally to reduce the thickness of each ridge to one pixel by a variety of ridge-thinning techniques. What is then stored is sufficient data to enable each thinned ridge segment to be redrawn (i.e. start position, end position, curvature etc.). The data requirements per print are in the order of 2,000 to 4,000 bytes for compressed grey-scale images, and between 1,000 and 2,000 bytes for a thinned image.

With the 4-coordinate system used in the latent scheme records, a complete topological and spatial description of the characteristics can be stored in between 300 and 400 bytes. It should therefore be possible to redraw the fingerprint, in the style of a thinned image, from that data. Firstly topological reconstruction has to be performed, and then the elastic (topological) image has to be 'pinned down' at each characteristic, by reference to their polar coordinate positions contained in the coordinate sets.
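The 'pinning down' step relies on converting the polar position of each characteristic into a position on the page. A minimal sketch follows, assuming that θ is held in degrees measured anticlockwise from a reference line along the positive x axis, and that D is in arbitrary length units. The record layout and the name `pin_position` are illustrative and are not taken from the original programs.

```python
import math
from typing import NamedTuple, Tuple


class CoordinateSet(NamedTuple):
    """One (T, theta, R, D) coordinate set; field encodings are assumed."""
    t: int        # type code T
    theta: float  # angular coordinate in degrees (assumed convention)
    r: int        # ridge count R
    d: float      # radial distance D


def pin_position(c: CoordinateSet) -> Tuple[float, float]:
    """Return the (x, y) position of a characteristic relative to the central point."""
    angle = math.radians(c.theta)
    return (c.d * math.cos(angle), c.d * math.sin(angle))


example = CoordinateSet(t=3, theta=90.0, r=5, d=12.0)
x, y = pin_position(example)
print(round(x, 6), round(y, 6))
```

Only θ and D are needed to fix the position; T and R describe the topology and play no part in this conversion.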

The substantial problem in such a process is the business of generating a smooth ridge pattern that accommodates all the pinned points. The problems raised are similar to those in cartography, when a smooth contour map has to be drawn from a finite grid of discrete height (or depth) samplings. One fairly crude reconstruction algorithm was written simply because generation of a picture from topological coordinate sets provides a most satisfying demonstration of the sufficiency of such coordinate descriptions. The algorithm PLOT1 (Appendix C) was written as a Fortran programme: its input was the set of coordinates representing a specified print, and its output was a file of 'QMS-QUIC' instructions for the graphics display facility of a QMS LASERGRAFIX 1200 printer. The algorithm first performed topological reconstruction in the normal manner, and then assigned polar coordinates to every ridge intersection point in such a manner that all the topological irregularities were assigned their own (real) polar coordinates. A series of simple linear smoothing operations was then applied, coupled with untangling and gap-filling procedures that made successive small adjustments to the radial distances of all the intersection points that were not irregularities. These processes continued until a certain standard of smoothness was attained. Finally the picture was output as a collection of straight line segments between connected ridge intersection points.
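The smoothing stage can be illustrated in one dimension. The sketch below is not PLOT1; it is a simplified stand-in that treats a single ridge as a sequence of radial distances, holds the pinned irregularities fixed, and repeatedly replaces every other interior point with the average of its two neighbours until the largest adjustment falls below a tolerance. The function name, the tolerance and the sample values are all assumptions, and the untangling and gap-filling procedures are omitted.

```python
from typing import List, Set


def smooth_radii(radii: List[float], pinned: Set[int],
                 tol: float = 1e-6, max_iter: int = 10_000) -> List[float]:
    """Linearly smooth radial distances along one ridge, keeping pinned points fixed."""
    current = list(radii)
    for _ in range(max_iter):
        updated = list(current)
        largest_change = 0.0
        # End points are never moved; interior points move unless pinned.
        for i in range(1, len(current) - 1):
            if i in pinned:
                continue  # irregularities keep their real polar coordinates
            updated[i] = 0.5 * (current[i - 1] + current[i + 1])
            largest_change = max(largest_change, abs(updated[i] - current[i]))
        current = updated
        if largest_change < tol:
            break  # the chosen standard of smoothness has been reached
    return current


ridge = [10.0, 14.0, 9.0, 13.0, 12.0]   # hypothetical radial distances
result = smooth_radii(ridge, pinned={2})  # index 2 is a pinned irregularity
print(result)
```

In this example the points on either side of the pinned irregularity settle midway between their neighbours, while the irregularity itself does not move.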

A sample reconstructed fingerprint image is shown in figure 13, together with its descriptive data. The picture is made up of 4,404 straight line segments. The topology is correct, and each irregularity is properly located: however the intervening ridge paths have suffered some unfortunate spatial distortions. For the sake of comparison, the original print tracing from which the coordinate sets were derived is shown in figure 14 (it has been reduced from 10x to 5x magnification). Detailed comparison of figures 13 and 14 will reveal a few places where the topology appears to have been altered. In fact it has not been altered, but, at this magnification, some ridges appear to have touched when they should not. This tends to occur where the ridge flow direction is close to radial. In such places the untangling subroutine, which moves ridges apart when they get too close together, has not been forceful enough in separating them.

The facility for reconstruction also affords the opportunity to actually see a 'default edge-topology'. Figures 17 and 18 show two further reconstructed images of the print in figure 14. The upper picture is the same as figure 13, except for a reduction in magnification (to 2.5x). The lower picture is a reconstruction from the condensed data set for the same print, after all the coordinate sets relating to ridges going 'out of sight' have been deleted. All the loose ends have been tied up by the reconstruction algorithm in a fairly arbitrary, but interesting, way. The lower picture does, of course, show some false ridge structure in areas that were 'out of sight'. However the data storage requirement for the corresponding coordinate sets was only 354 bytes for the edge-free description, as opposed to 526 bytes for the original description.
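The storage figures quoted in this section can be set side by side. The short calculation below uses only numbers given in the text; the dictionary keys are labels chosen here for convenience.

```python
# Bytes per print, as quoted in the text (low, high).
storage_ranges = {
    "compressed grey-scale image": (2000, 4000),
    "thinned image": (1000, 2000),
    "4-coordinate topological record": (300, 400),
}

for name, (low, high) in storage_ranges.items():
    print(f"{name}: {low} to {high} bytes")

# The sample print of figure 14: full description versus edge-free description.
full_bytes = 526
edge_free_bytes = 354
saving = (full_bytes - edge_free_bytes) / full_bytes
print(f"edge-free saving for the sample print: {saving:.1%}")
```

For this particular print the edge-free description is roughly a third smaller than the full description.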

A topological approach to fingerprint coding according to the invention offers a great deal in terms of improved accuracy and cost-effectiveness. It is also clear that topology based matching algorithms are greatly improved by utilising some spatial information. The power of resolution between mates and non-mates given by the combination of topological and spatial information is vastly superior to that which can be obtained by use of spatial information alone.

What is claimed is:—

Claims

1. In a fingerprint recognition system, apparatus for extracting topological coordinates from a known fingerprint and storing composite sets of extracted topological coordinates in a machine searchable fingerprint database comprising: means for establishing a scanning line, means for selecting a point on said scanning line, means for scanning said known fingerprint in a predetermined scan pattern by progressive movement of said scanning line over said known fingerprint, including each successive topological characteristic in said predetermined scan pattern and determining the scan location data (M) of each said successive topological characteristic relative to said scanning line and scanning pattern, means for determining the number of ridge lines (R) between said predetermined point on said scanning line and each said successive topological characteristic, means for assigning a predetermined type code (T) to each successive topological characteristic, and means for storing said type code (T), scan location data (M) and the number of ridge lines (R) for each known fingerprint in a machine searchable database.

2. The fingerprint recognition system defined in claim 1 including a central computer coupled to said database, one or more inquiry terminals, means for connecting said one or more inquiry terminals to said central computer, each said inquiry terminal including means for extracting from an unknown latent or rolled fingerprint composite sets of topological coordinates comprising: means for establishing a scanning line on said unknown fingerprint, means for scanning said unknown fingerprint in a predetermined scan pattern by progressive movement of said scanning line over said unknown fingerprint, including each successive topological characteristic in said predetermined scan pattern and determining the scan location data (Mu) of each said successive topological characteristic relative to said scanning line and scanning pattern, means for determining the number of ridge lines (Ru) between said predetermined point on said scanning line and each said successive topological characteristic, means for assigning a predetermined type code (Tu) to each successive topological characteristic, and said central computer having means for comparing said topological coordinate sets (Tu, Mu, Ru) from an unknown fingerprint with the topological coordinate data in said machine searchable database to identify the known fingerprint corresponding to said unknown print.

3. The system defined in claim 2 including file means storing replicable images of all said known prints and means for retrieving a replicable image of identified known print corresponding to said unknown print.

4. The system defined in claim 2 including means for retrieving an image of a known fingerprint corresponding to said unknown fingerprint.

5. The system defined in claim 4 including means for displaying an image of the fingerprint retrieved by said means for retrieving.

6. The fingerprint recognition system defined in claim 1 including, for each topological irregularity on each known fingerprint, means for determining a distance (D) from each successive topological irregularity to the said predetermined point on said scanning line, means for storing the distance (D) together with the coordinate sets (T, M, R) as composite coordinate sets of the form (T, M, R, D) in said machine searchable database.

7. The fingerprint recognition system defined in claim 6 including a central computer coupled to said database, one or more inquiry terminals, means for connecting said one or more inquiry terminals to said central computer, each said inquiry terminal including means for extracting from an unknown latent or rolled fingerprint, composite sets of topological coordinates comprising: means for establishing a scanning line, means for selecting a predetermined point on said scanning line to be the predetermined point, means for scanning said unknown fingerprint in a predetermined scan pattern by progressive systematic movement of said scanning line over said unknown fingerprint, including each successive topological characteristic in said predetermined scan pattern and determining the scan location data (Mu) of each said successive topological characteristic relative to said scanning line and scanning pattern, means for determining the number of ridge lines (Ru) between said predetermined point on said scanning line and each said successive topological characteristic, means for assigning a predetermined type code (Tu) to each successive topological characteristic, means for determining a distance (Du) from each successive topological irregularity to the said predetermined point on said scanning line, and said central computer having means for comparing said composite topological coordinate sets (Tu, Mu, Ru, Du) from an unknown fingerprint with the composite topological coordinate data sets in said machine searchable database to identify the known fingerprint corresponding to said unknown fingerprint.

8. The fingerprint recognition system defined in claim 7 including means for producing a visual image of an identified known fingerprint corresponding to said unknown fingerprint.

9. The system defined in claim 1 wherein said predetermined scan pattern is defined by rotating a line radially about a predetermined central point, that central point also being designated the predetermined point on the scanning line for the purpose of measuring ridge counts (R), and where said scan location data is the angle (θ) measured from a predetermined reference line.

10. The fingerprint recognition system defined in claim 1 wherein said means for scanning said known fingerprint in a predetermined scan pattern includes means for moving a straight vertical "line" of scan horizontally across the fingerprint pattern, a point on that scanning line which is vertically below the entire fingerprint pattern being designated the predetermined point on the line for the purpose of measuring said ridge counts (R), and where said scan location data (M) is the measure of the horizontal movement of said scanning line from a predetermined vertical left hand edge.

11. The fingerprint recognition system defined in claim 1 wherein said means for scanning said known fingerprint in a scan pattern includes means for moving a straight horizontal line vertically over the fingerprint pattern starting from some predetermined bottom edge, a point on that scanning line which is horizontally to the left of the entire fingerprint pattern being designated the predetermined point on the line for the purpose of measuring said ridge counts (R), and where said scan location data (M) is the measure of the vertical movement of said scanning line from a predetermined bottom edge.

12. The fingerprint recognition and retrieval system defined in claim 10, including storage means for storing replicable images of all said known prints and an image retrieval means for retrieving a replicable image from said storage means corresponding to a selected set of said topological coordinates, means for replicating and displaying an image of a print having said selected set of topological coordinates.

13. The fingerprint recognition and retrieval system defined in claim 11, including storage means for storing replicable images of all said known prints and an image retrieval means for retrieving a replicable image from said storage means corresponding to a selected set of said topological coordinates, means for replicating and displaying an image of a print having said selected set of topological coordinates.

14. The fingerprint recognition and retrieval system defined in claim 10, including storage means for storing images of all said known prints, means for entering at least a portion of a set of topological coordinates corresponding to an unknown print, means for searching said machine searchable database to find a matching coordinate sets in said database for said portion of a set of topological coordinates, and means for retrieving and displaying said one of said image of a known print corresponding to the matching coordinate set found by said search.

15. The fingerprint recognition and retrieval system defined in claim 11, including means for storing images of all said known prints, means for entering at least a portion of a set of said topological coordinates of an unknown print, means for searching said machine searchable data base to find a match for said portion of a set of topological coordinates, and means for retrieving and displaying said one of said image of a known print corresponding to the matching coordinate set found by said search.

16. The system defined in claim 2 wherein said predetermined scan pattern, both on the unknown and on the known fingerprints, is defined by rotating a line radially about a predetermined central point, that central point also being designated the predetermined point on the scanning line for the purpose of measuring ridge counts (R), and where said scan location data is the angle (θ) measured from a predetermined reference line.

17. The system defined in claim 6 wherein said predetermined scan pattern on the unknown fingerprints, is defined by rotating a line radially about a predetermined central point, said central point being designated the predetermined point on the scanning line for the purpose of measuring ridge counts (R) and also for the purpose of measuring the distances (D), and where said scan location data is the angle (θ) measured from a predetermined reference line.

18. The system defined in claim 7 wherein said predetermined scan pattern, both on the unknown and on the known fingerprints, is defined by rotating a line radially about a predetermined central point, that central point also being designated the predetermined point on the scanning line for the purpose of measuring ridge counts (R) and also for the purpose of measuring the distances (D), and where said scan location data is the angle (θ) measured from a predetermined reference line.

19. The fingerprint recognition system defined in claim 18 including means for displaying an image of an identified known fingerprint corresponding to said unknown print.

20. In a fingerprint recognition system, a method of extracting topological coordinates from a known fingerprint and storing composite sets of extracted topological coordinates in a machine searchable fingerprint database comprising: placing a scanning line on said known fingerprint, selecting a point on said scanning line to be the predetermined point, scanning said known fingerprint in a predetermined scan pattern or direction by systematic movement of said scanning line over said known fingerprint, including each successive topological characteristic in said predetermined scan pattern, determining the scan location data (M) of each said successive topological characteristic relative to said scanning line and scanning pattern, determining the number of ridge lines (R) between said predetermined point on said scanning line and each said successive topological characteristic, assigning a predetermined type code (T) to each successive topological characteristic, and storing the type code (T), scan location data (M) and the number of ridge lines (R) in a machine searchable database.

21. A method of identifying an unknown fingerprint comprising the method defined in claim 20, including a method for extracting from an unknown fingerprint sets of topological coordinates comprising: placing a scanning line on said unknown fingerprint, selecting a point on said scanning line to be the predetermined point, scanning said unknown fingerprint in a predetermined scan pattern or direction by systematic movement of said scanning line over said unknown fingerprint, including each successive topological characteristic in said predetermined scan pattern and determining the scan location data (M) of each said successive topological characteristic relative to said scanning line and scanning pattern, determining the number of ridge lines (R) between said predetermined point on said scanning line and each said successive topological characteristic, assigning a predetermined type code (T) to each successive topological characteristic, and causing said topological coordinate sets for an unknown print to be compared with the topological coordinate data in said machine searchable database to identify the known print corresponding to said unknown print.

22. The method defined in claim 21 including providing an image file of all fingerprints in said machine searchable database, and retrieving from said image file an image of said known fingerprint corresponding to said unknown fingerprint.

23. The method defined in claim 20 including, for each topological irregularity on each known fingerprint, determining a distance (D) from each successive topological irregularity to the said predetermined point on said scanning line, storing the distance (D) together with the coordinate sets (T, M, R) as composite coordinate sets, of the form (T, M, R, D) in a machine searchable database.

24. In a fingerprint recognition system the method defined in claim 22 including a method of extracting from an unknown latent or rolled fingerprint composite sets of topological coordinates comprising: placing a scanning line on said unknown fingerprint, selecting a point on said scanning line to be the predetermined point, scanning said unknown fingerprint in a predetermined scan pattern or direction by systematic movement of said scanning line over said unknown fingerprint, including each successive topological characteristic in said predetermined scan pattern and determining the scan location data of each said successive topological characteristic relative to said scanning line and scanning pattern, determining the number of ridge lines (R) between said predetermined point on said scanning line and each said successive topological characteristic, assigning a predetermined type code (T) to each successive topological characteristic, determining a distance (D) from each successive topological irregularity to the said predetermined point on said scanning line, and comparing said composite topological coordinate sets from an unknown print with the composite topological coordinate data in said machine searchable database to identify the known print corresponding to said unknown print.

25. In a fingerprint recognition system the method defined in claim 24 including a method of displaying an image of an identified known fingerprint corresponding to said unknown print.

26. The method defined in claim 20 wherein said predetermined scan pattern is defined by rotating a line radially about a predetermined central point, that central point also being designated the predetermined point on the scanning line for the purpose of measuring ridge counts (R), and where said scan location data is the angle (θ) measured from a predetermined reference line.

27. The method defined in claim 21 wherein said predetermined scan pattern, both on the unknown and on the known fingerprint, is defined by rotating a line radially about a predetermined central point, that central point also being designated the predetermined point of the scanning line for the purpose of measuring ridge counts (R), and where said scan location data is the angle (θ) measured from a predetermined reference line.

28. The method defined in claim 27 including retrieving an image of a known print corresponding to said unknown print.

29. The method defined in claim 28 including displaying an image, at a remote site, of the retrieved fingerprint.

30. The method defined in claim 23 wherein said predetermined scan pattern on the unknown fingerprints is defined by rotating a line radially about a predetermined central point, that central point also being designated the predetermined point on the scanning line for the purpose of measuring ridge counts (R) and also for the purpose of measuring the distances (D), and where said scan location data is the angle (θ) measured from a predetermined reference line.

31. The method defined in claim 24 wherein said predetermined scan pattern, both on the unknown and on the known fingerprints, is defined by rotating a line radially about a predetermined central point, that central point also being designated the predetermined point on the scanning line for the purpose of measuring ridge counts (R) and also for the purpose of measuring the distances (D), and where said scan location data is the angle (θ) measured from a predetermined reference line.

32. In a fingerprint recognition system the method defined in claim 31 including retrieving an image of an identified known fingerprint corresponding to said unknown print.

33. In a fingerprint recognition system, the method defined in claim 32 including displaying an image of the retrieved fingerprint.

34. In a fingerprint recognition system, apparatus for extracting topological coordinates from a known fingerprint and storing composite sets of extracted topological coordinates in a machine searchable fingerprint comprising: means for selecting a central point on said known fingerprint, means for successively measuring the angular orientation (θ) relative to a predetermined reference line on said known fingerprint of each successive topological characteristic in a predetermined direction of rotation, means for determining the number of ridge lines (R) between said central point and each said successive topological characteristic, means for assigning a predetermined type code (T) to each successive topological characteristic, and means for storing the type code (T), angular orientation (θ) and number of ridge lines (R) in a machine searchable database.

35. The fingerprint recognition system defined in claim 34 including means for measuring the radial distance (D) to each said successive topological characteristic from said selected central point and including said radial distance (D) for each successive topological characteristic whereby a set of stored coordinates (T, θ, R, D) provides a complete topological and spatial description of said known fingerprint.

36. The fingerprint recognition system defined in claim 34 including a central computer coupled to said database, one or more enquiry terminals connectable to said central computer, each said enquiry terminal having a means for extracting from an unknown latent or rolled fingerprint composite sets of topological coordinates comprising: means for selecting a central point on said unknown fingerprint, means for successively measuring the angular orientation (θ) relative to a predetermined reference line on said unknown fingerprint of each successive topological characteristic in a predetermined direction of rotation, means for determining the number of ridge lines (Ru) between said central point and each said successive topological characteristic, means for assigning a predetermined type code (Tu) to each successive topological characteristic, and said central computer having means for comparing said topological coordinate sets from an unknown print with the topological coordinate data in said machine searchable database to identify the known print corresponding to said unknown print.

37. The fingerprint recognition system defined in claim 36 including means for displaying an image of an identified known fingerprint corresponding to said unknown fingerprint.

38. The fingerprint recognition system defined in claim 34 including means for extracting from said unknown fingerprint the distance (Du) between said central point and each successive topological characteristic, and said central computer having means for comparing said composite topological coordinate sets from an unknown print with the composite topological coordinate data in said machine searchable database to identify the known print corresponding to said unknown print.

39. The fingerprint recognition system defined in claim 38 including means for retrieving and displaying an image of said known print corresponding to said unknown fingerprint.

40. The fingerprint recognition system defined in claim 35 including means for topologically reconstructing an image of the known fingerprint corresponding to any set of topological coordinates stored in the database, and displaying said reconstructed fingerprint at said enquiry terminal.

41. A method of establishing a machine searchable library of fingerprints comprising the steps of: (1) selecting a central point of the fingerprint as a centre of a ridge scan line, (2) relatively moving said ridge scan line to different topological characteristics of said fingerprint for a plurality of ridge lines, (3) assigning a predetermined type code (T) to each said selected topological characteristic located, (4) measuring the location (M) of said ridge scan line from a predetermined first scan line location, (5) counting the number of ridges (R) between said center to the ridge scanned and forming a composite code (T, M, R) for each different topological characteristic, (6) storing each composite code (T, M, R) in a machine searchable database.

42. The method defined in claim 41 in which said ridge scan line is a rotating ridge scan line having said central point as its center of rotation, including measuring the angular orientation (θ) (where M = θ) and radial distance (D) along said ridge scan line to each said topological characteristic and including each respective measurement as part of a composite code (T, M, R, D) stored in said machine searchable database.

43. A method of coding an unknown latent or rolled fingerprint comprising the steps of: (1) locating a central point of the fingerprint as a center of a rotating ridge scan line, (2) relatively rotating said ridge scan line to different topological characteristics of said fingerprint for a plurality of ridge lines, (3) assigning a predetermined type code (T) to each said selected topological characteristics located, (4) measuring the angular orientation (θ) of said ridge scan line from a predetermined first angular orientation, (5) counting the number of ridges (R), between said center to the ridge scanned, and (6) forming a composite code (T, θ, R) for each different topological characteristic on the print.
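
The ordering implied by steps (2) to (6) of claim 43 can be sketched as follows. This is a minimal sketch under stated assumptions: the characteristics have already been located and typed, the ridge count (R) for each has already been measured, positions are given in ordinary Cartesian axes with y increasing upward, and the function name `code_print` and its parameters are hypothetical.

```python
import math


def code_print(center, characteristics, reference_deg=0.0, clockwise=True):
    """Return (T, theta, R) tuples in the order a rotating scan line meets them.

    `characteristics` is a list of (type_code, x, y, ridge_count) tuples; the
    ridge counts are assumed to have been measured beforehand.
    """
    cx, cy = center
    coded = []
    for type_code, x, y, ridge_count in characteristics:
        # Angle of the characteristic, counter-clockwise from the +x axis.
        angle = math.degrees(math.atan2(y - cy, x - cx))
        # Angle swept by the scan line from its first orientation.
        swept = (reference_deg - angle) if clockwise else (angle - reference_deg)
        coded.append((type_code, swept % 360.0, ridge_count))
    # The scan line meets characteristics in increasing swept angle.
    return sorted(coded, key=lambda c: c[1])
```

For a clockwise sweep starting along the +x axis, a characteristic directly below the central point is reached at 90 degrees and one directly above it at 270 degrees.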

44. A method of identifying an unknown latent or rolled fingerprint in a rolled fingerprint database comprised of rolled fingerprints coded in topological coordinate set format wherein the ridge characteristics and other irregularities are coded by their type (T), relative angular orientation (θ) about a selected central observation point and from a predetermined line of reference, and the number of ridges (R) from said central feature, comprising the steps of: (1) topologically coding said unknown fingerprint in the same code format as the said rolled fingerprint, (2) topologically reconstructing the rolled fingerprint and the unknown print from the topological coordinate sets for a given rolled fingerprint stored in said database and the topological coordinate set for said unknown print, (3) comparing vectors extracted from such topological reconstruction to obtain an identity of said unknown fingerprint.

45. In a fingerprint recognition system, the method defined in claim 44 including displaying an image of the retrieval fingerprint.

46. The method as described in claim 44 including a method for extracting from an unknown fingerprint a radial distance (D) for each successive topological characteristic, and forming composite coordinate sets of the type (T, θ, R, D) and comparing said composite topological coordinate sets from an unknown print with the topological coordinate data in said machine searchable database to identify the known print corresponding to said unknown print.

47. The method as described in claim 46 including displaying an image of the retrieved fingerprint.

48. The fingerprint recognition system defined in claim 35 including means for determining the local ridge direction ( λ ) for each said successive topological characteristic and storing said ridge direction ( λ ) with said set of stored coordinates.

49. The method defined in claim 42 including determining the local ridge direction (λ) for each said topological characteristic, and storing said ridge direction (λ) as part of said composite code (T, θ, R, D, λ) stored in said machine searchable database.

50. A method of displaying a replica of a fingerprint, comprising: topologically reconstructing an image of said fingerprint corresponding to a set of topological coordinates of the form (T, M, R, D) stored in a machine searchable database, wherein: (T) is an irregularity type, (M) is the scan location data for the irregularity derived by a moving scan line from a predetermined point, (R) is the number of ridges crossed by said moving scan line from said predetermined point to the irregularity, (D) is the distance from the irregularity to said predetermined point on said scan line, and displaying said reconstructed fingerprint image at an enquiry terminal coupled to said machine searchable database.

51. The method of displaying a replica of a fingerprint as defined in claim 50 wherein said scan location is derived by scanning a straight scan line across the fingerprint from said predetermined point to each irregularity and said scan location data is a function of the movement of said scan line.

52. The method of displaying a replica of a fingerprint as defined in claim 51 wherein said scan location data (M) is the angle ( θ ) rotatively traversed about a selected central observation point on the fingerprint by a radial scan line from said predetermined point to said irregularity.
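
When M is the angle θ, as in claim 52, the pair (θ, D) fixes where each irregularity sits relative to the central observation point. The sketch below covers only that placement step and does not draw ridge lines, so it is not the topological reconstruction itself. The function name `place_irregularities`, the use of degrees, and the counter-clockwise angle convention are assumptions.

```python
import math


def place_irregularities(coords, center=(0.0, 0.0)):
    """Map (T, theta_deg, R, D) sets to (T, x, y) display positions.

    Angles are assumed to be in degrees, counter-clockwise from the +x axis.
    """
    cx, cy = center
    placed = []
    for type_code, theta_deg, _ridge_count, distance in coords:
        theta = math.radians(theta_deg)
        x = cx + distance * math.cos(theta)
        y = cy + distance * math.sin(theta)
        placed.append((type_code, x, y))
    return placed
```

The ridge count (R) is carried in each set but is not needed for placement; a fuller reconstruction would use it to decide which ridge each irregularity belongs to.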

53. The method of displaying a replica of a fingerprint as defined in claim 52 wherein said selected central observation point of said radial scan line is off-set from a central core of said fingerprint.

54. The method of displaying a replica of a fingerprint as defined in claim 51 wherein said scan location data (M) is derived by moving a straight vertical line horizontally across said fingerprint pattern, and is the measure of horizontal movement of said vertical line from a predetermined vertical edge thereof.

55. The method of displaying a replica of a fingerprint as defined in claim 51 wherein said scan location data (M) is derived by moving a straight horizontal line vertically across said fingerprint pattern and is the measure of vertical movement of said line from a horizontal edge thereof.

56. A method of displaying a replica of a fingerprint comprising, storing at least one set of topological coordinates of the form (T, M, R, D), topologically reconstructing an image of said fingerprint from said one set of topological coordinates, and displaying said image.

57. Fingerprint coding apparatus for coding an unknown latent or rolled fingerprint comprising the steps of: means for locating a central point of the fingerprint as a center of rotating ridge scan line, means for relatively rotating said ridge scan line to different topological characteristics of said fingerprint for a plurality of ridge lines, means for assigning a predetermined type code (T) to each said selected topological characteristics located, means for measuring the angular orientation (θ) of said ridge scan line from a predetermined first angular orientation, means for counting the number of ridges (R), between said center to the ridge scanned, and means for forming a composite code (T, θ, R) for each different topological characteristic on the print.

58. Apparatus for identifying an unknown latent or rolled fingerprint in a rolled fingerprint database comprised of rolled fingerprints coded in topological coordinate set format wherein the ridge characteristics and other irregularities are coded by their type (T), relative angular orientation (θ) about a selected central observation point and from a predetermined line of reference, and the number of ridges (R) from said central feature, comprising: (1) means for topologically coding said unknown fingerprint in the same code format as the said rolled fingerprint, (2) means for topologically reconstructing the rolled fingerprint and the unknown print from the topological coordinate sets for a given rolled fingerprint stored in said database and the topological coordinate set for said unknown print, (3) means for comparing vectors extracted from such topological reconstruction to obtain an identity of said unknown fingerprint.

59. Apparatus described in claim 58 including a means for extracting from an unknown fingerprint a radial distance (D) for each successive topological characteristic, and forming composite coordinate sets of the type (T, θ, R, D), and means for comparing said composite topological coordinate sets from an unknown print with the topological coordinate data in said machine searchable database to identify the known print corresponding to said unknown print.

60. A fingerprint recognition and retrieval system comprising: storage means for storing a plurality of sets of topological coordinates for a corresponding plurality of known fingerprints, terminal means for entering a set of topological coordinates for at least a portion of an unknown fingerprint, and means for comparing said set of topological coordinates for at least a portion of an unknown fingerprint with said plurality of sets of topological coordinates corresponding to said plurality of known fingerprints, to locate a match therefor and, means for retrieving a matching known fingerprint.

61. A fingerprint recognition and retrieval system as defined in claim 60 including a display means, and means for reconstructing an image of said matching known fingerprints on said display means.

62. A fingerprint recognition system as defined in claim 60 including means for topologically reconstructing an image of said matching known fingerprint and display means for displaying said image.

63. A method of establishing a machine searchable library of fingerprints comprising the steps of: locating a central reference point and projecting a line in a predetermined direction through the said central point to intersect ridge lines to each side of said centrally located reference point, from the point of crossing of said projected line with each ridge, and in a predetermined order, tracing each side of said ridge crossings from said projected line to the first occurring topological event and assigning a type code (T) to the event, and measuring the distance (D) from the projected line crossing to the event, and recording each type code (T) and distance (D) in said predetermined order as a topological data vector in a machine searchable database.

64. A method of searching an unknown latent or rolled fingerprint against a rolled fingerprint database comprised of rolled fingerprints coded in the topological data vector format, comprising the steps of: locating a central reference point on said unknown fingerprint and projecting a line through the said central point to intersect ridge lines to each side of said centrally located reference point, from the point of crossing of said projected line with each ridge, and in a predetermined order, tracing each side of said ridge crossings from said projected line to the first occurring topological event and assigning a type code (T) to the event, and measuring the distance (D) from the projected line crossing to the event, forming a code vector comprised of said type codes and said distances, and comparing said code vector with such code vectors recorded in the rolled print database in order to identify which known fingerprint in the said file collection corresponds to the said unknown fingerprint.
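
The final comparing step of claim 64 does not fix a particular scoring rule. As a deliberately simple stand-in, the sketch below compares two code vectors position by position and counts entries whose type codes agree and whose distances fall within a tolerance. The function names `vector_score` and `best_match`, the `distance_tolerance` parameter, and the assumption that the two vectors are already aligned entry for entry are all hypothetical.

```python
def vector_score(unknown, known, distance_tolerance=2.0):
    """Count aligned entries with equal type code and similar distance.

    Each vector is a list of (type_code, distance) pairs in the predetermined
    order; entries beyond the shorter vector are ignored.
    """
    score = 0
    for (t_u, d_u), (t_k, d_k) in zip(unknown, known):
        if t_u == t_k and abs(d_u - d_k) <= distance_tolerance:
            score += 1
    return score


def best_match(unknown, database, distance_tolerance=2.0):
    """Return (print_id, score) for the highest-scoring known print, or None."""
    if not database:
        return None
    scored = (
        (print_id, vector_score(unknown, vector, distance_tolerance))
        for print_id, vector in database.items()
    )
    return max(scored, key=lambda item: item[1])
```

A latent print usually covers only part of the pattern, so a practical comparison would also have to try different alignments of the unknown vector against each file vector; that step is omitted here.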

65. Apparatus for establishing a machine searchable library of fingerprints comprising the steps of: means for locating a central reference point and projecting a line in a predetermined direction through the said central point to intersect ridge lines to each side of said centrally located reference point, means for tracing, in a predetermined order, each side of said ridge crossings from said projected line to the first occurring topological event and assigning a type code (T) to the event, measuring means for measuring the distance (D) from the projected line crossing to the event, and recording means for recording each type code (T) and distance (D) in said predetermined order as a topological data vector in a machine searchable database.

66. Apparatus for searching an unknown latent or rolled fingerprint against a rolled fingerprint database comprised of rolled fingerprints coded in the topological data vector format, comprising: means for locating a central reference point on said unknown fingerprint and projecting a line through the said central point to intersect ridge lines to each side of said centrally located reference point, tracing means for tracing, in a predetermined order, each side of said ridge crossings from said projected line to the first occurring topological event and assigning a type code (T) to the event, measuring means for measuring the distance (D) from the projected line crossing to the event, coding means for forming a code vector comprised of said type codes and said distances, and comparator means for comparing said code vector with such code vectors recorded in the rolled print database in order to identify which known fingerprint in the said file collection corresponds to the said unknown fingerprint.

67. A method of establishing a machine searchable library of fingerprints comprising the steps of: scanning the fingerprint to obtain a set of topological coordinates, topologically reconstructing said fingerprint from said set of topological coordinates