WO2022188120A1 - Event-based vision sensor and method of event filtering - Google Patents

Event-based vision sensor and method of event filtering

Info

Publication number
WO2022188120A1
Authority
WO
WIPO (PCT)
Prior art keywords
event
pixel
vision sensor
target pixel
pixels
Prior art date
Application number
PCT/CN2021/080343
Other languages
French (fr)
Inventor
Takao Ishii
Original Assignee
Huawei Technologies Co., Ltd.
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd. filed Critical Huawei Technologies Co., Ltd.
Priority to CN202180095592.8A priority Critical patent/CN117044221A/en
Priority to PCT/CN2021/080343 priority patent/WO2022188120A1/en
Publication of WO2022188120A1 publication Critical patent/WO2022188120A1/en

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N25/00 - Circuitry of solid-state image sensors [SSIS]; Control thereof
    • H04N25/60 - Noise processing, e.g. detecting, correcting, reducing or removing noise
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N25/00 - Circuitry of solid-state image sensors [SSIS]; Control thereof
    • H04N25/70 - SSIS architectures; Circuits associated therewith
    • H04N25/76 - Addressed sensors, e.g. MOS or CMOS sensors
    • H04N25/77 - Pixel circuitry, e.g. memories, A/D converters, pixel amplifiers, shared circuits or shared components

Definitions

  • a disclosed embodiment relates to an event-based vision sensor and a method of event filtering.
  • Event-based vision sensors are expected to be used in a variety of image processing devices, and such sensors are recently attracting attention as components to be used in new camera systems of mobile devices, because these sensors provide an event detection function with low power consumption, low latency, high dynamic range, or the like. Vision sensors of this type are also called Dynamic Vision Sensors (DVSs).
  • this type of sensor can perform a feature point extraction for Simultaneous Localization And Mapping (SLAM) or the like, with low power consumption, low latency, high dynamic range, or the like.
  • This type of sensor allows an indoor navigation system with high traceability to be implemented in a mobile device.
  • This type of sensor is also appropriate for fast detection of a moving object, as such a sensor can reconstruct high-speed movies and high-resolution images, compensate for motion blur, and perform inter-frame image interpolation while exceeding the frame rate limit of a conventional frame-based image sensor.
  • An event-based vision sensor outputs, as an event, a light-strength change resulting from a movement of the object to be shot or the like.
  • The light-strength change is detected based on a change in the photo current through a photodiode. It is known in the art that this change of the photo current tends to be influenced by noise in a pixel circuit. If the noise influence is large, it may be ambiguous whether the event is true or not.
  • a signal generated in the in-pixel processing unit enables or disables an event signal of a small group comprising adjacent pixels (e.g., 4 pixels) to be sent out, based on states of the adjacent pixels. Thereby, it is possible to make the event signal transmission appropriate.
  • the state of each pixel may be represented by one of three states: ON event, OFF event, or NO event.
  • the time-space correlation between events is calculated in an event signal processor (ESP) .
  • the noise event is removed by grouping a plurality of event signals based on their timestamp values, calculating the time-space correlations of respective groups, and determining that an event with a low time-space correlation is noise. For example, it is possible to remove an event that is not on an outline by determining the outline of an object based on the time-space correlation.
  • an event-based vision sensor comprises:
  • a pixel array including an array of pixels, the pixel being configured to detect a light-strength change as an event
  • a detection unit configured to identify a pixel that generates the event
  • a read-out unit configured to refer to an event detection state of a target pixel that generated the event and one or more surrounding pixels, the event detection state indicating whether there is a light-strength change
  • a filtering unit configured to determine whether, based on the state of the surrounding pixel, information that is related to the event is to be outputted from the vision sensor
  • a timestamp unit configured to affix a timestamp corresponding to a generation time of the event, to the event that is determined to be outputted by the filtering unit.
  • whether information that relates to the event is to be outputted from the vision sensor is determined based on the state of the surrounding pixel.
  • the surrounding pixel is a pixel that exists around the target pixel that detected a light-strength change.
  • The state in this context indicates whether the light strength has changed (it may also indicate whether the change is an increase or a decrease), and it does not need to include any timestamp.
  • the read-out unit may include:
  • a row selection unit configured to enable states of pixels that belong to a row of the target pixel and an adjacent row adjacent to said row to be read
  • a column R/O unit configured to enable states of pixels that belong to a column of the target pixel and an adjacent column adjacent to said column to be read.
  • The detection unit for identifying the pixel that detected a light-strength change (i.e., generated the event) may include arbiter-type encoders provided in a row direction and a column direction.
  • the row selection unit and the column R/O unit that enable the pixel state to be read can be configured simply.
  • the read-out unit can read a state indicating whether a light strength is changed, from a surrounding pixel of the target pixel, by such a simple configuration.
  • the pixel array may be scanned row-by-row or column-by-column, sequentially,
  • the read-out unit includes a buffer memory storing states of pixels that belong to a plurality of sequential rows or columns, and
  • the filtering unit refers to the states of the target pixel and the surrounding pixels by accessing the buffer memory, and performs a filtering process.
  • the detection unit for identifying a pixel that generated the event may be implemented by a scan scheme.
  • The filtering unit may determine not to output the information that relates to the event from the vision sensor when a predetermined number of surrounding pixels that surround at least in part the target pixel do not detect a light-strength change.
  • A light-strength change for a moving object may be generated in contiguous pixels. If the light-strength change is not generated in the surrounding pixels surrounding the target pixel in which the light-strength change is generated, the light-strength change of the target pixel may be considered to be noise. In this embodiment, it is possible to remove the noise event simply, based on the fact that a predetermined number of surrounding pixels that surround at least in part the target pixel do not detect a light-strength change.
  • the filtering unit may determine whether the predetermined number of surrounding pixels detect a light-strength change, based on a logical OR of logical levels indicating respective states of the predetermined number of surrounding pixels.
  • the filtering unit may determine not to output the information that relates to the event from the vision sensor when a predetermined number of surrounding pixels that surround at least in part the target pixel detect light-strength changes, with polarity equal to that of the target pixel.
  • If the object to be shot is, for example, an object flickering in a large part of the field of view (FOV), pixels indicating light-strength changes are generated across this large part of the FOV, and many of the resulting events carry redundant information.
  • The filtering unit may determine whether the predetermined number of surrounding pixels detect light-strength changes with polarity equal to that of the target pixel, based on a logical NAND of logical levels indicating the respective states of the predetermined number of surrounding pixels.
  • a state of the target pixel may be maintained at least until the read-out unit refers to the event detection state.
  • Because the state of the target pixel is maintained while the state of the surrounding pixel is read, the simultaneity of the states of the target and surrounding pixels is preserved, thereby stabilizing the event filtering process.
  • the surrounding pixels may include at least 8 pixels surrounding the target pixel circularly.
  • the number of pixels being included as the surrounding pixel is not limited to a specific numerical value.
  • the surrounding pixels are the pixels surrounding the target pixel circularly.
  • An imaging device may include: the vision sensor described above; and
  • a processor for processing an image based on the information that relates to an event being outputted from the vision sensor.
  • A method of event filtering performed in an event-based vision sensor comprises:
  • identifying, by a detection unit, a pixel that generates an event in a pixel array including an array of pixels, each pixel being capable of detecting a light-strength change as an event;
  • referring, by a read-out unit, to an event detection state of a target pixel that generated the event and a surrounding pixel, the event detection state indicating whether a light strength is changed;
  • determining, by a filtering unit, based on the state of the surrounding pixel, whether information that relates to the event is to be outputted from the vision sensor; and
  • affixing, by a timestamp unit, a timestamp corresponding to a generation time of the event, to the event that is determined to be outputted by the filtering unit.
  • Fig. 1 is a block diagram indicating an exemplary imaging device in accordance with an embodiment.
  • Fig. 2 is a block diagram indicating an exemplary event-based vision sensor in accordance with an embodiment.
  • Fig. 3 is a flow chart indicating an exemplary method of event filtering in accordance with an embodiment.
  • Fig. 4 is a diagram for use in describing an operation of an event-based vision sensor which is implemented in an arbiter scheme.
  • Fig. 5 is a diagram indicating an exemplary circuit configuration of a row selection unit 42R and a column R/O unit 42C illustrated in Fig. 4.
  • Fig. 6 is a diagram for use in describing an exemplary noise event filtering in accordance with an embodiment.
  • Fig. 7 is a diagram for use in describing an exemplary redundant event filtering in accordance with an embodiment.
  • Fig. 8 is a diagram indicating an exemplary location relation between a target pixel and surrounding pixels.
  • Fig. 9 is a block diagram indicating an example in which an event-based vision sensor is implemented in a scan scheme.
  • Fig. 10 is a diagram indicating a relation between a state E and an output E_o, in which E is stored row-by-row in a line/frame buffer 93 illustrated in Fig. 9, and E_o is a logical operation result which can be used for the event filtering.
  • Fig. 11 is a diagram indicating elements in a pixel circuit.
  • Fig. 12 is a diagram indicating an exemplary architecture of a mobile terminal.
  • Fig. 1 is a block diagram indicating an exemplary imaging device 10 in accordance with an embodiment.
  • the imaging device 10 is a camera.
  • the camera may be a single camera device, a camera which is attached to any other device, or a camera which is integrated into any other device.
  • the imaging device 10 includes an event-based vision sensor 11, and a processor 12.
  • the event-based vision sensor 11 may be called a Dynamic Vision Sensor (DVS) .
  • the vision sensor 11 may be used for a variety of applications, such as but not limited to a feature point extraction for Simultaneous Localization And Mapping (SLAM) , a fast detection of a moving object, or the like.
  • the event-based vision sensor 11 can output information which indicates a detected light-strength change caused by a moving object to be shot (such information may be called information that relates to an event, or an event signal) .
  • whether an event signal of a pixel which detected a light-strength change is actually outputted from the vision sensor 11 is determined based on a state of a surrounding pixel.
  • The light-strength change is detected based on a change in the current through a photodiode.
  • The light-strength change may be represented by one bit, two bits, or the like, depending on the application in which the vision sensor 11 is used.
  • When it is only indicated whether the light strength has changed, the light-strength change may be represented by one bit (2-state representation).
  • When the polarity of the change (ON event or OFF event) is also indicated, the light-strength change may be represented by 2 bits (3-state representation).
  • The specific number of states and the number of bits to represent a light-strength change may be configured appropriately depending on the application.
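  • As a small illustration of these representations (a minimal sketch; the constant and function names below are assumptions, not taken from the document), the 1-bit and 2-bit encodings could look like this:

```python
# Hypothetical encodings of the pixel event state; names are illustrative only.

# 2-state representation (1 bit): has the light strength changed at all?
NO_EVENT = 0
EVENT = 1

# 3-state representation (2 bits): no change, increase (ON), or decrease (OFF).
STATE_NO_EVENT = 0b00
STATE_ON_EVENT = 0b01   # light strength increased
STATE_OFF_EVENT = 0b10  # light strength decreased

def encode_3state(changed: bool, increased: bool) -> int:
    """Encode a detected change and its polarity into the 2-bit representation."""
    if not changed:
        return STATE_NO_EVENT
    return STATE_ON_EVENT if increased else STATE_OFF_EVENT
```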
  • the processor 12 may perform an image process and other operations on an event signal received from the event-based vision sensor 11.
  • the processor 12 may be a general processor or a dedicated processor.
  • the processor 12 may be a dedicated image signal processor (ISP) .
  • Fig. 2 is a block diagram indicating an exemplary event-based vision sensor 11 in accordance with an embodiment.
  • the event-based vision sensor 11 includes a pixel array 21, a controller 22, a detection R/O unit 23, a timestamp unit 24, and an event signal output unit 25.
  • the pixel array 21 may be formed by an array of pixels.
  • the array may be configured in a form of n-row and m-column matrix, or may be configured to form another shape.
  • Each of the pixels may detect as an event, a change of a strength of light received from an object to be shot.
  • the controller 22 may control a variety of operations in the event-based vision sensor 11.
  • the detection R/O unit 23 may identify a pixel which generates the event.
  • The detection R/O unit 23 may also refer to an event detection state of a target pixel which generated the event and a surrounding pixel (the event detection state indicates whether there is a light-strength change).
  • the state indicating the light-strength change may be represented by, for example, 2-state representation which indicates whether the light-strength is changed, or may be represented by 3-state representation or another representation.
  • The timestamp unit 24 may affix a timestamp corresponding to a generation time of the event, to the event that the controller (when it functions as the filtering unit) determines is to be outputted from the vision sensor 11.
  • the timestamp is included in the event signal which is outputted from the vision sensor 11, when the event signal of the target pixel is actually outputted from the vision sensor 11.
  • the event signal output unit 25 outputs the event signal including the timestamp, to an element external to the sensor, when it is determined that the event signal of the target pixel is to be outputted from the vision sensor 11.
  • the element external to the sensor may be, for example, a processor 12 as shown in Fig. 1.
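  • As a concrete illustration of what such an outputted event signal could contain (a hedged sketch; the record fields and function names are assumptions, since the document does not fix an output format):

```python
from dataclasses import dataclass

@dataclass
class EventRecord:
    """One event signal as it might leave the sensor; field names are assumptions."""
    row: int        # row coordinate (i) of the target pixel
    col: int        # column coordinate (j) of the target pixel
    polarity: int   # +1 for an ON event, -1 for an OFF event
    timestamp: int  # generation time affixed by the timestamp unit, e.g. in microseconds

def emit_event(row: int, col: int, polarity: int, timestamp: int, send) -> None:
    """Build the record and hand it to the element external to the sensor (e.g. the processor 12)."""
    send(EventRecord(row, col, polarity, timestamp))
```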
  • An object to be shot, which causes the light strength at a pixel to change, may move in space, or
  • the strength of light emitted from the object to be shot may be changed by the object itself.
  • the events generated in the pixels of a critical part in an image such as an outline of a moving object or a flickering object, are detected as spatially contiguous events.
  • the state of the surrounding pixel which exists around the target pixel may be a good indicator to determine whether the event is a meaningful event.
  • the controller 22 may function as a filtering unit which determines whether information which relates to the event is to be actually outputted from the vision sensor 11, based on the state of the surrounding pixel.
  • the state of the surrounding pixel may be represented by only a few bits in the 2-state representation, the 3-state representation, or the like. In this regard, this is different from information which is represented by many bits with a time stamp and a location coordinate.
  • the controller 22 may perform the event filtering simply based on the state of the surrounding pixel, which is represented by a few bits. Moreover, the controller 22 may perform the event filtering before actually sending the event signal of the target pixel from the vision sensor 11. Therefore, a bandwidth for transmission from the event-based vision sensor 11 to the processor 12 may be decreased.
  • Fig. 3 is a flow chart indicating an exemplary method of event filtering in accordance with an embodiment.
  • Fig. 4 is a diagram for use in describing an operation of an event-based vision sensor 11. The flow chart starts at step S31 in Fig. 3. It is assumed that in step S32 a pixel detects a change of a strength of light received from an object to be shot, and this pixel is referred to as the "target pixel (P fired)".
  • the pixels are included in a pixel array which is configured in a form of n-row and m-column matrix.
  • the target pixel is indicated at (i, j) in the pixel array.
  • The letter i may be any integer from 0 to (n-1).
  • The letter j may be any integer from 0 to (m-1). Note that in step S32 the address (i, j) has not yet been identified by the controller 22.
  • In step S33, the detection R/O unit 23 (Fig. 2) identifies the pixel that generates the event in the pixel array 21.
  • the step S33 includes step S331 which determines a row address of a target pixel, and step S332 which determines a column address of the target pixel. Operations in the step S331 and the step S332 are described with reference to Fig. 4.
  • the event-based vision sensor 11 is implemented in an arbiter scheme.
  • The arbiter scheme is a scheme that determines the two-dimensional coordinates of a pixel after a light-strength change occurs in the pixel.
  • a scan scheme described later is a scheme in which all pixels are scanned row-by-row (or column-by-column) in a certain cycle.
  • the event-based vision sensor 11 illustrated in Fig. 4 includes the pixel array 21, the controller 22, the timestamp unit 24, the event signal output unit 25, a row arbiter/encoder 41R, a column arbiter/encoder 41C, a row selection unit 42R, and a column R/O unit 42C.
  • The controller 22, the timestamp unit 24, and the event signal output unit 25 have already been described in relation to Fig. 2, and thus duplicated descriptions are not repeated.
  • a row arbiter/encoder 41R, a column arbiter/encoder 41C, a row selection unit 42R, and a column R/O unit 42C may correspond to the detection R/O unit 23 illustrated in Fig. 2.
  • a target pixel P fired which detected a change of a strength of light received from an object to be shot notifies the row arbiter/encoder 41R that the light-strength change is generated, by a row request signal (S32) .
  • the row arbiter/encoder 41R confirms that the pixel which detected the light-strength change belongs to the (i) th row, transmits an acknowledgement signal to pixels belonging to the (i) th row, and enables the pixels in the row.
  • the row arbiter/encoder 41R notifies the controller 22 that address information of the target pixel P fired is the (i) th row (S331) .
  • When the target pixel P fired receives the acknowledgement signal, it notifies the column arbiter/encoder 41C that the light-strength change is generated, by a column request signal. After processing the event data outputted as a column request signal, the column arbiter/encoder 41C confirms that the pixel which detected the light-strength change belongs to the (j) th column, transmits an acknowledgement signal to pixels belonging to the (j) th column, thereby resetting the output of the pixel. After resetting, detection of the next event can begin. The column arbiter/encoder 41C notifies the controller 22 that the (j) th column is the address information of the target pixel P fired (S332).
  • the controller 22 can determine that the location of the target pixel P fired is (i, j) .
  • the timestamp unit 24 may determine a time stamp indicating that the light-strength change occurs in the target pixel P fired , according to an indication from the controller 22.
  • The row arbiter/encoder 41R may receive row request signals indicating light-strength changes from a plurality of pixels. In this case, the row arbiter/encoder 41R may select one of the rows in some way, and then may perform the operations in steps S331 and S332. For convenience of explanation, the column address is determined after the row address is determined, but the order may be reversed or the two may be determined simultaneously. In addition, the operation of step S33, which determines the address information of the target pixel P fired, may be referred to as a "handshake protocol".
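  • As an illustration of this handshake protocol (a simplified behavioural sketch; the function names, the arbitration policy, and the data structure are assumptions, not taken from the document), the address resolution of step S33 could be modelled as follows:

```python
def handshake(pending_requests: set[tuple[int, int]], arbitrate=min) -> tuple[int, int]:
    """Simplified model of the arbiter 'handshake protocol' (step S33).

    pending_requests holds the (row, column) coordinates of pixels that detected a
    light-strength change; arbitrate stands in for whatever selection policy the
    arbiters apply.
    """
    # Row phase (S331): the row arbiter/encoder 41R picks one requesting row,
    # acknowledges it, and reports the row address to the controller.
    i = arbitrate({row for (row, _) in pending_requests})

    # Column phase (S332): the acknowledged pixel raises a column request; the
    # column arbiter/encoder 41C resolves the column address and resets the pixel.
    j = arbitrate({col for (row, col) in pending_requests if row == i})

    pending_requests.discard((i, j))   # the pixel output is reset; the next event can begin
    return i, j                        # address of the target pixel
```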
  • In step S34, the detection R/O unit 23 (Fig. 2) refers to the event detection state of the target pixel P fired, which generated the event, and of a surrounding pixel; the event detection state indicates whether a light strength is changed.
  • the step S34 includes a step S341 which enables the surrounding pixel and a step S342 which reads a state of the surrounding pixel. The operations in the step S341 and the step S342 are described with reference to Fig. 4.
  • the row selection unit 42R has a function that enables states of pixels that belong to a row (i) of the target pixel P fired and an adjacent row (i-1, i+1) adjacent to said row (i) to be read.
  • the column R/O unit 42C has a function that enables states of pixels that belong to a column (j) of the target pixel P fired and an adjacent column (j-1, j+1) to said column (j) to be read.
  • The row selection unit 42R selects and enables the (i) th row.
  • The column R/O unit 42C selects and enables the (j-1) th column and the (j+1) th column, and reads the states of the pixels located at (i, j-1) and (i, j+1).
  • The row selection unit 42R selects and enables the (i-1) th row adjacent to the (i) th row.
  • The column R/O unit 42C selects and enables the (j-1) th column, the (j) th column, and the (j+1) th column, and reads the states of the pixels located at (i-1, j-1), (i-1, j), and (i-1, j+1).
  • The row selection unit 42R selects and enables the (i+1) th row adjacent to the (i) th row.
  • The column R/O unit 42C selects and enables the (j-1) th column, the (j) th column, and the (j+1) th column, and reads the states of the pixels located at (i+1, j-1), (i+1, j), and (i+1, j+1).
  • Although the column R/O unit 42C reads the states of at most 3 pixels row-by-row in this way, it is also possible to read the states from a plurality of rows simultaneously.
  • the illustrated number of bits is merely one example, and thus a different number of bits may be required depending on an implementation.
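  • As an illustration of this row-by-row readout of the 3 x 3 neighbourhood (a hedged sketch; read_pixel and the other names are assumed helpers, not taken from the document), the procedure could be modelled as follows:

```python
def read_surrounding_states(read_pixel, i: int, j: int, n_rows: int, n_cols: int) -> dict:
    """Read the states of the up to 8 pixels surrounding the target pixel (i, j),
    enabling one row at a time as the row selection unit and column R/O unit do.

    read_pixel(r, c) is assumed to return the event detection state of one pixel
    (e.g. 0 = no light-strength change, 1 = changed).
    """
    states = {}
    for r in (i - 1, i, i + 1):              # enable rows i-1, i, i+1 in turn
        if not 0 <= r < n_rows:
            continue                         # skip rows outside the pixel array
        for c in (j - 1, j, j + 1):          # read the three relevant columns
            if not 0 <= c < n_cols or (r, c) == (i, j):
                continue                     # the target pixel itself is not re-read here
            states[(r, c)] = read_pixel(r, c)
    return states
```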
  • Fig. 5 shows an exemplary circuit configuration of the row selection unit 42R and the column R/O unit 42C illustrated in Fig. 4.
  • When the row arbiter/encoder 41R receives a row request signal from a target pixel P fired, it returns a row acknowledgement signal accordingly (S33).
  • For example, if the target pixel P fired belongs to the (i) th row, the row arbiter/encoder 41R returns a row acknowledgement signal (RowAck_i) accordingly.
  • the row selection unit 42R responds to this row acknowledgement signal (RowAck_i) , and indicates which one of three rows (i-1, i, i+1) is selected and enabled according to a control signal.
  • the control signal is transmitted by indicating 3 logical levels by 3 control lines which extend vertically in Fig 5.
  • a row acknowledgement signal (RowAck_i) is received for the (i) th row.
  • a row selection signal (RowAckOut_i-1) which selects the (i-1) th row is outputted because an AND circuit A outputs H.
  • H represents a logical high level
  • L represents a logical low level.
  • When the column arbiter/encoder 41C receives a column request signal from a target pixel P fired, it returns a column acknowledgement signal accordingly (S33).
  • The column R/O unit 42C reads the state of the surrounding pixel, which exists around the target pixel P fired, from the pixels belonging to the row enabled by the row selection unit 42R, and outputs it to the controller 22. For example, if the target pixel P fired belongs to the (j) th column, the column arbiter/encoder 41C returns a column acknowledgement signal (ColAck_j) accordingly.
  • the column acknowledgement signal (ColAck_j) is received by the column R/O unit 42C, and entered to AND circuits corresponding to respective columns.
  • a surrounding pixel (*, j-1) which belongs to the (j-1) th column indicates a light-strength change.
  • the surrounding pixel (*, j-1) transmits a column request signal (ColReqOn (OFF) _j-1) .
  • ColReqOn_j-1 or ColReqOFF_j-1 outputs H.
  • A line B1 which extends horizontally in Fig. 5 is driven to a predetermined electrical potential, and the controller is thereby notified that a light-strength change occurs at the surrounding pixel (*, j-1).
  • “*” represents a row which is selected by the row selection unit 42R.
  • a surrounding pixel (*, j) which belongs to the (j) th column indicates a light-strength change.
  • the surrounding pixel (*, j) transmits a column request signal (ColReqOn (OFF) _j) .
  • ColReqOn_j or ColReqOFF_j outputs H.
  • A line B2 which extends horizontally in Fig. 5 is driven to a predetermined electrical potential, and the controller is thereby notified that a light-strength change occurs at the surrounding pixel (*, j).
  • a surrounding pixel (*, j+1) which belongs to the (j+1) th column indicates a light-strength change.
  • the surrounding pixel (*, j+1) transmits a column request signal (ColReqOn (OFF) _j+1) .
  • ColReqOn_j+1 or ColReqOFF_j+1 outputs H.
  • A line B3 which extends horizontally in Fig. 5 is driven to a predetermined electrical potential, and the controller is thereby notified that a light-strength change occurs at the surrounding pixel (*, j+1).
  • The row selection unit 42R and the column R/O unit 42C that are for reading the state of the surrounding pixel may be implemented by simple logical circuits.
  • the row selection unit 42R may be implemented by providing three AND circuits and one OR circuit having three inputs, for each of the rows.
  • the column R/O unit 42C may be implemented by providing three AND circuits and three transistors, for each of the columns.
  • These specific logical circuit configurations are only some examples, and thus another configuration may be implemented if necessary. For example, if the surrounding pixels cover a range of 5 x 5 (such an example will be described later with reference to Fig. 8) , the row selection unit 42R may be implemented by providing five AND circuits and one OR circuit having 5 inputs, for each of the rows.
  • the column R/O unit 42C may be implemented by providing five AND circuits and five transistors, for each of the columns.
  • The column R/O unit 42C outputs a 3-bit event state representing the states of three pixels, each of which has 2 states. If each pixel instead has 3 states, the states of the 3 pixels may be outputted as a 6-bit event state.
  • the number of states of a pixel and the number of bits may be configured appropriately depending on an application.
  • In step S35 of Fig. 3, the controller 22 determines whether information which relates to the event is to be outputted from the vision sensor 11, based on the state of the surrounding pixel.
  • The step S35 includes step S351, which determines whether the event signal transmission is appropriate. For example, if it is determined in step S351 that a light-strength change detected at a target pixel P fired is noise, the event signal is not to be outputted to the outside of the sensor. In this case, the event signal for the target pixel P fired is not outputted from the vision sensor 11 to the outside (step S352).
  • the manner in which a determination is made as to whether the event signal transmission is appropriate will be described later with reference to Fig. 6 and Fig. 7.
  • In step S351 of Fig. 3, if the controller 22 determines that the event signal transmission is appropriate, the timestamp unit 24 affixes a timestamp corresponding to the generation time of the event, to the event that is determined to be outputted.
  • the event signal output unit 25 (Fig. 2) actually outputs the event signal including the time stamp of the target pixel P fired , to the processor 12 (Fig. 1) (step S353) .
  • the flow chart illustrated in Fig. 3 ends at step S36.
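  • Putting the steps of Fig. 3 together (a hedged end-to-end sketch; every callable below is an assumed helper standing in for the corresponding unit, not a name from the document), the method could be summarised as:

```python
def process_event(detect_address, read_surrounding_states, is_meaningful, now, emit):
    """End-to-end sketch of the flow S31-S36 of Fig. 3."""
    i, j = detect_address()                        # S33: identify the target pixel (handshake)
    neighbours = read_surrounding_states(i, j)     # S34: read the states of surrounding pixels
    if not is_meaningful(neighbours):              # S35/S351: filtering decision
        return None                                # S352: the event signal is not outputted
    timestamp = now()                              # timestamp unit affixes the generation time
    emit(i, j, timestamp)                          # S353: output the event signal with timestamp
    return timestamp
```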
  • A light-strength change (i.e., a generation of an event) for a moving object may occur in contiguous pixels. If no light-strength changes occur in surrounding pixels which surround the target pixel in which a light-strength change occurs, it may be considered that the light-strength change of the target pixel is noise. Particularly, if no light-strength change occurs in surrounding pixels which surround the target pixel circularly, the light-strength change of the target pixel tends to be noise. In the following example, if the surrounding pixels which surround the target pixel circularly do not detect any light-strength change, it is determined that the light-strength change in the target pixel is noise. Thus, with regard to the target pixel, the event signal transmission to the outside of the sensor is not performed.
  • Fig. 6 is a diagram for use in describing an exemplary noise event filtering in accordance with an embodiment.
  • light-strength changes occur in 8 pixels indicated by reference number 61, and in 3 pixels indicated by reference numbers 62, 63, and 64.
  • the letter “p” represents the ON event
  • the letter “n” represents the OFF event
  • “plain” represents a NO event.
  • For each of the 8 pixels indicated by reference number 61, since a light-strength change occurs in at least one of the pixels surrounding it circularly, it is determined that the light-strength change in that pixel is not noise.
  • On the other hand, if a flickering object occupies a large part of the field of view (FOV), pixels indicating light-strength changes occur in this large part of the FOV.
  • Fig. 7 is a diagram for use in describing an exemplary redundant event filtering in accordance with an embodiment.
  • In the example of Fig. 7, many ON events indicated by "p" occur. Suppose that a rule is predetermined such that, if events of the same polarity indicate a closed outline, events of the same polarity may be assumed to be generated also inside the outline.
  • With regard to the pixels indicated by reference number 75, even if their event signals are not outputted, it is possible, based on the rule, to convey to the outside of the sensor that events of the same polarity as the surrounding events have occurred at these pixels.
  • the surrounding pixels are 8 pixels surrounding a target pixel circularly. Embodiments are not limited to such an example, and thus various locations may be considered as locations of the surrounding pixels on which an event filtering is based.
  • Fig. 8 shows multiple examples (81-88) of the location relationship between the target pixel and the surrounding pixels.
  • the surrounding pixels are 8 pixels which cover the target pixel with a 3 x 3 range. In an example represented by 82, the surrounding pixels are 24 pixels which cover the target pixel with a 5 x 5 range. In an example represented by 83, the surrounding pixels are 48 pixels which cover the target pixel with a 7 x 7 range.
  • the surrounding pixels are 4 pixels adjacent to the target pixel vertically and horizontally. In an example represented by 85, the surrounding pixels are 2 pixels adjacent to the target pixel horizontally. In an example represented by 86, the surrounding pixels are 1 pixel adjacent to the target pixel upwardly, and 2 contiguous pixels downwardly. In an example represented by 87, the surrounding pixels are 3 pixels which belong to the first quadrant. In an example represented by 88, the surrounding pixels are 3 pixels which belong to the third quadrant and 1 pixel which belongs to the first quadrant. Examples illustrated in Fig. 8 represent only some of the possible examples, and thus it is possible that pixels which are in a variety of locations (not illustrated) may be set as surrounding pixels. It is preferable to set a 3 x 3 range represented by 81 as surrounding pixels, in terms of setting the surrounding pixels on which a filtering is based, simply and effectively.
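  • As a compact illustration of such choices (a sketch only; the pattern names and dictionary layout are assumptions describing Fig. 8, not taken from the document), the surrounding pixels can be expressed as (row, column) offsets from the target pixel:

```python
# Surrounding-pixel patterns expressed as offsets from the target pixel (0, 0).
SURROUNDING_PATTERNS = {
    "3x3 (81)": [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                 if (dr, dc) != (0, 0)],                        # 8 pixels
    "5x5 (82)": [(dr, dc) for dr in range(-2, 3) for dc in range(-2, 3)
                 if (dr, dc) != (0, 0)],                        # 24 pixels
    "7x7 (83)": [(dr, dc) for dr in range(-3, 4) for dc in range(-3, 4)
                 if (dr, dc) != (0, 0)],                        # 48 pixels
    "cross (84)": [(-1, 0), (1, 0), (0, -1), (0, 1)],           # 4 pixels, vertical/horizontal
    "horizontal (85)": [(0, -1), (0, 1)],                       # 2 pixels
}
```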
  • In the examples described above, the event-based vision sensor 11 is implemented in an arbiter scheme.
  • The event-based vision sensor 11 may instead be implemented in a scan scheme. In the scan scheme, all pixels are scanned row-by-row (or column-by-column) in a certain period.
  • Fig. 9 is a block diagram indicating an example in which an event-based vision sensor is implemented in a scan scheme.
  • the event-based vision sensor 11 illustrated in Fig. 9 includes a pixel array 21, a controller 22, a timestamp unit 24, an event signal output unit 25, a row scanner 91, a column R/O unit 92, and a line/frame buffer 93.
  • The controller 22, the timestamp unit 24, and the event signal output unit 25 have already been described with respect to Fig. 2, Fig. 4, and Fig. 5, and thus duplicated descriptions are not repeated.
  • The row scanner 91 sequentially scans the pixel array 21 row-by-row in a certain cycle, and enables all pixels belonging to the selected row at a time. This scan is performed whether or not a light-strength change occurs at a pixel.
  • the pixel in which the light-strength change occurs transmits a column request signal to the column R/O unit 92. Thereby, the column R/O unit 92 can read the states of all pixels which belong to the enabled row.
  • the line/frame buffer 93 stores the states of a predetermined plural number of rows, among the states of the pixels which are read by the column R/O unit 92. Although the number of rows to store the states may be changed appropriately depending on an application, as one of the examples, it is 3 or more rows. For convenience of explanation, the scan is performed row-by-row. However, the distinction of rows and columns is relative, and thus the scan may be performed column-by-column.
  • the states are read from columns (j-1, j, j+1) which fall within surrounding pixels.
  • the states of the pixels are stored in the line/frame buffer 93.
  • The controller 22 and the line/frame buffer 93 can perform event filtering using the stored pixel states. For example, if one of the pixels indicates a light-strength change, the controller 22 and the line/frame buffer 93 may perform the event filtering by doing the operations of steps S34 and S35 in Fig. 3.
  • Fig. 10 is a diagram indicating a relation between a state E and an output E_o, where E is stored row-by-row in the line/frame buffer 93 illustrated in Fig. 9, and E_o is a logical operation result that can be used for the event filtering. It is assumed that the location of a target pixel which detected a light-strength change is indicated as (i, j). If the pixel is represented by 2 states, it is possible to determine whether the event of the target pixel is noise, based on a logical OR of the logical levels indicating the respective states of the surrounding pixels. For convenience of explanation, the logical level of the output E_o is "0" (False) when the event is noise, and is "1" (True) when the event is not noise, but the logic may be the opposite.
  • E_o(i, j) = E(i, j) · {E(i-1, j-1) + E(i-1, j) + E(i-1, j+1) + E(i, j-1) + E(i, j+1) + E(i+1, j-1) + E(i+1, j) + E(i+1, j+1)}
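  • A minimal sketch of this 2-state noise filter (the array layout and function name are assumptions; E is taken to be an indexable 2-D collection of 0/1 states, e.g. three buffered rows):

```python
def noise_filtered_output(E, i: int, j: int) -> int:
    """E_o(i, j): keep the event at (i, j) only if at least one of the 8
    surrounding pixels also detected a light-strength change."""
    surrounding = [E[i + dr][j + dc]
                   for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                   if (dr, dc) != (0, 0)]
    return int(E[i][j] and any(surrounding))   # 1 = output the event, 0 = treat as noise
```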
  • If a pixel is represented by 3 states including the polarity of a change, it is possible to determine whether an event of a target pixel is a redundant event, based on the following logical operations.
  • E_p(i, j) is "1" (True) when the pixel at (i, j) detects an ON event, and E_n(i, j) is "1" (True) when it detects an OFF event.
  • E_op(i, j) represents an output of the ON event after the event filtering.
  • E_on(i, j) represents an output of the OFF event after the event filtering.
  • E_op(i, j) = E_p(i, j) · NAND{E_p(i-1, j-1), E_p(i-1, j), E_p(i-1, j+1), E_p(i, j-1), E_p(i, j), E_p(i, j+1), E_p(i+1, j-1), E_p(i+1, j), E_p(i+1, j+1)}
  • E_on(i, j) = E_n(i, j) · NAND{E_n(i-1, j-1), E_n(i-1, j), E_n(i-1, j+1), E_n(i, j-1), E_n(i, j), E_n(i, j+1), E_n(i+1, j-1), E_n(i+1, j), E_n(i+1, j+1)}
  • If E_op(i, j) or E_on(i, j) becomes "False", it is possible to determine that the event of the target pixel is a redundant event, and not to output it. If even one of the 8 surrounding pixels is in a different state, E_op(i, j) or E_on(i, j) becomes "True", and thus it is possible to determine that the event of the target pixel is not a redundant event, and to output it.
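  • A minimal sketch of this 3-state redundancy filter (the array layout and function name are assumptions; E_p and E_n are taken to be 2-D collections of 0/1 ON/OFF states):

```python
def redundancy_filtered_outputs(E_p, E_n, i: int, j: int) -> tuple[int, int]:
    """E_op(i, j) and E_on(i, j): suppress an ON (OFF) event at (i, j) when the
    target pixel and all 8 surrounding pixels carry events of the same polarity."""
    window = [(i + dr, j + dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
    nand_on = not all(E_p[r][c] for (r, c) in window)    # NAND over the 3 x 3 ON states
    nand_off = not all(E_n[r][c] for (r, c) in window)   # NAND over the 3 x 3 OFF states
    E_op = int(E_p[i][j] and nand_on)    # ON-event output after filtering
    E_on = int(E_n[i][j] and nand_off)   # OFF-event output after filtering
    return E_op, E_on
```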
  • the state of the target pixel is maintained without any resetting, until finishing the reading of the states of the surrounding pixels.
  • the reason is that, for example, if the target pixel is reset or changed into a different state when reading the state of the surrounding pixel, it may cause noise in the surrounding pixel.
  • Fig. 11 is a diagram illustratively indicating elements in a pixel circuit.
  • The pixel circuit includes a MOS transistor Mlog and a photodiode PD connected in series between a high power supply and a low power supply, a first inverted amplifier (-A1) connected between a gate and a source of the transistor Mlog, a sample-hold circuit connected to an output of the first inverted amplifier (-A1), a threshold circuit connected to an output of the sample-hold circuit, a handshake protocol unit connected to an output of the threshold circuit, and a delay circuit connected between a reset switch of the sample-hold circuit and the handshake protocol unit.
  • the sample-hold circuit includes a second inverted amplifier (-A2) , a first capacitor C1 connected between the first inverted amplifier and the second inverted amplifier, and a second capacitor C2 connected in parallel between an input and an output of the second inverted amplifier (-A2) .
  • The threshold circuit includes a first threshold circuit TH_ON and a second threshold circuit TH_OFF.
  • the photodiode PD, the MOS transistor Mlog, and the first inverted amplifier (-A1) generate a voltage signal Vlog corresponding to a logarithm of a photo current through the photodiode PD.
  • the voltage signal Vlog at this timing is sampled by the sample-hold circuit.
  • The sample-hold circuit boosts the voltage difference (delta V), and provides an output voltage V_diff to the threshold circuits (TH_ON, TH_OFF).
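  • A behavioural sketch of this front end (assumptions only: the gain value, the sign convention of the thresholds, and the function name are illustrative and not taken from the document):

```python
import math

def pixel_front_end(photo_current: float, last_sampled_vlog: float,
                    th_on: float, th_off: float, gain: float = 20.0) -> str:
    """Model of the pixel front end: Vlog follows the logarithm of the photo current,
    the sample-hold stage amplifies the change since the last sampled value, and
    V_diff is compared against the ON/OFF thresholds."""
    v_log = math.log(photo_current)                  # logarithmic photoreceptor output
    v_diff = gain * (v_log - last_sampled_vlog)      # boosted change since the last reset
    if v_diff > th_on:
        return "ON"                                  # light strength increased enough
    if v_diff < -th_off:
        return "OFF"                                 # light strength decreased enough
    return "NONE"                                    # no event detected
```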
  • the handshake protocol unit transmits a request signal (REQ_R or REQ_C) to a row or column arbiter/encoder (41R, 41C in Fig. 4, Fig. 5) , and receives an acknowledgement signal (ACK_R or ACK_C) .
  • ACK_R, ACK_C, and REQ_C correspond to RowAck_*, ColAck_*, and ColReqON(OFF)_* indicated in Fig. 5, respectively.
  • The reset switch of the sample-hold circuit switches from the open state to the closed state, and a voltage to be compared is changed to the current voltage (V_diff) at that timing. It is possible to adjust the timing to switch the reset switch by the delay circuit.
  • a location of a pixel which detected a light-strength change is determined.
  • the procedure of the location determination corresponds to the procedure of determining the target pixel, which was described with respect to the step S33 in Fig. 3.
  • the surrounding pixel states are read in the step S34.
  • A certain time interval may be provided from when the target pixel is determined until the pixel is reset. This may be referred to as "dead time".
  • the delay circuit determines a timing to open/close the reset switch, and a timing of the reset signal HS_RST (which resets a state of a detected event) , according to the dead time.
  • When the reset switch of the sample-hold circuit is in the closed state, the handshake protocol unit maintains the last detection state, and also stops transmitting a new row/column request signal.
  • When the reset switch of the sample-hold circuit is in the open state, the handshake protocol unit resets the detection state which has been maintained so far, and the detection state is changed to a new state. Controlling the reset switch to open/close and keeping the state in the handshake protocol unit can be achieved without modifying the pixel circuit construction. According to an embodiment, by controlling operations taking the dead time into consideration, the target pixel state is maintained and remains available in the filtering process of reading the surrounding pixels following the target pixel.
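  • As an illustration of this state-holding behaviour during the dead time (a behavioural sketch; the class and method names are assumptions, not taken from the document):

```python
class PixelStateLatch:
    """Holds the detected event state while the reset switch is closed (the dead
    time), so the target pixel state stays available while surrounding pixels are read."""

    def __init__(self) -> None:
        self.state = "NONE"
        self.reset_switch_closed = False

    def detect(self, new_state: str) -> None:
        # While the reset switch is closed, the last detection state is maintained
        # and no new row/column request is issued.
        if not self.reset_switch_closed:
            self.state = new_state

    def close_reset_switch(self) -> None:
        self.reset_switch_closed = True      # start of the dead time: hold the state

    def open_reset_switch(self) -> None:
        self.reset_switch_closed = False     # end of the dead time: clear the held state
        self.state = "NONE"
```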
  • According to an embodiment, whether an event signal of a target pixel, which includes a timestamp affixed to the target pixel, is to be outputted from the vision sensor is determined based on a state of a surrounding pixel.
  • the state of the surrounding pixel is simple information indicating whether a light strength is changed (it may include an increase or a decrease) , which does not need to include complicated information, such as a timestamp and a position.
  • Because the event filtering is performed before the event signal of the target pixel is outputted from the vision sensor to the outside, there is no need, when filtering the event, for any complicated calculation and/or operation that requires a timestamp and a position.
  • The event-based vision sensor according to an embodiment can therefore perform event filtering simply, without adding a complicated circuit construction, and without requiring a complicated operation process. This is preferable in terms of device miniaturization, operation processing load reduction, and low power consumption.
  • a mobile terminal may be a device that provides a user with a shooting function and/or data connectivity, a handheld device with a wireless connection function, or another processing device connected to a wireless modem (for example, a digital camera, a single-lens reflex camera, or a smartphone) .
  • The mobile terminal may be another intelligent device with a shooting function and a display function (for example, a wearable device, a tablet computer, a PDA (Personal Digital Assistant), a drone, or an aerial photography device).
  • Fig. 12 is a schematic diagram of an optional hardware structure of a terminal 100 which is an exemplary mobile terminal.
  • The terminal 100 may include components such as a radio frequency unit 110, a memory 120, an input unit 130, a display unit 140, an imaging device 10, an audio circuit 160, a speaker 161, a microphone 162, an earphone jack 163, a processor 170, an external interface 180, and a power supply 190.
  • the radio frequency (RF) unit 110 may be configured to send and receive information or send and receive a signal in a call process.
  • an RF unit includes but is not limited to an antenna, an amplifier, a transceiver, a coupler, a low noise amplifier (LNA) , a duplexer, and the like.
  • the radio frequency unit 110 may communicate with a network device and another device through wireless communication. Any communications standard or protocol may be used for the wireless communication.
  • the memory 120 may be configured to store an instruction and data.
  • the memory 120 may mainly include an instruction storage area and a data storage area.
  • the instruction storage area may store software such as an operating system, an application, and an instruction.
  • The data storage area may store an image which is obtained by the imaging device 10, audio data which is inputted or outputted by the audio circuit 160, an image which is displayed by the display unit 140, data which is used for an operation process which is performed by the processor 170, and other various transient or permanent data.
  • the input unit 130 may be configured to receive input digit or character information in the mobile terminal 100.
  • the input unit 130 may include a touchscreen 131 and other input devices 132.
  • the touchscreen 131 may collect a touch operation of the user on or near the touchscreen, and drive a corresponding connection apparatus according to a preset program.
  • the touchscreen 131 may detect a touch action of the user on the touchscreen, convert the touch action into a touch signal, send the touch signal to the processor 170, and receive and execute a command sent by the processor 170.
  • Another input device 132 may include but is not limited to one or more of a physical keyboard, a function key (such as a volume control key or a power on/off key) , a trackball, a mouse, a joystick, and the like.
  • the display unit 140 may be configured to display information input by the user, information provided for the user, various menus of the terminal 100, or the like.
  • the display unit 140 is configured to display an image obtained by using the imaging device 10, where the image may include a preview image in some shooting modes, an image that is captured, an image that is processed by using a specific algorithm after shooting, or the like.
  • the imaging device 10 is configured to collect a still image or moving images and may be enabled through triggering by an application program instruction, to implement a shooting function or a video camera function.
  • the imaging device 10 may include components such as an imaging lens, a light filter, and an image sensor.
  • the imaging device 10 in the embodiment includes the event-based vision sensor which is described with reference to Fig. 1.
  • Light emitted or reflected by an object to be shot enters the imaging lens and is aggregated on the image sensor by passing through the light filter.
  • the imaging lens is mainly configured to aggregate light emitted or reflected by an object to be shot, in a shooting field of view, and perform imaging.
  • the light filter is mainly configured to filter out an extra light wave (for example, a light wave other than visible light, such as infrared light) from light.
  • the image sensor is mainly configured to perform optical-to-electrical conversion on a received optical signal, convert the optical signal and the light-strength change into an electrical signal, and input the electrical signal to the processor 170 for subsequent processing.
  • the audio circuit 160, the speaker 161, the microphone 162, and an earphone jack 163 may provide an audio interface between the user and the mobile terminal 100.
  • the audio circuit 160 may transmit, to the speaker 161, an electrical signal converted from received audio data, and the speaker 161 converts the electrical signal into a sound signal for output.
  • the microphone 162 is configured to collect a sound signal, and may convert the collected sound signal into an electrical signal.
  • the audio circuit may also include an earphone jack 163, configured to provide a connection interface between the audio circuit and an earphone.
  • the processor 170 is a control center of the mobile terminal 100, and is connected to various parts of the mobile terminal 100 through various interfaces and signal lines.
  • the processor 170 performs various functions of the mobile terminal 100, executes the instruction stored in the memory 120, and invokes the data stored in the memory 120, thereby processing the data.
  • the processor and the memory may be implemented on a single chip. In some embodiments, the processor and the memory may be separately implemented on independent chips.
  • the mobile terminal 100 further includes the external interface 180.
  • the external interface 180 may be a standard micro-USB interface or a multi-pin connector.
  • the external interface may be configured to connect the terminal 100 to another apparatus for communication, or may be configured to connect to a charger to charge the terminal 100.
  • the mobile terminal 100 further includes the power supply 190 (such as a battery) that supplies power to each component.
  • the power supply may be logically connected to the processor 170, so as to implement functions such as a charging function, a discharging function, and power consumption management by using the power supply management system.
  • FIG. 12 is merely an example of the mobile terminal, and does not constitute any limitation on the embodiments.
  • the mobile terminal may include more or fewer components than those shown in the figure, or combine some components, or have different components.
  • each of the foregoing elements may be a separate processing element, or may be integrated on a chip of a mobile terminal.
  • The described processing elements may be implemented as program code stored in a storage element of a controller, and the controller may invoke and execute the program code to perform the various functions, if necessary.
  • the processing elements may be integrated or may be implemented independently.
  • the processing element may be an integrated circuit chip and has a signal processing capability. In an implementation process, steps in the foregoing methods or the foregoing elements can be implemented by using a hardware integrated logical circuit in the processing element, or by using instructions in a form of software.
  • the embodiments of the present invention may be provided as a method, a device, a storage medium, or a computer program. Therefore, the present invention may use a form of hardware only embodiments, or embodiments with a combination of software and hardware.
  • These computer program instructions may be stored in an appropriate storage medium, or may be transmitted on some transmission medium. These computer program instructions may be loaded onto a computer or another programmable data processing device, so that a series of operations are performed on the computer, thereby generating the functions described with reference to the embodiments.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Studio Devices (AREA)

Abstract

Event-based vision sensor (11) and method of event filtering. An event filtering is performed more simply in an event-based vision sensor. The vision sensor (11) comprises a pixel array (21) including an array of pixels, the pixel being configured to detect a light-strength change as an event; a detection R/O unit (23) configured to identify a pixel that generates the event; a read-out unit configured to refer to an event detection state of a target pixel that generated the event and one or more surrounding pixels, the event detection state indicating whether there is a light-strength change; a filtering unit configured to determine whether, based on the state of the surrounding pixel, information that is related to the event is to be outputted from the vision sensor; and a timestamp unit (24) configured to affix a timestamp corresponding to a generation time of the event, to the event that is determined to be outputted by the filtering unit.

Description

EVENT-BASED VISION SENSOR AND METHOD OF EVENT FILTERING
TECHNICAL FIELD
A disclosed embodiment relates to an event-based vision sensor and a method of event filtering.
BACKGROUND ART
Event-based vision sensors are expected to be used in a variety of image processing devices, and such sensors are recently attracting attention as components to be used in new camera systems of mobile devices, because these sensors provide an event detection function with low power consumption, low latency, high dynamic range, or the like. Vision sensors of this type are also called Dynamic Vision Sensors (DVSs).
As an example, this type of sensor can perform a feature point extraction for Simultaneous Localization And Mapping (SLAM) or the like, with low power consumption, low latency, high dynamic range, or the like. This type of sensor, for example, allows an indoor navigation system with high traceability to be implemented in a mobile device. This type of sensor is also appropriate for fast detection of a moving object, as such a sensor can reconstruct high-speed movies and high-resolution images, compensate for motion blur, and perform inter-frame image interpolation while exceeding the frame rate limit of a conventional frame-based image sensor.
An event-based vision sensor outputs, as an event, a light-strength change resulting from a movement of the object to be shot or the like. In this regard, it is different from a usual optical sensor that outputs a signal corresponding to light strength for each pixel. The light-strength change is detected based on a change in photo current through a photodiode. It is known in the art that this change of the photo current tends to be influenced by noise in a pixel circuit. If the noise influence is large, it may be ambiguous whether the event is true or not. On the other hand, if a flickering object occupies a large part of a field of view (FOV), the light strength changes in this large part of the FOV, and thus outputting many event signals rapidly becomes a necessity. Because of these phenomena, the data transmission bandwidth and the operation process load in an application processor may become quite large.
Some technologies attempt to remove a noise event and a redundant event by using a dedicated in-pixel processing unit provided in a pixel circuit. A signal generated in the in-pixel processing unit enables or disables an event signal of a small group comprising adjacent pixels (e.g., 4 pixels) to be sent out, based on states of the adjacent pixels. Thereby, it is possible to make the event signal transmission appropriate. The state of each pixel may be represented by one of three states: ON event, OFF event, or NO event.
However, in this technology, it is necessary to divide the pixel array into small groups each comprising a predetermined number of pixels, and to implement the additional circuit, which is dedicated to the small group, in the pixel circuit. Also, such a circuit is complicated. These are not preferable in terms of device miniaturization and reduction of operation process load.
Some technologies attempt to prevent a noise event from being outputted from a sensor, based on time-space correlations between events. The time-space correlation between events is calculated in an event signal processor (ESP). In this technology, the noise event is removed by grouping a plurality of event signals based on their timestamp values, calculating the time-space correlations of the respective groups, and determining that an event with a low time-space correlation is noise. For example, it is possible to remove an event that is not on an outline by determining the outline of an object based on the time-space correlation.
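As an illustration of this prior-art idea only (not the method of the disclosed embodiments; every threshold and name below is an assumption), such timestamp-based time-space correlation filtering could look like the following sketch:

```python
def filter_by_time_space_correlation(events, window_us=1000, radius=1, min_neighbours=2):
    """Keep an event only if enough other events occur nearby in both time and space.
    Each event is assumed to be a (timestamp_us, x, y) tuple."""
    kept = []
    for (t, x, y) in events:
        correlated = sum(
            1 for (t2, x2, y2) in events
            if (t2, x2, y2) != (t, x, y)
            and abs(t2 - t) <= window_us          # close in time
            and abs(x2 - x) <= radius             # close in space
            and abs(y2 - y) <= radius
        )
        if correlated >= min_neighbours:          # low correlation -> treated as noise
            kept.append((t, x, y))
    return kept
```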
However, when calculating such a time-space correlation, a timestamp, the location data of a pixel, and information such as the state are required for each pixel. Several frame memories are also required for mapping timestamps over an image area. Such complicated and numerous operation processes are not preferable in terms of low power consumption and device miniaturization.
SUMMARY OF THE INVENTION
PROBLEM TO BE SOLVED BY THE INVENTION
It is a general object of an aspect of an embodiment to perform event filtering more simply in an event-based vision sensor.
It is a more specific object of an aspect of an embodiment to perform event filtering more simply, without adding any complicated circuit construction, in an event-based vision sensor.
It is another more specific object of an aspect of an embodiment to perform event filtering more simply, without requiring a complicated operation process, in an event-based vision sensor.
MEANS TO SOLVE THE PROBLEM
In accordance with a first aspect of an embodiment, an event-based vision sensor is provided. The vision sensor comprises:
a pixel array including an array of pixels, each pixel being configured to detect a light-strength change as an event;
a detection unit configured to identify a pixel that generates the event;
a read-out unit configured to refer to an event detection state of a target pixel that generated the event and one or more surrounding pixels, the event detection state indicating whether there is a light-strength change;
a filtering unit configured to determine whether, based on the state of the surrounding pixel, information that is related to the event is to be outputted from the vision sensor; and
a timestamp unit configured to affix a timestamp corresponding to a generation time of the event, to the event that is determined to be outputted by the filtering unit.
In this embodiment, whether information that relates to the event is to be outputted from the vision sensor is determined based on the state of the surrounding pixel. The surrounding pixel is a pixel that exists around the target pixel that detected a light-strength change. The state in this context indicates whether the light strength has changed (possibly including whether it increased or decreased), and does not need to include any timestamp. In accordance with an aspect of an embodiment, it is possible to perform event filtering more simply in an event-based vision sensor.
In accordance with an aspect of an embodiment, the read-out unit may include:
a row selection unit configured to enable states of pixels that belong to a row of the target pixel and an adjacent row adjacent to said row to be read; and
a column R/O unit configured to enable states of pixels that belong to a column of the target pixel and an adjacent column adjacent to said column to be read.
In this embodiment, the detection unit for identifying a pixel which detected a light-strength change, i.e., generated the event, may include arbiter-type encoders which are provided in a row direction and a column direction. The row selection unit and the column R/O unit that enable the pixel state to be read can be configured simply. With such a simple configuration, the read-out unit can read, from a surrounding pixel of the target pixel, a state indicating whether the light strength has changed.
In accordance with an aspect of an embodiment, the pixel array may be scanned row-by-row or column-by-column, sequentially,
wherein the read-out unit includes a buffer memory storing states of pixels that belong to a plurality of sequential rows or columns, and
wherein the filtering unit refers to the states of the target pixel and the surrounding pixels by accessing the buffer memory, and performs a filtering process.
In this embodiment, the detection unit for identifying a pixel that generated the event may be implemented by a scan scheme.
In accordance with an aspect of an embodiment, the filtering unit may determine not to output the information that relates to the event from the vision sensor when a predetermined number of surrounding pixels that surround at least in part the target pixel do not detect a light-strength change.
Generally, a light-strength change for a moving object (i.e., an event generation) may be generated in contiguous pixels. If the light-strength change is not generated in the surrounding pixels surrounding the target pixel in which the light-strength change is generated, the light-strength change of the target pixel may be considered to be noise. In this embodiment, it is possible to remove the noise event simply, based on the fact that a predetermined number of surrounding pixels that surround at least in part the target pixel do not detect a light-strength change.
In accordance with an aspect of an embodiment, the filtering unit may determine whether the predetermined number of surrounding pixels detect a light-strength change, based on a logical OR of  logical levels indicating respective states of the predetermined number of surrounding pixels.
In this embodiment, it is possible to determine whether the predetermined number of surrounding pixels detect a light-strength change, based on a simple logical operation of logical levels indicating respective states of the surrounding pixels.
In accordance with an aspect of an embodiment, the filtering unit may determine not to output the information that relates to the event from the vision sensor when a predetermined number of surrounding pixels that surround at least in part the target pixel detect light-strength changes, with polarity equal to that of the target pixel.
If an object to be shot is, for example, an object which is flickering in a large part of a field of view (FOV), pixels indicating light-strength changes are generated in this large part of the FOV. However, in terms of tracking a movement of such an object, it is not necessary for all pixels which detected the light-strength change to output their information relating to the events. For example, suppose a rule is predetermined that, if events of the same polarity indicate a closed outline, events of the same polarity may be assumed to be generated also inside the outline. In this case, it is possible to communicate the situation that events are generated in the whole area surrounded by the outline, even if information relating to the events of the pixels inside the object, which indicate the light-strength change, is omitted. In this embodiment, it is possible to remove a redundant event based on whether a predetermined number of surrounding pixels that surround at least in part the target pixel detect light-strength changes with the same polarity as the target pixel.
In accordance with an aspect of an embodiment, the filtering unit may determine whether the predetermined number of surrounding pixels detect light-strength changes, with polarity equal to that of the target pixel, based on
(a) a logical product of logical levels indicating respective states of the predetermined number of surrounding pixels that detected a light strength increase, or
(b) a logical product of logical levels indicating respective states of the predetermined number of surrounding pixels that detected a light strength decrease.
In this embodiment, it is possible to determine whether the predetermined number of surrounding pixels detect light-strength changes, with polarity equal to that of the target pixel, based on a simple logical operation of logical levels indicating respective states of the surrounding pixels.
In accordance with an aspect of an embodiment, after the detection unit identifies a pixel that generates the event, a state of the target pixel may be maintained at least until the read-out unit refers to the event detection state.
In this embodiment, because the state of the target pixel is maintained while the state of the surrounding pixel is read, the simultaneity of the states of the target and surrounding pixels is ensured, thereby stabilizing the event filtering process.
In accordance with an aspect of an embodiment, the surrounding pixels may include at least 8 pixels surrounding the target pixel circularly.
With regard to this embodiment, in terms of performing event filtering more simply, the number of pixels being included as the surrounding pixel is not limited to a specific numerical value. In terms of performing event filtering more simply and effectively, as one of the examples, it is preferable that the surrounding pixels are the pixels surrounding the target pixel circularly.
In accordance with an aspect of an embodiment, an imaging device is provided, the imaging device may include:
the vision sensor as claimed in claim 1; and
a processor for processing an image based on the information that relates to an event being outputted from the vision sensor.
In accordance with a second aspect of an embodiment, a method of event filtering to be performed in a vision sensor being an event-based vision sensor is provided. The method comprises:
identifying, by a detection unit, a pixel that generates the event in a pixel array including an array of pixels, the pixel being capable of detecting a light-strength change as an event;
referring to, by a read-out unit, an event detection state of a target pixel that generated the event and a surrounding pixel, the event detection state indicating whether a light strength is changed;
determining, by a filtering unit, whether information that relates to the event is to be  outputted from the vision sensor, based on the state of the surrounding pixel; and
affixing, by a timestamp unit, a timestamp corresponding to a generation time of the event, to the event that is determined to be outputted by the filtering unit.
BRIEF DESCRIPTION OF THE DRAWINGS
Fig. 1 is a block diagram indicating an exemplary imaging device in accordance with an embodiment.
Fig. 2 is a block diagram indicating an exemplary event-based vision sensor in accordance with an embodiment.
Fig. 3 is a flow chart indicating an exemplary method of event filtering in accordance with an embodiment.
Fig. 4 is a diagram for use in describing an operation of an event-based vision sensor which is implemented in an arbiter scheme.
Fig. 5 is a diagram indicating an exemplary circuit configuration of a row selection unit 42R and a column R/O unit 42C illustrated in Fig. 4.
Fig. 6 is a diagram for use in describing an exemplary noise event filtering in accordance with an embodiment.
Fig. 7 is a diagram for use in describing an exemplary redundant event filtering in accordance with an embodiment.
Fig. 8 is a diagram indicating an exemplary location relation between a target pixel and surrounding pixels.
Fig. 9 is a block diagram indicating an example in which an event-based vision sensor is implemented in a scan scheme.
Fig. 10 is a diagram indicating a relation between a state E and an output E_o, where E is stored row-by-row in a line/frame buffer 93 illustrated in Fig. 9, and E_o is a logical operation result which can be used for the event filtering.
Fig. 11 is a diagram indicating elements in a pixel circuit.
Fig. 12 is a diagram indicating an exemplary architecture of a mobile terminal.
MODE FOR CARRYING OUT THE INVENTION
Hereinafter, embodiments are described with reference to attached drawings. Through the drawings, the same reference number or reference symbol is assigned to a similar element. In a block diagram, only elements which particularly relate to the embodiment are described, and thus other elements may actually exist. A partition of respective elements in a block diagram is merely indicated for convenience of explanation. Respective elements may be physically separated as shown in the partition, or may be implemented according to a partition different from the indicated partition under the condition that functions described here can be provided.
Fig. 1 is a block diagram indicating an exemplary imaging device 10 in accordance with an embodiment. For example, the imaging device 10 is a camera. The camera may be a single camera device, a camera which is attached to any other device, or a  camera which is integrated into any other device. The imaging device 10 includes an event-based vision sensor 11, and a processor 12.
The event-based vision sensor 11 may be called a Dynamic Vision Sensor (DVS) . The vision sensor 11 may be used for a variety of applications, such as but not limited to a feature point extraction for Simultaneous Localization And Mapping (SLAM) , a fast detection of a moving object, or the like.
The event-based vision sensor 11 can output information which indicates a detected light-strength change caused by a moving object to be shot (such information may be called information that relates to an event, or an event signal). In the embodiment described hereinafter, whether an event signal of a pixel which detected a light-strength change is actually outputted from the vision sensor 11 is determined based on a state of a surrounding pixel. The light-strength change is detected based on a change in a current through a photodiode. The light-strength change may be represented by one bit, two bits, or the like, depending on the application for which the vision sensor 11 is used. For example, when it is only indicated whether the light strength has changed, the light-strength change may be represented by one bit (2-state representation). When it is indicated that the light strength has increased (ON event), decreased (OFF event), or not changed (NO event), the light-strength change may be represented by 2 bits (3-state representation). The specific number of states and the number of bits to represent a light-strength change may be configured appropriately depending on an application.
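As an illustration only, the following sketch shows one possible way such states could be encoded in software; the specific bit assignments and names are assumptions made for explanation and are not mandated by the embodiment.

```python
# Illustrative sketch only: hypothetical bit encodings for the pixel event state.
# The concrete bit assignments and names are assumptions, not part of the embodiment.

# 2-state representation (1 bit per pixel): whether the light strength changed.
NO_CHANGE = 0b0   # no light-strength change
CHANGE    = 0b1   # light-strength change detected

# 3-state representation (2 bits per pixel): NO event, ON event, or OFF event.
NO_EVENT  = 0b00  # light strength not changed
ON_EVENT  = 0b01  # light strength increased
OFF_EVENT = 0b10  # light strength decreased

def describe(state):
    """Map a 2-bit pixel state to a readable label."""
    return {NO_EVENT: "NO event", ON_EVENT: "ON event", OFF_EVENT: "OFF event"}[state]

print(describe(ON_EVENT))  # "ON event"
```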
The processor 12 may perform an image process and other operations on an event signal received from the event-based vision sensor 11. As one of the examples, the processor 12 may be a general processor or a dedicated processor. For example, the processor 12 may be a dedicated image signal processor (ISP) .
Fig. 2 is a block diagram indicating an exemplary event-based vision sensor 11 in accordance with an embodiment. The event-based vision sensor 11 includes a pixel array 21, a controller 22, a detection R/O unit 23, a timestamp unit 24, and an event signal output unit 25.
The pixel array 21 may be formed by an array of pixels. The array may be configured in a form of n-row and m-column matrix, or may be configured to form another shape. Each of the pixels may detect as an event, a change of a strength of light received from an object to be shot.
The controller 22 may control a variety of operations in the event-based vision sensor 11.
The detection R/O unit 23 may identify a pixel which generates the event. The detection R/O unit 23 may also refer to an event detection state of a target pixel which generated the event and a surrounding pixel (the event detection state indicates whether there is a light-strength change). The state indicating the light-strength change may be represented by, for example, the 2-state representation which indicates whether the light strength has changed, by the 3-state representation, or by another representation.
The timestamp unit 24 may affix a timestamp corresponding to a generation time of the event, to the event which is determined to be output from the vision sensor 11 by the controller 22 (that is, by the controller when it functions as the filtering unit). The timestamp is included in the event signal which is outputted from the vision sensor 11, when the event signal of the target pixel is actually outputted from the vision sensor 11.
The event signal output unit 25 outputs the event signal including the timestamp, to an element external to the sensor, when it is determined that the event signal of the target pixel is to be outputted from the vision sensor 11. The element external to the sensor may be, for example, a processor 12 as shown in Fig. 1.
In addition to the possibility that an object to be shot, which causes light-strength in a pixel to change, may move in a space, there is a possibility that the strength of light emitted from the object to be shot may be changed by the object itself. It is expected that the events generated in the pixels of a critical part in an image, such as an outline of a moving object or a flickering object, are detected as spatially contiguous events. Thus, the state of the surrounding pixel which exists around the target pixel may be a good indicator to determine whether the event is a meaningful event. The controller 22 may function as a filtering unit which determines whether information which relates to the event is to be actually outputted from the vision sensor 11, based on the state of the surrounding pixel.
The state of the surrounding pixel may be represented by only a few bits in the 2-state representation, the 3-state representation, or the like. In this regard, this is different from information which is represented by many bits with a time stamp and a location coordinate. The controller 22 may perform the event filtering simply based on the state of the surrounding pixel, which is represented by a few bits. Moreover, the controller 22 may perform the event filtering before actually sending the event signal of the target pixel from the vision sensor 11. Therefore, a bandwidth for transmission from the event-based vision sensor 11 to the processor 12 may be decreased.
Next, an operation of the event-based vision sensor 11 is described with reference to Fig. 3 and Fig. 4. Fig. 3 is a flow chart indicating an exemplary method of event filtering in accordance with an embodiment. Fig. 4 is a diagram for use in describing an operation of an event-based vision sensor 11. The flow chart starts at step S31 in Fig. 3. It is assumed that in step S32 a pixel detects a change of a strength of light received from an object to be shot, and this pixel is referred to as a “target pixel (P fired)”. The pixels are included in a pixel array which is configured in a form of an n-row and m-column matrix. In Fig. 4, the target pixel is indicated at (i, j) in the pixel array. The letter i may be represented by an integer equal to or more than 0 and equal to or less than (n-1). The letter j may be represented by an integer equal to or more than 0 and equal to or less than (m-1). Note that in step S32 the address (i, j) has not been identified yet in the controller 22.
In step S33, the detection R/O unit 23 (Fig. 2) identifies a pixel that generates the event in the pixel array 21. The step S33 includes step S331, which determines a row address of the target pixel, and step S332, which determines a column address of the target pixel. Operations in step S331 and step S332 are described with reference to Fig. 4. In the example illustrated in Fig. 4, the event-based vision sensor 11 is implemented in an arbiter scheme. The arbiter scheme is a scheme which determines the two-dimensional coordinates of a pixel after a light-strength change occurs in the pixel. In contrast, a scan scheme described later is a scheme in which all pixels are scanned row-by-row (or column-by-column) in a certain cycle.
The event-based vision sensor 11 illustrated in Fig. 4 includes the pixel array 21, the controller 22, the timestamp unit 24, the event signal output unit 25, a row arbiter/encoder 41R, a column arbiter/encoder 41C, a row selection unit 42R, and a column R/O unit 42C. With regard to a pixel array 21, a controller 22, a timestamp unit 24, and an event signal output unit 25, they have already been described in relation to Fig. 2, and thus duplicated descriptions will not be repeated. With regard to the pixel selection and the state reading, a row arbiter/encoder 41R, a column arbiter/encoder 41C, a row selection unit 42R, and a column R/O unit 42C may correspond to the detection R/O unit 23 illustrated in Fig. 2.
A target pixel P fired which detected a  change of a strength of light received from an object to be shot notifies the row arbiter/encoder 41R that the light-strength change is generated, by a row request signal (S32) . The row arbiter/encoder 41R confirms that the pixel which detected the light-strength change belongs to the (i) th row, transmits an acknowledgement signal to pixels belonging to the (i) th row, and enables the pixels in the row. The row arbiter/encoder 41R notifies the controller 22 that address information of the target pixel P fired is the (i) th row (S331) .
When the target pixel P fired receives the acknowledgement signal, it notifies the column arbiter/encoder 41C that the light-strength change is generated, by a column request signal. After processing the event data which is outputted as the column request signal, the column arbiter/encoder 41C confirms that the pixel which detected the light-strength change belongs to the (j) th column, and transmits an acknowledgement signal to pixels belonging to the (j) th column, thereby resetting the output of the pixel. After the resetting, the detection of a next event can begin. The column arbiter/encoder 41C notifies the controller 22 that the (j) th column is the address information of the target pixel P fired (S332). In this way, the controller 22 can determine that the location of the target pixel P fired is (i, j). The timestamp unit 24 may determine a timestamp indicating that the light-strength change occurred in the target pixel P fired, according to an indication from the controller 22.
There may exist simultaneously two or more pixels, each of which detected a change of a strength of light received from an object to be shot. In step S331, the row arbiter/encoder 41R may receive row request signals indicating light-strength changes from a plurality of pixels. In this case, the row arbiter/encoder 41R may select one of the rows in some way, and then the operations in steps S331 and S332 may be performed. For convenience of explanation, the column address is determined after the row address is determined, but the order may be reversed, or both may be determined simultaneously. In addition, the operation of step S33, which determines the address information of the target pixel P fired, may be referred to as a “handshake protocol”.
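As an aid to understanding, the following is a minimal software model of the handshake protocol of step S33; the class name, method name, and the arbitration policy (picking the smallest requesting row, then the smallest requesting column) are assumptions made for illustration, since the actual sensor realizes this with asynchronous arbiter logic.

```python
# Minimal software model of the handshake protocol of step S33.
# The class name, method name, and the arbitration policy (smallest row, then smallest
# column) are assumptions for illustration; the sensor itself uses asynchronous arbiters.

class ArbiterModel:
    def __init__(self, fired_pixels):
        # fired_pixels: set of (row, column) pixels that detected a light-strength change
        self.pending = set(fired_pixels)

    def handshake(self):
        """Return the (i, j) address of one fired pixel, or None if no request is pending."""
        if not self.pending:
            return None
        i = min(r for r, _ in self.pending)            # row arbiter picks one requesting row
        j = min(c for r, c in self.pending if r == i)  # column arbiter picks one column in that row
        self.pending.discard((i, j))                   # the pixel output is reset after the column acknowledge
        return (i, j)

# Two pixels fired simultaneously; the arbiter serializes their addresses.
arb = ArbiterModel({(4, 7), (2, 5)})
print(arb.handshake())  # (2, 5)
print(arb.handshake())  # (4, 7)
```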
In step S34, the detection R/O unit 23 (Fig. 2) refers to an event detection state of the target pixel P fired which generated the event and of a surrounding pixel, the event detection state indicating whether the light strength is changed. Step S34 includes a step S341 which enables the surrounding pixel, and a step S342 which reads the state of the surrounding pixel. The operations in step S341 and step S342 are described with reference to Fig. 4.
The row selection unit 42R has a function that enables states of pixels that belong to a row (i) of the target pixel P fired and an adjacent row (i-1, i+1) adjacent to said row (i) to be read. The column R/O unit 42C has a function that enables states of pixels that belong to a column (j) of the target pixel P fired and an adjacent column (j-1, j+1) adjacent to said column (j) to be read.
At the end of step S33, it has been found that the target pixel P fired is located at (i, j). In step S341, the row selection unit 42R selects and enables the (i) th row. The column R/O unit 42C selects and enables the (j-1) th column and the (j+1) th column, and reads the states of the pixels located at (i, j-1) and (i, j+1).
Next, the row selection unit 42R selects and enables the (i-1) th row adjacent to the (i) th row. The column R/O unit 42C selects and enables the (j-1) th column, the (j) th column and the (j+1) th column, and reads the states of the pixels located at (i-1, j-1), (i-1, j), and (i-1, j+1).
Similarly, the row selection unit 42R selects and enables the (i+1) th row adjacent to the (i) th row. The column R/O unit 42C selects and enables the (j-1) th column, the (j) th column and the (j+1) th column, and reads the states of the pixels located at (i+1, j-1), (i+1, j) and (i+1, j+1).
In this way, it is possible to obtain the states of the 8 pixels (surrounding pixels) which exist around the target pixel P fired. When the pixel state is represented as 2-state, 1 bit per pixel is required to indicate a state. Thus, the number of bits required to indicate all the states of the target pixel and the surrounding pixels is 9 (= 1 x 9) bits. When 2 bits per pixel are used to represent a pixel state as 3-state, the number of bits required to indicate all the states of the target pixel and the surrounding pixels is 18 (= 2 x 9) bits. For convenience of explanation, the column R/O unit 42C reads the states of at most 3 pixels row-by-row, but it is also possible to simultaneously read the states from a plurality of rows. In addition, the illustrated number of bits is merely one example, and a different number of bits may be required depending on an implementation.
Fig. 5 shows an exemplary circuit configuration of the row selection unit 42R and the column R/O unit 42C illustrated in Fig. 4. As described above, when the row arbiter/encoder 41R receives a row request signal from a target pixel P fired, it returns a row acknowledgement signal accordingly (S33). For example, when the target pixel P fired belongs to the (i) th row, the row arbiter/encoder 41R returns a row acknowledgement signal (RowAck_i) accordingly. The row selection unit 42R responds to this row acknowledgement signal (RowAck_i), and indicates which one of three rows (i-1, i, i+1) is selected and enabled according to a control signal. In the example illustrated in Fig. 5, the control signal is transmitted by indicating 3 logical levels on 3 control lines which extend vertically in Fig. 5.
For example, it is assumed that a row acknowledgement signal (RowAck_i) is received for the (i) th row. For example, if the control signal is sequentially (H, L, L) from left to right, a row selection signal (RowAckOut_i-1) which selects the (i-1) th row is outputted because an AND circuit A outputs H. H represents a logical high level, whereas L represents a logical low level.
For example, if the control signal is sequentially (L, H, L) from left to right, a row selection signal (RowAckOut_i) which selects the (i) th row is outputted because an AND circuit B outputs H.
For example, if the control signal is sequentially (L, L, H) from left to right, a row selection signal (RowAckOut_i+1) which selects the (i+1) th row is outputted because an AND circuit C outputs H.
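The decode behavior described above may be sketched, for illustration, as the following combinational-logic model; representing the signals as Boolean values and the function name are assumptions made for explanation and are not part of the circuit itself.

```python
# Illustrative combinational model of the row selection decode of Fig. 5.
# Treating the signals as Boolean values and the function name are assumptions for explanation.

def row_select(row_ack_i, control):
    """Given RowAck_i and the 3-line control signal (left to right),
    assert at most one of the three row selection outputs."""
    c_prev, c_same, c_next = control
    return {
        "RowAckOut_i-1": row_ack_i and c_prev,  # AND circuit A
        "RowAckOut_i":   row_ack_i and c_same,  # AND circuit B
        "RowAckOut_i+1": row_ack_i and c_next,  # AND circuit C
    }

# Control signal (H, L, L) selects the (i-1)th row while RowAck_i is asserted.
print(row_select(True, (True, False, False)))
```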
Similarly to the row direction, the column arbiter/encoder 41C receives a column request signal from a target pixel P fired, and then returns a column acknowledgement signal accordingly (S33). The column R/O unit 42C reads the state of the surrounding pixel which exists around the target pixel P fired, from the pixels which belong to the row which is enabled by the row selection unit 42R, and outputs it to the controller 22. For example, if the target pixel P fired belongs to the (j) th column, the column arbiter/encoder 41C returns a column acknowledgement signal (ColAck_j) accordingly. The column acknowledgement signal (ColAck_j) is received by the column R/O unit 42C, and entered to the AND circuits corresponding to the respective columns.
It is assumed that a surrounding pixel (*, j-1) which belongs to the (j-1) th column indicates a light-strength change. In this case, the surrounding pixel (*, j-1) transmits a column request signal (ColReqOn (OFF) _j-1). Here, according to whether the change is positive or negative, ColReqOn_j-1 or ColReqOFF_j-1 outputs H. Thus, because an output of the AND circuit P becomes H, a line B1 which extends horizontally in Fig. 5 becomes a predetermined electrical potential, and the controller is notified that a light-strength change occurs at the surrounding pixel (*, j-1). Note that “*” represents the row which is selected by the row selection unit 42R.
It is assumed that a surrounding pixel (*, j) which belongs to the (j) th column indicates a light-strength change. In this case, the surrounding pixel (*, j) transmits a column request signal (ColReqOn (OFF) _j). Here, according to whether the change is positive or negative, ColReqOn_j or ColReqOFF_j outputs H. Thus, because an output of the AND circuit Q becomes H, a line B2 which extends horizontally in Fig. 5 becomes a predetermined electrical potential, and the controller is notified that a light-strength change occurs at the surrounding pixel (*, j).
It is assumed that a surrounding pixel (*, j+1) which belongs to the (j+1) th column indicates a light-strength change. In this case, the surrounding pixel (*, j+1) transmits a column request signal (ColReqOn (OFF) _j+1). Here, according to whether the change is positive or negative, ColReqOn_j+1 or ColReqOFF_j+1 outputs H. Thus, because an output of the AND circuit R becomes H, a line B3 which extends horizontally in Fig. 5 becomes a predetermined electrical potential, and the controller is notified that a light-strength change occurs at the surrounding pixel (*, j+1).
As shown in Fig. 5, the row selection unit 42R and the column R/O unit 42C, which are for reading the state of the surrounding pixel, may be implemented by simple logical circuits. Specifically, the row selection unit 42R may be implemented by providing three AND circuits and one OR circuit having three inputs, for each of the rows. The column R/O unit 42C may be implemented by providing three AND circuits and three transistors, for each of the columns. These specific logical circuit configurations are only some examples, and another configuration may be implemented if necessary. For example, if the surrounding pixels cover a range of 5 x 5 (such an example will be described later with reference to Fig. 8), the row selection unit 42R may be implemented by providing five AND circuits and one OR circuit having 5 inputs, for each of the rows. The column R/O unit 42C may be implemented by providing five AND circuits and five transistors, for each of the columns. In addition, the column R/O unit 42C outputs a 3-bit event state as the states of three pixels each of which indicates 2-state. For example, if each pixel indicates 3-state, the states of the 3 pixels may be outputted as a 6-bit event state. As mentioned above, the number of states of a pixel and the number of bits may be configured appropriately depending on an application.
In step S35 in Fig. 3, the controller 22 determines whether information which relates to the event is to be outputted from the vision sensor 11, based on the state of the surrounding pixel. The step S35 includes step S351, which determines whether the event signal transmission is appropriate. For example, in step S351, if it is determined that a light-strength change which is detected at a target pixel P fired is noise, the event signal is not to be outputted to the outside of the sensor. In this case, the event signal for the target pixel P fired is not outputted from the vision sensor 11 to the outside (step S352). In addition, even if the light-strength change which was detected in the target pixel P fired is not noise, when the same light-strength changes occur in the surrounding pixels, the information value of its event signal is low. Such information is redundant even if it is outputted. Also in this case, the event signal for the target pixel P fired is not outputted from the vision sensor 11 to the outside (step S352). The manner in which a determination is made as to whether the event signal transmission is appropriate will be described later with reference to Fig. 6 and Fig. 7.
In step S351 of Fig. 3, if the controller 22 determines that the event signal transmission is appropriate, the timestamp unit 24 affixes a timestamp corresponding to a generation time of the event, to the event which is determined to be output. The event signal output unit 25 (Fig. 2) actually outputs the event signal including the timestamp of the target pixel P fired, to the processor 12 (Fig. 1) (step S353). With regard to the target pixel P fired, the flow chart illustrated in Fig. 3 ends at step S36.
Generally, a light-strength change (i.e., a generation of an event) for a moving object may occur in contiguous pixels. If no light-strength change occurs in the surrounding pixels which surround the target pixel in which a light-strength change occurs, it may be considered that the light-strength change of the target pixel is noise. Particularly, if no light-strength change occurs in the surrounding pixels which surround the target pixel circularly, the light-strength change of the target pixel tends to be noise. In the following example, if the surrounding pixels which surround the target pixel circularly do not detect any light-strength change, it is determined that the light-strength change in the target pixel is noise. Thus, with regard to the target pixel, the event signal transmission to the outside of the sensor is not performed.
Fig. 6 is a diagram for use in describing an exemplary noise event filtering in accordance with an embodiment. On the left side in Fig. 6, light-strength changes occur in the 8 pixels indicated by reference number 61, and in the 3 pixels indicated by reference numbers 62, 63, and 64. The letter “p” represents an ON event, the letter “n” represents an OFF event, and a plain cell represents a NO event. With regard to the 8 pixels indicated by reference number 61, since a light-strength change occurs in at least one of the pixels surrounding each of them circularly, it is determined that the light-strength change in the target pixel is not noise. In contrast, the 8 surrounding pixels which surround the pixel 62 circularly do not indicate any light-strength change, which falls into the situation indicated by reference number 65. Thus, it may be determined that the light-strength change in the pixel 62 is noise. Similarly, since the light-strength changes in the pixels 63 and 64 fall into the situation indicated by reference number 66, they may be determined to be noise. As a result, among the 11 pixels (the 8 pixels 61 and the pixels 62 to 64), the only pixels for which event signals are transmitted are the 8 pixels indicated by reference number 61 (on the right side in Fig. 6).
Next, if an object to be shot is, for example, an object which is flickering in a large part of a field of view (FOV), pixels indicating light-strength changes occur in this large part of the FOV. However, in terms of tracking a movement of such an object, it is not necessary for all pixels which detected the light-strength change to output their event signals. For example, suppose a rule is predetermined that, if events of the same polarity indicate a closed outline, events of the same polarity may be assumed to be generated also inside the outline. In this case, it is possible to communicate the situation that events are generated in the whole area surrounded by the outline, even if event signals of the pixels inside the object, which indicate the same light-strength change, are omitted. In the following illustrated example, when a predetermined number of surrounding pixels which surround at least in part a target pixel detect events with the same polarity as that of the target pixel, the event in the target pixel is determined to be redundant. As a result, the target pixel can be excluded from the event signal transmission to the outside of the sensor.
Fig. 7 is a diagram for use in describing an exemplary redundant event filtering in accordance with an embodiment. In this example, many ON events indicated by “p” occur. Suppose a rule is predetermined that, if events of the same polarity indicate a closed outline, events of the same polarity may be assumed to be generated also inside the outline. On the left side in Fig. 7, for example, the pixels located at (row, column) = (2, 4), (2, 5), (2, 6) and the like fall into the situation represented by reference number 75. With regard to these pixels, even if their event signals are not outputted, it is possible to communicate to the outside of the sensor, based on the rule, that events of the same polarity as those around them have occurred at these pixels. As a result, many events which belong to the inside of the outline, namely the events which fall into the situation represented by reference number 75 or 76, are determined to be redundant events, and the event signal transmission to the outside of the sensor is not performed for them (on the right side in Fig. 7). In this example, among the 78 pixels in which light-strength changes occur, it is possible to remove 33 pixels from the event signal transmission.
In the illustrated examples with reference to Fig. 4, Fig. 5, and so on, the surrounding pixels are 8 pixels surrounding a target pixel circularly. Embodiments are not limited to such an example, and thus various locations may be considered as locations of the surrounding pixels on which an event filtering is based. Fig. 8 shows multiple examples (81-88) of the location relationship between the target pixel and the surrounding pixels.
In an example represented by 81, the surrounding pixels are 8 pixels which cover the target pixel with a 3 x 3 range. In an example represented by 82, the surrounding pixels are 24 pixels which cover the target pixel with a 5 x 5 range. In an example represented by 83, the surrounding pixels are 48 pixels which cover the target pixel with a 7 x 7 range.
In an example represented by 84, the surrounding pixels are 4 pixels adjacent to the target pixel vertically and horizontally. In an example represented by 85, the surrounding pixels are 2 pixels adjacent to the target pixel horizontally. In an example represented by 86, the surrounding pixels are 1 pixel adjacent to the target pixel upwardly, and 2 contiguous pixels downwardly. In an example represented by 87, the surrounding pixels are 3 pixels which belong to the first quadrant. In an example represented by 88, the surrounding pixels are 3 pixels which belong to the third quadrant and 1 pixel which belongs to the first quadrant. Examples illustrated  in Fig. 8 represent only some of the possible examples, and thus it is possible that pixels which are in a variety of locations (not illustrated) may be set as surrounding pixels. It is preferable to set a 3 x 3 range represented by 81 as surrounding pixels, in terms of setting the surrounding pixels on which a filtering is based, simply and effectively.
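For illustration, a few of the patterns of Fig. 8 can be written as sets of (row, column) offsets relative to the target pixel; the names below are hypothetical, and only a subset of the illustrated patterns is reproduced.

```python
# Illustrative definitions of a few of the surrounding-pixel patterns of Fig. 8,
# expressed as (row, column) offsets from the target pixel. Names are hypothetical
# and only a subset of the illustrated patterns is reproduced.

NEIGHBORS_3X3 = [(di, dj) for di in (-1, 0, 1) for dj in (-1, 0, 1) if (di, dj) != (0, 0)]      # example 81: 8 pixels
NEIGHBORS_5X5 = [(di, dj) for di in range(-2, 3) for dj in range(-2, 3) if (di, dj) != (0, 0)]  # example 82: 24 pixels
NEIGHBORS_CROSS = [(-1, 0), (1, 0), (0, -1), (0, 1)]                                            # example 84: 4 pixels
NEIGHBORS_HORIZONTAL = [(0, -1), (0, 1)]                                                        # example 85: 2 pixels

assert len(NEIGHBORS_3X3) == 8 and len(NEIGHBORS_5X5) == 24
```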
In the examples described with respect to Fig. 4 and Fig. 5, the event-based vision sensor 11 is implemented in an arbiter scheme. The event-based vision sensor 11 according to one embodiment may instead be implemented in a scan scheme. In the scan scheme, all pixels are scanned row-by-row (or column-by-column) in a certain period. Fig. 9 is a block diagram indicating an example in which an event-based vision sensor is implemented in a scan scheme.
The event-based vision sensor 11 illustrated in Fig. 9 includes a pixel array 21, a controller 22, a timestamp unit 24, an event signal output unit 25, a row scanner 91, a column R/O unit 92, and a line/frame buffer 93. The pixel array 21, the controller 22, the timestamp unit 24, and the event signal output unit 25 have already been described with respect to Fig. 2, Fig. 4, and Fig. 5, and thus duplicated descriptions will not be repeated. With regard to the pixel selection and the state reading, the row scanner 91, the column R/O unit 92, and the line/frame buffer 93 illustrated in Fig. 9 may correspond to the detection R/O unit 23 illustrated in Fig. 2.
The row scanner 91 sequentially scans the pixel array 21 row-by-row in a certain cycle, and enables all pixels belonging to the selected row at a time. This scan is performed regardless of whether a light-strength change occurs at a pixel. A pixel in which a light-strength change occurs transmits a column request signal to the column R/O unit 92. Thereby, the column R/O unit 92 can read the states of all pixels which belong to the enabled row. The line/frame buffer 93 stores the states of a predetermined plural number of rows, among the states of the pixels which are read by the column R/O unit 92. Although the number of rows for which the states are stored may be changed appropriately depending on an application, as one example it is 3 or more rows. For convenience of explanation, the scan is performed row-by-row. However, the distinction between rows and columns is relative, and thus the scan may be performed column-by-column.
In the examples illustrated in Fig. 4 and Fig. 5, after an address (i, j) of a target pixel is identified, for each of the 3 rows (i-1, i, i+1), the states are read from the columns (j-1, j, j+1) which fall within the surrounding pixels. In contrast, in the example illustrated in Fig. 9, the states of the pixels are stored in the line/frame buffer 93. Thus, the controller 22 and the line/frame buffer 93 can perform the event filtering using the stored pixel states. For example, if one of the pixels indicates a light-strength change, the controller 22 and the line/frame buffer 93 may perform the event filtering by performing the operations of steps S34 and S35 in Fig. 3.
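A minimal sketch of such scan-scheme filtering is given below, assuming a 2-state representation, a 3-row buffer, and simple boundary handling (the first and last rows are not filtered, and out-of-range neighbors are treated as NO event); the function and variable names are illustrative assumptions.

```python
# Minimal sketch of scan-scheme filtering with a 3-row buffer, assuming a 2-state
# representation (0 = NO event, 1 = event). The function name and the boundary handling
# (edge rows are skipped, out-of-range neighbors treated as NO event) are assumptions.

from collections import deque

def scan_filter(pixel_rows):
    """pixel_rows: iterable of lists of 0/1 states, one list per scanned row.
    Yields (row, column) of events that survive the surrounding-pixel noise filter."""
    buffer = deque(maxlen=3)              # line/frame buffer holding 3 sequential rows
    for r, row in enumerate(pixel_rows):
        buffer.append(row)
        if len(buffer) < 3:
            continue                      # need the rows above and below the candidate row
        top, cur, bot = buffer
        mid = r - 1                       # index of the row currently being filtered
        for j, state in enumerate(cur):
            if not state:
                continue
            neighbors = []
            for line in (top, cur, bot):
                for dj in (-1, 0, 1):
                    if (line is cur and dj == 0) or not (0 <= j + dj < len(cur)):
                        continue
                    neighbors.append(line[j + dj])
            if any(neighbors):            # logical OR over the surrounding states
                yield (mid, j)            # not noise: keep the event

rows = [[0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 0, 1, 0],
        [0, 0, 0, 0]]
print(list(scan_filter(rows)))  # [(1, 1), (1, 2), (2, 2)]
```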
Fig. 10 is a diagram indicating a relation between a state E and an output E_o, where E is stored row-by-row in the line/frame buffer 93 illustrated in Fig. 9, and E_o is a logical operation result which can be used for the event filtering. It is assumed that the location of a target pixel which detected a light-strength change is indicated as (i, j). If the pixel is represented by 2-state, it is possible to determine whether the event of the target pixel is noise, based on a logical OR of the logical levels indicating the respective states of the surrounding pixels. For convenience of explanation, the logical level of the output E_o is “0” (False) when the event is noise, and is “1” (True) when the event is not noise, but the logic may be the opposite.
E_o(i, j) = E(i, j) · {E(i-1, j-1) + E(i-1, j) + E(i-1, j+1) + E(i, j-1) + E(i, j+1) + E(i+1, j-1) + E(i+1, j) + E(i+1, j+1)}
where “·” (dot) represents a logical AND, and “+” represents a logical OR. The state E has a logical level representing whether an event occurs. If all the states of the 8 pixels surrounding (i, j) are False (NO event), the output E_o(i, j) becomes “False”, and it is possible to determine that the event of the target pixel is noise. If even one of the 8 is “True” (event), the output E_o(i, j) becomes “True”, and it is possible to determine that the event of the target pixel is not noise. This corresponds to the event filtering shown in Fig. 6. In the case of the event filtering shown in Fig. 7, it is required that a pixel be represented by at least 3-state.
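A minimal software sketch of this expression, assuming a 2-state representation held as a two-dimensional array and treating out-of-range neighbors as NO event, is shown below; the function name and the boundary handling are assumptions made for illustration.

```python
# Minimal sketch of the noise filtering expression above for the 2-state representation.
# E is assumed to be a 2-D list of 0/1 states; treating out-of-range neighbors as
# NO event and the function name are assumptions for illustration.

def e_out(E, i, j):
    """Return E_o(i, j): 1 if the event at (i, j) is kept, 0 if it is judged to be noise."""
    def state(r, c):
        return E[r][c] if 0 <= r < len(E) and 0 <= c < len(E[0]) else 0
    surrounding_or = any(
        state(i + di, j + dj)
        for di in (-1, 0, 1) for dj in (-1, 0, 1)
        if (di, dj) != (0, 0)
    )
    return E[i][j] & int(surrounding_or)   # AND of the target state with the OR over the 8 neighbors
```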
If a pixel is represented by 3-state including a polarity of a change, it is possible to determine whether an event of a target pixel is a redundant event, based on
(a) a negative logical AND (NAND) of logical levels indicating respective states of the surrounding pixels which detected ON event, or
(b) a negative logical AND (NAND) of logical levels indicating respective states of the surrounding pixels which detected OFF event.
For convenience of explanation, suppose that E_p(i, j) or E_n(i, j) being “1” (True) indicates that the pixel located at the coordinate (i, j) has detected an ON event or an OFF event, respectively, that E_op(i, j) represents the output of the ON event after the event filtering, and that E_on(i, j) represents the output of the OFF event after the event filtering. However, the logic may be the opposite.
E_op(i, j) = E_p(i, j) · NAND {E_p(i-1, j-1), E_p(i-1, j), E_p(i-1, j+1), E_p(i, j-1), E_p(i, j+1), E_p(i+1, j-1), E_p(i+1, j), E_p(i+1, j+1)}
E_on(i, j) = E_n(i, j) · NAND {E_n(i-1, j-1), E_n(i-1, j), E_n(i-1, j+1), E_n(i, j-1), E_n(i, j+1), E_n(i+1, j-1), E_n(i+1, j), E_n(i+1, j+1)}
If the detection state E_p(i, j) or E_n(i, j) of the target pixel is “1” (True), and if all 8 states of the pixels surrounding the pixel located at the coordinate (i, j) indicate the same polarity event, ON or OFF, the output E_op(i, j) or E_on(i, j) becomes “False”; it is thus possible to determine that the event of the target pixel is a redundant event, and not to output it. If even one of the 8 is in a different state, E_op(i, j) or E_on(i, j) becomes “True”, and thus it is possible to determine that the event of the target pixel is not a redundant event, and to output it.
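Under the same assumptions (two-dimensional arrays of ON-event and OFF-event states, out-of-range neighbors treated as NO event), the expressions above could be sketched as follows; the names are illustrative only.

```python
# Minimal sketch of the redundant-event expressions above for the 3-state representation.
# Ep and En are assumed to be 2-D lists of 0/1 ON-event and OFF-event states; as before,
# out-of-range neighbors are treated as NO event, which is an assumption for illustration.

def e_out_polarity(Ep, En, i, j):
    """Return (E_op(i, j), E_on(i, j)) after the redundant-event filtering."""
    def neighbors(S):
        vals = []
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                if (di, dj) == (0, 0):
                    continue
                r, c = i + di, j + dj
                vals.append(S[r][c] if 0 <= r < len(S) and 0 <= c < len(S[0]) else 0)
        return vals
    nand_on = int(not all(neighbors(Ep)))    # NAND over the surrounding ON-event states
    nand_off = int(not all(neighbors(En)))   # NAND over the surrounding OFF-event states
    return Ep[i][j] & nand_on, En[i][j] & nand_off
```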
Since the 3-state representation is used in the example illustrated in Fig. 9, only 2 bits per pixel state are required for one event when performing the event filtering. In this regard, it is possible to significantly reduce the required memory, compared with the method described in the Background in which it is necessary to perform event filtering by using 32 bits (polarity, position, and timestamp) for one event.
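As an illustrative calculation based on the bit widths mentioned above, each pixel state then occupies 2 bits instead of 32 bits, a 16-fold reduction; for a 3 x 3 neighborhood, this corresponds to holding 9 x 2 = 18 bits instead of 9 x 32 = 288 bits during the filtering.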
In an embodiment, there is a need to determine a target pixel which detected a light-strength change, and to read the states of one or more surrounding pixels which exist around the target pixel. Thus, it is desirable that, after the target pixel is determined, the state of the target pixel be maintained without any resetting until the reading of the states of the surrounding pixels is finished. The reason is that, for example, if the target pixel is reset or changed into a different state while the state of the surrounding pixel is being read, it may cause noise in the surrounding pixel. Maintaining the state of the target pixel is preferable in terms of ensuring the simultaneity of the states of the target pixel and surrounding pixels, and stabilizing the event filtering process.
Fig. 11 is a diagram illustratively indicating elements in a pixel circuit. Fig. 11 depicts a MOS transistor Mlog and a photodiode PD connected in series between a high power supply and a low power supply, a first inverted amplifier (-A1) connected between a gate and a source of the transistor Mlog, a sample-hold circuit connected to an output of the first inverted amplifier (-A1), a threshold circuit connected to an output of the sample-hold circuit, a handshake protocol unit connected to an output of the threshold circuit, and a delay circuit connected between a reset switch of the sample-hold circuit and the handshake protocol unit. The sample-hold circuit includes a second inverted amplifier (-A2), a first capacitor C1 connected between the first inverted amplifier and the second inverted amplifier, and a second capacitor C2 connected in parallel between an input and an output of the second inverted amplifier (-A2). The threshold circuit includes a first threshold circuit TH ON and a second threshold circuit TH OFF.
Suppose that the reset switch of the sample-hold circuit is in a closed state. The photodiode PD, the MOS transistor Mlog, and the first inverted amplifier (-A1) generate a voltage signal Vlog corresponding to the logarithm of the photo current through the photodiode PD. The voltage signal Vlog at this timing is sampled by the sample-hold circuit. Suppose then that the reset switch of the sample-hold circuit switches from the closed state to the open state, and that the voltage signal Vlog corresponding to the photo current of the photodiode PD changes by delta V. The sample-hold circuit boosts the voltage difference (delta V), and provides an output voltage V diff to the threshold circuit (TH ON, TH OFF). If the output voltage V diff increases from the previous voltage by a threshold or more, an ON event is detected. If the output voltage V diff decreases from the previous voltage by a threshold or more, an OFF event is detected. When the ON event or the OFF event is generated, the handshake protocol unit transmits a request signal (REQ_R or REQ_C) to the row or column arbiter/encoder (41R, 41C in Fig. 4 and Fig. 5), and receives an acknowledgement signal (ACK_R or ACK_C). Note that ACK_R, ACK_C, and REQ_C correspond to RowAck_*, ColAck_*, and ColReqON (OFF) _* indicated in Fig. 5, respectively. After that, the reset switch of the sample-hold circuit switches from the open state to the closed state, and the voltage to be compared is changed to the current voltage (V diff) at that timing. It is possible to adjust the timing of switching the reset switch by the delay circuit.
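As a behavioral illustration only, the ON/OFF event decision described above can be modeled as follows; the threshold values, the gain factor, and modeling Vlog as the logarithm of the photo current are simplifying assumptions standing in for the sample-hold and threshold circuits of the actual pixel.

```python
# Behavioral illustration only of the ON/OFF event decision described above.
# The threshold values, the gain factor, and modeling Vlog as the logarithm of the
# photo current are simplifying assumptions standing in for the sample-hold and
# threshold circuits of the actual pixel.

import math

TH_ON = 0.05    # hypothetical ON-event threshold on V diff
TH_OFF = 0.05   # hypothetical OFF-event threshold on V diff

def detect_event(i_photo_prev, i_photo_now, gain=1.0):
    """Compare the boosted change of the logarithmic photo-current voltage with the thresholds."""
    v_diff = gain * (math.log(i_photo_now) - math.log(i_photo_prev))  # change of Vlog since the last reset
    if v_diff >= TH_ON:
        return "ON event"     # light strength increased by the threshold or more
    if v_diff <= -TH_OFF:
        return "OFF event"    # light strength decreased by the threshold or more
    return "NO event"

print(detect_event(1.0e-9, 1.2e-9))  # light strength increased -> "ON event" with these assumed values
```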
By the handshake protocol unit interacting with the row or column arbiter/encoder (41R, 41C) , a location of a pixel which detected a light-strength change is determined. The procedure of the location determination corresponds to the procedure of determining the target pixel, which was described with respect to the step S33 in Fig. 3. In the flow chart in the embodiment, after the step S33, the surrounding pixel states are read in the step S34.
In terms of the filtering operation of reading the surrounding pixels, it is desirable that, after the target pixel is determined, the output state of the target pixel be maintained until the reading of the states of the surrounding pixels in step S34 in Fig. 3 is finished. Thus, a certain time interval may be provided after the target pixel is determined and before the pixel is reset. This may be referred to as “dead time”. The delay circuit determines a timing to open/close the reset switch, and a timing of the reset signal HS_RST (which resets a state of a detected event), according to the dead time. Specifically, when the reset switch of the sample-hold circuit is in the closed state, the handshake protocol unit maintains the last detection state, and also stops transmitting a new row/column request signal. When the reset switch of the sample-hold circuit is in the open state, the handshake protocol unit resets the detection state which has been maintained so far, and the detection state is changed to a new state. Controlling the opening/closing of the reset switch and keeping the state in the handshake protocol unit can be achieved without modifying the pixel circuit construction. According to an embodiment, by controlling the operations while taking the dead time into consideration, the target pixel state is maintained and available in the filtering process of reading the surrounding pixels following the target pixel.
In this way, according to an embodiment, whether an event signal of a target pixel, to which a timestamp is affixed, is to be outputted from the vision sensor is determined based on the state of a surrounding pixel. The state of the surrounding pixel is simple information indicating whether the light strength has changed (possibly including an increase or a decrease), and does not need to include complicated information such as a timestamp and a position. In accordance with an aspect of an embodiment, it is possible to perform event filtering more simply in an event-based vision sensor. Thus, it is possible to simply construct a read-out unit for reading a surrounding pixel, and a filtering unit for filtering an event based on the surrounding pixel. For example, there is no need to modify the pixel circuit construction. Since the event filtering is performed before the event signal of the target pixel is output from the vision sensor to the outside, no complicated calculation and/or operation which requires a timestamp and a position is needed when filtering the event. The event-based vision sensor according to an embodiment can perform event filtering simply, without adding a complicated circuit construction and without requiring a complicated operation process. This is preferable in terms of device miniaturization, operation processing load reduction, and low power consumption.
Hereinafter, an exemplary architecture of a mobile terminal, to which an embodiment of the present invention may be applied, is described with reference to Fig. 12. A mobile terminal may be a device that provides a user with a shooting function and/or data connectivity, a handheld device with a wireless connection function, or another processing device connected to a wireless modem (for example, a digital camera, a single-lens reflex camera, or a smartphone) . Alternatively, the mobile terminal may be another intelligent device with a shooting function and a display function (for example, a wearable device, a tablet computer, a PDA (Personal Digital Assistant, personal digital assistant) , a drone, or an aerial photographer) .
Fig. 12 is a schematic diagram of an optional hardware structure of a terminal 100, which is an exemplary mobile terminal. Referring to Fig. 12, the terminal 100 may include components such as a radio frequency unit 110, a memory 120, an input unit 130, a display unit 140, an imaging device 10, an audio circuit 160, a speaker 161, a microphone 162, an earphone jack 163, a processor 170, an external interface 180, and a power supply 190.
The radio frequency (RF) unit 110 may be configured to send and receive information or send and receive a signal in a call process. Generally, an RF unit includes but is not limited to an antenna, an amplifier, a transceiver, a coupler, a low noise amplifier (LNA) , a duplexer, and the like. In addition, the radio frequency unit 110 may communicate with a network device and another device through wireless communication. Any communications standard or protocol may be used for the wireless communication.
The memory 120 may be configured to store instructions and data. The memory 120 may mainly include an instruction storage area and a data storage area. The instruction storage area may store software such as an operating system, an application, and an instruction. The data storage area may store an image which is obtained by the imaging device 10, audio data which is inputted or outputted by the audio circuit 160, an image which is displayed by the display unit 140, data which is used for an operation process performed by the processor 170, and other various transient or permanent data.
The input unit 130 may be configured to receive input digit or character information in the mobile terminal 100. Specifically, the input unit 130 may include a touchscreen 131 and other input devices 132. The touchscreen 131 may collect a touch operation of the user on or near the touchscreen, and drive a corresponding connection apparatus according to a preset program. The touchscreen 131 may detect a touch action of the user on the touchscreen, convert the touch action into a touch signal, send the touch signal to the processor 170, and receive and execute a  command sent by the processor 170. Another input device 132 may include but is not limited to one or more of a physical keyboard, a function key (such as a volume control key or a power on/off key) , a trackball, a mouse, a joystick, and the like.
The display unit 140 may be configured to display information input by the user, information provided for the user, various menus of the terminal 100, or the like. In the embodiments of the present invention, the display unit 140 is configured to display an image obtained by using the imaging device 10, where the image may include a preview image in some shooting modes, an image that is captured, an image that is processed by using a specific algorithm after shooting, or the like.
The imaging device 10 is configured to collect a still image or moving images and may be enabled through triggering by an application program instruction, to implement a shooting function or a video camera function. The imaging device 10 may include components such as an imaging lens, a light filter, and an image sensor. Particularly, the imaging device 10 in the embodiment includes the event-based vision sensor which is described with reference to Fig. 1. Light emitted or reflected by an object to be shot enters the imaging lens and is aggregated on the image sensor by passing through the light filter. The imaging lens is mainly configured to aggregate light emitted or reflected by an object to be shot, in a shooting field of view, and perform imaging. The light filter is mainly configured to filter out an extra light wave (for example, a light wave other than visible light, such as infrared light)  from light. The image sensor is mainly configured to perform optical-to-electrical conversion on a received optical signal, convert the optical signal and the light-strength change into an electrical signal, and input the electrical signal to the processor 170 for subsequent processing.
The audio circuit 160, the speaker 161, the microphone 162, and an earphone jack 163 may provide an audio interface between the user and the mobile terminal 100. The audio circuit 160 may transmit, to the speaker 161, an electrical signal converted from received audio data, and the speaker 161 converts the electrical signal into a sound signal for output. Conversely, the microphone 162 is configured to collect a sound signal, and may convert the collected sound signal into an electrical signal. The audio circuit may also include an earphone jack 163, configured to provide a connection interface between the audio circuit and an earphone.
The processor 170 is a control center of the mobile terminal 100, and is connected to various parts of the mobile terminal 100 through various interfaces and signal lines. The processor 170 performs various functions of the mobile terminal 100, executes the instruction stored in the memory 120, and invokes the data stored in the memory 120, thereby processing the data. In some embodiments, the processor and the memory may be implemented on a single chip. In some embodiments, the processor and the memory may be separately implemented on independent chips.
The mobile terminal 100 further includes the external interface 180. The external interface  180 may be a standard micro-USB interface or a multi-pin connector. The external interface may be configured to connect the terminal 100 to another apparatus for communication, or may be configured to connect to a charger to charge the terminal 100.
The mobile terminal 100 further includes the power supply 190 (such as a battery) that supplies power to each component. Preferably, the power supply may be logically connected to the processor 170 through a power supply management system, so as to implement functions such as charging, discharging, and power consumption management.
Persons skilled in the art may understand that FIG. 12 is merely an example of the mobile terminal, and does not constitute any limitation on the embodiments. The mobile terminal may include more or fewer components than those shown in the figure, or combine some components, or have different components.
The division of elements in Fig. 1, Fig. 2, Fig. 4, Fig. 5, Fig. 9, Fig. 11, Fig. 12, and so on is merely a logical function division chosen for convenience of explanation. It is to be understood that, in actual implementation, some or all of the divided elements may be integrated into one physical entity or may be physically separated. For example, each of the foregoing elements may be a separate processing element, or may be integrated on a chip of a mobile terminal. Alternatively, the described elements may be stored in a storage element of a controller in the form of program code, and a processing element of the controller may invoke and execute their functions as necessary. In addition, the processing elements may be integrated or may be implemented independently. A processing element may be an integrated circuit chip having a signal processing capability. In an implementation process, the steps of the foregoing methods or the foregoing elements can be implemented by using a hardware integrated logic circuit in the processing element, or by using instructions in the form of software.
Persons skilled in the art will understand that the embodiments of the present invention may be provided as a method, a device, a storage medium, or a computer program. Therefore, the present invention may take the form of hardware-only embodiments, or embodiments combining software and hardware.
The method, the device, the storage medium, and the computer program that relate to the embodiments of the present invention are described with reference to the flowcharts and/or block diagrams. It is to be understood that computer program instructions may be used to implement each process and/or each block in the flowcharts and/or the block diagrams. These computer program instructions may be provided to a general-purpose computer, a dedicated computer, an embedded processor, or a processor of any other programmable data processing device, so that the instructions, when executed, implement the functions described with reference to the embodiments.
These computer program instructions may be stored in an appropriate storage medium, or may be transmitted over a transmission medium. These computer program instructions may also be loaded onto a computer or another programmable data processing device, so that a series of operations is performed on the computer, thereby implementing the functions described with reference to the embodiments.
Although the embodiments of the present invention have been described, persons skilled in the art can make changes and modifications to these embodiments. Therefore, the following claims are intended to be construed as covering the embodiments and all changes and modifications thereto that fall within the scope of the present invention.
LIST OF REFERENCE SYMBOLS
10 imaging device
11 event-based vision sensor
12 processor
21 pixel array
22 controller
23 detection R/O unit
24 timestamp unit
25 event signal output unit

Claims (15)

  1. An event-based vision sensor, the vision sensor comprising:
    a pixel array including an array of pixels, each pixel being configured to detect a light-strength change as an event;
    a detection unit configured to identify a pixel that generates the event;
    a read-out unit configured to refer to an event detection state of a target pixel that generated the event and one or more surrounding pixels, the event detection state indicating whether there is a light-strength change;
    a filtering unit configured to determine whether, based on the state of the one or more surrounding pixels, information that is related to the event is to be outputted from the vision sensor; and
    a timestamp unit configured to affix a timestamp corresponding to a generation time of the event, to the event that is determined to be outputted by the filtering unit.
  2. The vision sensor as claimed in claim 1,
    wherein the read-out unit includes:
    a row selection unit configured to enable states of pixels that belong to a row of the target pixel and an adjacent row adjacent to said row to be read; and
    a column R/O unit configured to enable states of pixels that belong to a column of the target pixel and an adjacent column adjacent to said column to be read.
  3. The vision sensor as claimed in claim 1,
    wherein the pixel array is scanned row-by-row or column-by-column, sequentially,
    wherein the read-out unit includes a buffer memory storing states of pixels that belong to a plurality of sequential rows or columns, and
    wherein the filtering unit refers to the states of the target pixel and the surrounding pixels by accessing the buffer memory, and performs a filtering process.
  4. The vision sensor as claimed in claim 1,
    wherein the filtering unit determines not to output the information that relates to the event from the vision sensor when a predetermined number of surrounding pixels that surround at least in part the target pixel do not detect a light-strength change.
  5. The vision sensor as claimed in claim 4,
    wherein the filtering unit determines whether the predetermined number of surrounding pixels detect a light-strength change, based on a logical OR of logical levels indicating respective states of the predetermined number of surrounding pixels.
  6. The vision sensor as claimed in claim 1,
    wherein the filtering unit determines not to output the information that relates to the event from the vision sensor when a predetermined number of surrounding pixels that  surround at least in part the target pixel detect light-strength changes, with polarity equal to that of the target pixel.
  7. The vision sensor as claimed in claim 6,
    wherein the filtering unit determines whether the predetermined number of surrounding pixels detect light-strength changes, with polarity equal to that of the target pixel, based on
    (a) a logical product of logical levels indicating respective states of the predetermined number of surrounding pixels that detected a light strength increase, or
    (b) a logical product of logical levels indicating respective states of the predetermined number of surrounding pixels that detected a light strength decrease.
  8. The vision sensor as claimed in claim 1,
    wherein after the detection unit identifies a pixel that generates the event, a state of the target pixel is maintained at least until the read-out unit refers to the event detection state.
  9. The vision sensor as claimed in claim 1,
    wherein the surrounding pixels include at least 8 pixels surrounding the target pixel circularly.
  10. An imaging device comprising:
    the vision sensor as claimed in claim 1; and
    a processor for processing an image based on the information that relates to an event being outputted from the vision sensor.
  11. A method of event filtering to be performed in a vision sensor being an event-based vision sensor, the method comprising:
    identifying, by a detection unit, a pixel that generates an event in a pixel array including an array of pixels, each pixel being configured to detect a light-strength change as the event;
    referring to, by a read-out unit, an event detection state of a target pixel that generated the event and one or more surrounding pixels, the event detection state indicating whether a light strength is changed;
    determining, by a filtering unit, whether information that relates to the event is to be outputted from the vision sensor, based on the state of the one or more surrounding pixels; and
    affixing, by a timestamp unit, a timestamp corresponding to a generation time of the event, to the event that is determined to be outputted by the filtering unit.
  12. The method as claimed in claim 11,
    wherein the referring to, by the read-out unit, of the event detection state of the target pixel that generated the event and the one or more surrounding pixels, the event detection state indicating whether the light strength is changed, includes:
    in response to the target pixel being determined, reading states of same-row surrounding pixels that are adjacent to the target pixel and belong to the same row as the target pixel; and
    reading states of surrounding pixels that belong to rows adjacent to the row of the target pixel.
  13. The method as claimed in claim 11,
    wherein the determining, by the filtering unit, of whether information that relates to the event is to be outputted from the vision sensor, based on the state of the one or more surrounding pixels, includes
    determining not to output the information that relates to the event from the vision sensor when a predetermined number of surrounding pixels that surround at least in part the target pixel do not detect a light-strength change.
  14. The method as claimed in claim 11,
    wherein the determining, by the filtering unit, of whether information that relates to the event is to be outputted from the vision sensor, based on the state of the one or more surrounding pixels, includes
    determining not to output the information that relates to the event from the vision sensor when a predetermined number of surrounding pixels that surround at least in part the target pixel detect light-strength changes with polarity equal to that of the target pixel.
  15. The method as claimed in claim 11, further comprising:
    after the identifying, by the detection unit, of the pixel that generates the event, maintaining a state of the target pixel until the read-out unit refers to the event detection state.
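For an operational view of the filtering recited in claims 4 to 7 and claim 9, the following Python sketch illustrates the two decisions on a per-pixel event-state map: suppressing an event whose surrounding pixels detected no light-strength change (claims 4 and 5), and suppressing an event whose surrounding pixels changed with the same polarity as the target pixel (claims 6 and 7). It is a software illustration under assumed names and thresholds, not the claimed read-out and filtering circuitry; in a row-scanned implementation as in claim 3, the state map would typically be a small buffer holding only the most recently scanned rows.

    # Illustrative sketch only: per-pixel event states are assumed to be
    # +1 (light-strength increase), -1 (decrease) or 0 (no event), and the
    # neighborhood is the up-to-8 pixels surrounding the target pixel.
    from typing import List

    NONE = 0  # no event detected at a pixel

    def surrounding_states(state: List[List[int]], row: int, col: int) -> List[int]:
        """Collect the event states of the pixels surrounding (row, col)."""
        height, width = len(state), len(state[0])
        states = []
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if dr == 0 and dc == 0:
                    continue
                r, c = row + dr, col + dc
                if 0 <= r < height and 0 <= c < width:
                    states.append(state[r][c])
        return states

    def keep_event(state: List[List[int]], row: int, col: int,
                   support_threshold: int = 1, same_polarity_threshold: int = 8) -> bool:
        """Return True if the event at the target pixel should be output.

        Isolated-event rule (claims 4 and 5): if fewer than support_threshold
        surrounding pixels detected any light-strength change (an OR over their
        states), the event is treated as noise and suppressed.
        Same-polarity rule (claims 6 and 7): if at least same_polarity_threshold
        surrounding pixels detected a change with the same polarity as the
        target, the event is also suppressed. The thresholds are assumptions;
        the claims only recite a predetermined number.
        """
        polarity = state[row][col]
        if polarity == NONE:
            return False  # no event at the target pixel

        neighbors = surrounding_states(state, row, col)
        changed = sum(1 for s in neighbors if s != NONE)
        same_polarity = sum(1 for s in neighbors if s == polarity)

        if changed < support_threshold:
            return False  # surroundings detected no change: likely noise
        if same_polarity >= same_polarity_threshold:
            return False  # whole neighborhood changed with equal polarity
        return True

An event for which keep_event returns True would then receive its timestamp and be output, in line with the order of operations in claims 1 and 11.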

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202180095592.8A CN117044221A (en) 2021-03-12 2021-03-12 Event-based vision sensor and event filtering method
PCT/CN2021/080343 WO2022188120A1 (en) 2021-03-12 2021-03-12 Event-based vision sensor and method of event filtering

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/080343 WO2022188120A1 (en) 2021-03-12 2021-03-12 Event-based vision sensor and method of event filtering

Publications (1)

Publication Number Publication Date
WO2022188120A1 true WO2022188120A1 (en) 2022-09-15

Family

ID=83227365

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/080343 WO2022188120A1 (en) 2021-03-12 2021-03-12 Event-based vision sensor and method of event filtering

Country Status (2)

Country Link
CN (1) CN117044221A (en)
WO (1) WO2022188120A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115412686A (en) * 2022-10-31 2022-11-29 深圳时识科技有限公司 Fusion noise reduction method and device, sensor, chip and electronic equipment

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107147856A (en) * 2017-03-30 2017-09-08 深圳大学 A kind of pixel cell and its denoising method, dynamic visual sensor, imaging device
CN108574793A (en) * 2017-03-08 2018-09-25 三星电子株式会社 It is configured as regenerating the image processing equipment of timestamp and the electronic equipment including it
CN109461173A (en) * 2018-10-25 2019-03-12 天津师范大学 A kind of Fast Corner Detection method for the processing of time-domain visual sensor signal
CN111031266A (en) * 2019-12-31 2020-04-17 中国人民解放军国防科技大学 Method, system and medium for filtering background activity noise of dynamic visual sensor based on hash function
WO2020170861A1 (en) * 2019-02-21 2020-08-27 ソニーセミコンダクタソリューションズ株式会社 Event signal detection sensor and control method
US20200410272A1 (en) * 2019-06-26 2020-12-31 Samsung Electronics Co., Ltd. Vision sensor, image processing device including the vision sensor, and operating method of the vision sensor

Also Published As

Publication number Publication date
CN117044221A (en) 2023-11-10

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
    Ref document number: 21929598
    Country of ref document: EP
    Kind code of ref document: A1
WWE Wipo information: entry into national phase
    Ref document number: 202180095592.8
    Country of ref document: CN
NENP Non-entry into the national phase
    Ref country code: DE
122 Ep: pct application non-entry in european phase
    Ref document number: 21929598
    Country of ref document: EP
    Kind code of ref document: A1