SIFT (Scale-Invariant Feature Transform)

Naranjito 2025. 4. 3. 17:52

SIFT is a method to:

- Detect keypoints (interesting, stable points in the image)

- Describe them in a way that is invariant to scale and rotation

Scale Invariance

- SIFT is robust to images taken at different zoom levels (scales).
- It achieves this by detecting keypoints at multiple scales rather than at a single image resolution.

Detector: Difference of Gaussian (DoG)

D (x, y, σ) = L (x, y, k σ) - L (x, y, σ)

- L(x,y,σ) : the image blurred with a Gaussian filter of scale σ

- k : a constant multiplicative factor that separates adjacent scale levels

- Subtracting two Gaussian-blurred images at adjacent scales produces the DoG image
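The definition above can be sketched with NumPy alone. This is a minimal illustration, not a reference implementation: the helper names (`gaussian_kernel_1d`, `gaussian_blur`, `difference_of_gaussians`), the 3σ kernel truncation, the edge-replicated borders, and the values σ = 1.6 and k = √2 are all assumptions chosen for the example.

```python
import numpy as np

def gaussian_kernel_1d(sigma):
    """Return a normalized 1-D Gaussian kernel truncated at about 3 sigma (assumed cutoff)."""
    radius = max(1, int(np.ceil(3 * sigma)))
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return kernel / kernel.sum()

def gaussian_blur(image, sigma):
    """Compute L(x, y, sigma) with a separable Gaussian filter and edge-replicated borders."""
    kernel = gaussian_kernel_1d(sigma)
    radius = len(kernel) // 2
    padded = np.pad(image.astype(np.float64), radius, mode="edge")
    # Filter along rows, then along columns; 'valid' trims the padding back off.
    rows = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)

def difference_of_gaussians(image, sigma, k=2 ** 0.5):
    """D(x, y, sigma) = L(x, y, k * sigma) - L(x, y, sigma)."""
    return gaussian_blur(image, k * sigma) - gaussian_blur(image, sigma)

# Synthetic input: a small bright square on a dark background.
image = np.zeros((41, 41))
image[18:23, 18:23] = 1.0
dog = difference_of_gaussians(image, sigma=1.6)
```

With this sign convention the response at the centre of a bright blob is negative, because the wider blur spreads the blob's intensity over a larger area. A perfectly flat image gives a DoG of zero everywhere.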

- In the scale-space pyramid diagram, each row of rectangles shows the same input image at different blur levels (σ).

- The bottom row is the first octave (original resolution).

- The rows above are the next octaves, each downsampled by a factor of 2 relative to the row below.

- Within each octave, the image is blurred incrementally using increasing values of the Gaussian σ (standard deviation).

- Each oval between a pair of rectangles computes the Difference of Gaussians:

D (x, y, σ) = L (x, y, k σ) - L (x, y, σ)

- This subtracts two blurred images at slightly different levels of blur, resulting in DoG images.

- Once a set of DoG images is computed for one resolution (octave), the image is downsampled by a factor of 2 (half the width and height), and the process repeats:

- New blurred versions are created at that smaller size

- More DoG images are computed

- This helps SIFT detect features that may appear at different real-world sizes (scales).
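The octave loop described above might look roughly like the sketch below. It is simplified on purpose: every level is blurred directly from the octave's base image, and the base image itself is subsampled for the next octave, whereas production implementations typically blur incrementally and reuse an already-blurred level. The function names and the defaults (3 octaves, 5 blur levels, σ0 = 1.6, k = √2) are assumptions for illustration.

```python
import numpy as np

def gaussian_blur(image, sigma):
    """Separable Gaussian blur with edge-replicated borders (kernel truncated at ~3 sigma)."""
    radius = max(1, int(np.ceil(3 * sigma)))
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-(x ** 2) / (2 * sigma ** 2))
    kernel /= kernel.sum()
    padded = np.pad(image.astype(np.float64), radius, mode="edge")
    rows = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)

def build_dog_pyramid(image, num_octaves=3, levels_per_octave=5, sigma0=1.6, k=2 ** 0.5):
    """Return one DoG stack of shape (levels_per_octave - 1, H, W) per octave."""
    pyramid = []
    base = image.astype(np.float64)
    for _ in range(num_octaves):
        # Blur the octave's base image at geometrically increasing scales.
        blurred = [gaussian_blur(base, sigma0 * k ** i) for i in range(levels_per_octave)]
        # Subtract adjacent blur levels to get the DoG images of this octave.
        dog = np.stack([blurred[i + 1] - blurred[i] for i in range(levels_per_octave - 1)])
        pyramid.append(dog)
        # Halve the resolution by keeping every second row and column.
        base = base[::2, ::2]
    return pyramid

rng = np.random.default_rng(0)
image = rng.random((64, 64))
pyramid = build_dog_pyramid(image)
shapes = [dog.shape for dog in pyramid]
```

For a 64×64 input, `shapes` is `[(4, 64, 64), (4, 32, 32), (4, 16, 16)]`: five blur levels give four DoG images per octave, and each octave has half the width and height of the previous one.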

Input image
 ↓
Apply Gaussian blur repeatedly → [Gaussian Pyramid]
 ↓
Subtract adjacent Gaussians → [DoG images]
 ↓
Find keypoints as local extrema in DoG stack
 ↓
Downsample and repeat for next octave
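The "local extrema in DoG stack" step can be sketched as follows, assuming the usual comparison of each sample against its 26 neighbours in the 3×3×3 cube spanning the current and the two adjacent DoG levels. The function name `find_local_extrema` and the `threshold` parameter are hypothetical, and the plain Python loops favour readability over speed.

```python
import numpy as np

def find_local_extrema(dog_stack, threshold=0.0):
    """Return (level, row, col) triples that are strict extrema among their 26 neighbours."""
    levels, height, width = dog_stack.shape
    keypoints = []
    for s in range(1, levels - 1):          # skip first/last level: no neighbour level there
        for y in range(1, height - 1):
            for x in range(1, width - 1):
                value = dog_stack[s, y, x]
                if abs(value) <= threshold:  # ignore weak responses
                    continue
                cube = dog_stack[s - 1:s + 2, y - 1:y + 2, x - 1:x + 2]
                # Index 13 is the centre of the flattened 3x3x3 cube; drop it.
                neighbours = np.delete(cube.ravel(), 13)
                if value > neighbours.max() or value < neighbours.min():
                    keypoints.append((s, y, x))
    return keypoints

# Toy DoG stack with one maximum and one minimum on the middle level.
stack = np.zeros((3, 7, 7))
stack[1, 3, 3] = 5.0
stack[1, 1, 5] = -2.0
keypoints = find_local_extrema(stack)
```

On this toy stack the result is `[(1, 1, 5), (1, 3, 3)]`. A full detector would usually go on to refine each location and discard low-contrast or edge-like candidates, which this sketch leaves out.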

Divide the Patch Around the Keypoint

- Around each keypoint, take a 16×16 pixel window.

- Split it into 4×4 = 16 cells (each is 4×4 pixels).

- In each cell, compute a histogram of gradient orientations (8 bins).

- So each 4×4-pixel cell becomes an 8-element orientation histogram.
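A per-cell histogram could be computed along these lines. The sketch assumes that each pixel votes with its gradient magnitude and that gradients come from `np.gradient` (central differences); the notes above only specify "8 bins", so both choices, and the name `orientation_histogram`, are illustrative assumptions.

```python
import numpy as np

def orientation_histogram(cell, num_bins=8):
    """Histogram of gradient orientations in a cell, weighted by gradient magnitude."""
    gy, gx = np.gradient(cell.astype(np.float64))   # derivatives along rows, then columns
    magnitude = np.hypot(gx, gy)
    angle = np.arctan2(gy, gx) % (2 * np.pi)        # orientation in [0, 2*pi)
    bins = np.floor(angle / (2 * np.pi) * num_bins).astype(int) % num_bins
    hist = np.zeros(num_bins)
    np.add.at(hist, bins.ravel(), magnitude.ravel())  # accumulate one vote per pixel
    return hist

# A 4x4 cell whose intensity increases from left to right: every gradient points along +x.
cell = np.tile(np.arange(4.0), (4, 1))
hist = orientation_histogram(cell)
```

Here all 16 pixels have a unit gradient pointing in the +x direction, so the whole weight lands in bin 0 and `hist` is `[16, 0, 0, 0, 0, 0, 0, 0]`.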

Descriptor Size

- There are 16 cells (a 4×4 grid)

- Each cell contributes 8 orientation values

- Resulting in:

16×8=128

- The descriptor is computed relative to the keypoint’s dominant gradient orientation, so the sampling grid and the gradient angles are both rotated to align with it

- This ensures that the same patch rotated in a different image will still match.
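The sketch below puts the pieces together: it builds a 16 × 8 = 128-element descriptor and measures every gradient angle relative to the patch's dominant orientation. Several details are assumptions rather than something stated above: the dominant orientation is taken as the peak of a 36-bin magnitude-weighted histogram, the final vector is normalized to unit length, and the sampling grid itself is not rotated. Gaussian weighting and interpolation between bins are omitted.

```python
import numpy as np

def gradient_polar(patch):
    """Return gradient magnitude and orientation in [0, 2*pi) for every pixel."""
    gy, gx = np.gradient(patch.astype(np.float64))
    return np.hypot(gx, gy), np.arctan2(gy, gx) % (2 * np.pi)

def dominant_orientation(magnitude, angle, num_bins=36):
    """Centre angle of the strongest bin in a magnitude-weighted orientation histogram."""
    bins = np.floor(angle / (2 * np.pi) * num_bins).astype(int) % num_bins
    hist = np.zeros(num_bins)
    np.add.at(hist, bins.ravel(), magnitude.ravel())
    return (np.argmax(hist) + 0.5) * 2 * np.pi / num_bins

def describe_patch(patch, num_bins=8):
    """128-element descriptor for a 16x16 patch: a 4x4 grid of cells with 8 bins each."""
    assert patch.shape == (16, 16)
    magnitude, angle = gradient_polar(patch)
    # Measure orientations relative to the dominant one.
    relative = (angle - dominant_orientation(magnitude, angle)) % (2 * np.pi)
    bins = np.floor(relative / (2 * np.pi) * num_bins).astype(int) % num_bins
    grid = np.zeros((4, 4, num_bins))
    for y in range(16):
        for x in range(16):
            grid[y // 4, x // 4, bins[y, x]] += magnitude[y, x]
    descriptor = grid.ravel()
    norm = np.linalg.norm(descriptor)
    return descriptor / norm if norm > 0 else descriptor

rng = np.random.default_rng(0)
patch = rng.random((16, 16))
desc = describe_patch(patch)
desc_rot = describe_patch(np.rot90(patch))   # same patch rotated by 90 degrees

# The per-cell histograms match once the 4x4 cell grid is rotated the same way.
cells_match = np.allclose(np.rot90(desc.reshape(4, 4, 8)), desc_rot.reshape(4, 4, 8))
```

Because angles are measured relative to the dominant orientation, each cell's histogram is unchanged by the 90° rotation; only the position of the cells in the grid changes. Rotating the sampling grid as well, which this sketch does not do, would remove that remaining difference.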

Feature	SIFT (Scale-Invariant Feature Transform)	SURF (Speeded Up Robust Features)	KAZE (Nonlinear Scale Space)
Scale Space	Gaussian pyramid (linear scale space)	Box filters of increasing size on an integral image (approximates linear scale space)	Nonlinear diffusion (preserves edges and boundaries)
Detector	Difference of Gaussian (DoG)	Hessian matrix (approximated by box filters)	Determinant of Hessian matrix in nonlinear space
Descriptor	Gradient orientation histograms	Haar wavelet responses	First-order image derivatives
Rotation Invariance	Yes	Yes	Yes
Illumination Invariance	Normalized gradient magnitude	Normalized Haar responses	Normalized first-order derivatives
Computation Speed	Slower	Faster than SIFT	Slower (due to nonlinear PDE solving)
Robustness	Good for scale and rotation	Good, faster alternative to SIFT	Better at preserving edge structures and boundaries
Descriptor Dimension	128	64	64
Key Innovation	Scale-space extrema + gradient histograms	Integral image + box filters for speed	Nonlinear scale space via anisotropic diffusion

https://blog.naver.com/pig9456/223476127316
