A theoretical question about SIFT implementation

A theoretical question about SIFT implementation

Xin ZHOU
Hello, everyone

I am preparing a theoretical explanation of the SIFT implementation and
have two theoretical questions about it.
I wonder if I could get help here:

1st:
- Generally, to find edges, corners, or salient points, mathematically we
look for gradient (1st-derivative) extrema or Laplacian (2nd-derivative)
zero-crossing points.
My question: DoG approximates LoG closely enough that we can almost treat
it as a Laplacian filter.
Then why do we use non-maximum suppression (NMS)?
Shouldn't we be looking for zero-crossings rather than maxima?
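
(To make the distinction concrete, here is the 1-D picture I have in mind,
in LaTeX notation; the smoothed step edge and the blob are only
illustrative signals.)

For a step edge f smoothed by G_\sigma, the edge at x_0 appears as

  \frac{d}{dx}(G_\sigma * f)(x_0)\ \text{extremal}, \qquad
  \frac{d^2}{dx^2}(G_\sigma * f)(x_0) = 0 \quad \text{(zero-crossing)},

whereas for a blob of width \sigma_0 centered at x_0, the scale-normalized
response

  \sigma^2 \, \frac{d^2}{dx^2}(G_\sigma * f)(x_0)

attains an extremum over both x and \sigma near \sigma \approx \sigma_0,
with no informative zero-crossing at the blob center.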

2nd:
- When we extract SIFT descriptors, we do it with a gradient map...
This seems strange: what we have are n Gaussian maps and n-1 DoG maps,
and the latter are Laplacian maps, not gradient maps.
So if I understand correctly, DoG is used for salient-point detection.
But the descriptor is extracted from the Gaussian maps with some
gradient filter (Sobel, for example). Am I right?
If so, why? This increases the computational complexity.
Why can we not just use the values in the DoG maps (2nd-order
derivatives)?
Is the 2nd derivative mathematically less stable / robust /
representative?

I hope I have formulated the questions correctly; in the end it is all
about 1st- and 2nd-order derivatives.
I just need a reasonable explanation.

cheers, Xin

--
Mr. Xin Zhou,
Doctoral Assistant
HUG - SIM
Rue Gabrielle-Perret-Gentil 4
CH-1211 GENEVE 14

tel +41 22 37 26279
mobile +41 765 307 960

Re: A theoretical question about SIFT implementation

Stephan Saalfeld
Hi Xin,

> 1st:
> - Generally, to find edges, corners, or salient points, mathematically
> we look for gradient (1st-derivative) extrema or Laplacian
> (2nd-derivative) zero-crossing points.

You can do that, but that is not what SIFT does in general.  DoG is a
shortcut to find places of maximal signal change speed (note that this is
a second derivative and is usually referred to as curvature) in two
dimensions (not just one) that are also extrema in scale.  DoG or LoG
extrema include those that are the sum of both curvatures.
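
A minimal NumPy/SciPy sketch of that detection step, under simplifying
assumptions (a single octave, no subpixel refinement; sigma0, k, and the
contrast threshold are illustrative values, not necessarily Lowe's exact
constants):

import numpy as np
from scipy.ndimage import gaussian_filter, maximum_filter, minimum_filter

def dog_extrema(img, sigma0=1.6, k=2 ** (1 / 3), n_scales=5):
    # Build n_scales Gaussian maps and n_scales - 1 DoG maps.
    gauss = [gaussian_filter(img.astype(float), sigma0 * k ** i)
             for i in range(n_scales)]
    dog = np.stack([g2 - g1 for g1, g2 in zip(gauss, gauss[1:])])
    # A candidate keypoint is an extremum among its 26 neighbors in
    # (scale, y, x) -- an extremum in space AND in scale.
    footprint = np.ones((3, 3, 3), dtype=bool)
    is_max = dog == maximum_filter(dog, footprint=footprint)
    is_min = dog == minimum_filter(dog, footprint=footprint)
    # Illustrative contrast threshold; assumes intensities in [0, 1].
    keep = (is_max | is_min) & (np.abs(dog) > 0.01)
    return list(zip(*np.nonzero(keep)))  # (scale, y, x) triples

Because the comparison runs over scale as well as over x and y, the
detected points are extrema of the (approximate) scale-normalized
Laplacian, not zero-crossings of it.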

> My question: DoG approximates LoG closely enough that we can almost
> treat it as a Laplacian filter.

Yes.
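
The usual justification, via the heat equation (this is the argument in
Lowe's paper):

  \frac{\partial G}{\partial \sigma} = \sigma \nabla^2 G
  \quad\Longrightarrow\quad
  G(x, y, k\sigma) - G(x, y, \sigma) \approx (k - 1)\, \sigma^2 \nabla^2 G,

so a DoG map is, up to the constant factor (k - 1), the scale-normalized
Laplacian \sigma^2 \nabla^2 G, and a constant factor does not move the
extrema.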

> Then why do we use non-maximum suppression (NMS)?
> Shouldn't we be looking for zero-crossings rather than maxima?
>

The extrema found then have to be checked to confirm that they really are
the sum of two significant curvatures and not perhaps of just one; a
single strong curvature indicates an edge, which is poorly localized along
its length and therefore rejected.
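
Concretely, that check is Lowe's edge-response test on the 2x2 Hessian of
the DoG map at the detection.  A sketch with a finite-difference Hessian
(r = 10 is the threshold suggested in the paper):

import numpy as np

def passes_curvature_test(dog, y, x, r=10.0):
    # 2x2 Hessian of one DoG map by central finite differences.
    dxx = dog[y, x + 1] - 2 * dog[y, x] + dog[y, x - 1]
    dyy = dog[y + 1, x] - 2 * dog[y, x] + dog[y - 1, x]
    dxy = (dog[y + 1, x + 1] - dog[y + 1, x - 1]
           - dog[y - 1, x + 1] + dog[y - 1, x - 1]) / 4.0
    tr, det = dxx + dyy, dxx * dyy - dxy * dxy
    if det <= 0:
        return False  # principal curvatures of opposite sign: reject
    # tr^2 / det grows with the ratio of the two principal curvatures;
    # an edge (one large, one small curvature) fails this bound.
    return tr * tr / det < (r + 1) ** 2 / r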

> 2nd:

You missed orientation estimation.  From the scaled gradients around a
detection, the dominant orientation is estimated using a gradient
histogram.
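
A rough sketch of that step, assuming gauss is the Gaussian-smoothed image
at the keypoint's scale (the 36 bins and the magnitude weighting follow
the paper; the window radius and Gaussian width here are illustrative, and
the keypoint is assumed to lie away from the image border):

import numpy as np

def dominant_orientation(gauss, y, x, radius=8, n_bins=36):
    # Gradients of the Gaussian-smoothed image by pixel differences,
    # as in Lowe's paper -- no separate Sobel pass is needed.
    patch = gauss[y - radius - 1:y + radius + 2,
                  x - radius - 1:x + radius + 2]
    dy = patch[2:, 1:-1] - patch[:-2, 1:-1]
    dx = patch[1:-1, 2:] - patch[1:-1, :-2]
    mag = np.hypot(dx, dy)
    ang = np.arctan2(dy, dx) % (2 * np.pi)
    # Gaussian window centered on the keypoint.
    g = np.arange(-radius, radius + 1)
    w = np.exp(-(g[:, None] ** 2 + g[None, :] ** 2)
               / (2 * (0.5 * radius) ** 2))
    hist, _ = np.histogram(ang, bins=n_bins, range=(0, 2 * np.pi),
                           weights=mag * w)
    return (np.argmax(hist) + 0.5) * 2 * np.pi / n_bins  # radians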

> - When we extract SIFT descriptors, we do it with a gradient map...
> This seems strange: what we have are n Gaussian maps and n-1 DoG maps,
> and the latter are Laplacian maps, not gradient maps.

I do not see anything strange in that.

> So if I understand correctly, DoG is used for salient-point
> detection.

Yes, including the gradients for orientation estimation.

> But the descriptor is extracted from the Gaussian maps with some
> gradient filter (Sobel, for example). Am I right?
> If so, why? This increases the computational complexity.

No, because you already have the gradients from the previous step
(orientation estimation).
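
That is, the descriptor reuses the pixel-difference gradients of the
Gaussian-smoothed image that orientation estimation already needed; no
extra Sobel pass, and nothing is taken from the DoG maps.  A compressed
sketch of the idea (the real descriptor also rotates the grid to the
dominant orientation and applies Gaussian weighting, trilinear
interpolation, and clamping before renormalizing; border handling is
again ignored):

import numpy as np

def raw_descriptor(gauss, y, x, cell=4, n_cells=4, n_bins=8):
    # 4x4 grid of cells, 8-bin orientation histogram per cell -> 128 values.
    half = cell * n_cells // 2
    patch = gauss[y - half - 1:y + half + 1, x - half - 1:x + half + 1]
    dy = patch[2:, 1:-1] - patch[:-2, 1:-1]  # same pixel differences as before
    dx = patch[1:-1, 2:] - patch[1:-1, :-2]
    mag, ang = np.hypot(dx, dy), np.arctan2(dy, dx) % (2 * np.pi)
    desc = []
    for cy in range(n_cells):
        for cx in range(n_cells):
            m = mag[cy * cell:(cy + 1) * cell, cx * cell:(cx + 1) * cell]
            a = ang[cy * cell:(cy + 1) * cell, cx * cell:(cx + 1) * cell]
            desc.append(np.histogram(a, bins=n_bins, range=(0, 2 * np.pi),
                                     weights=m)[0])
    v = np.concatenate(desc).astype(float)
    return v / (np.linalg.norm(v) + 1e-7)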

> Why can we not just use the values in the DoG maps (2nd-order
> derivatives)?
> Is the 2nd derivative mathematically less stable / robust /
> representative?
>

Depends with respect to what.  In general, though, second derivatives do
amplify noise more than first derivatives, which is one reason
gradient-based measurements are the more robust basis for a descriptor.

> I hope I have formulated the questions correctly; in the end it is all
> about 1st- and 2nd-order derivatives.
> I just need a reasonable explanation.
>

For preparing a theoretical explanation, you should read Lowe's paper
(IJCV, 2004) in full, where all this is clearly and readably explained.

Best,
Stephan