I am trying to learn about image stitching using cross correlation.
Let say that I have two images A and B that I am trying to stitch. (you may need to switch to fixed width font to make these little text pictures work for you) ------- ------- | | | | | A | and | B | | | | | ------- ------- Considering a shift just in the X dimension, the maximum shift which would still have a smallest overlap would be if B is shifted (slightly less than) the width of A to the right of A, or if B is shifted (slightly less than) the width of B to the left of A. -------------- | || | | A || B | | || | -------------- -------------- | || | | B || A | | || | -------------- Let A and B both be exactly 512 by 512 pixels, that is of equal size and a power of 2 in both X and Y, so the images will need no padding when their FHT transform is created. This means that the FFT(A) and FFT(B) are also exactly 512x512. Further, using the FFDMath to calculate the inverse transform of the Correlation between FFT(A) and FFT(B) results in an image also of 512x512. But, since in our case the shift could be anywhere in the range from -511 to + 511, how can this be represented by the location of maxima drawn from an image with an X and Y size of 512? I have tried padding both my images to a size which is the next larger power of two, and for my test data this does resolve the larger shifts, however this hits performance dramatically and I suspect is not the best way to handle this. Anyone able to point me in the right direction? I understand enough of the mathematics to know that my grasp on the underlying maths of Fourier transforms is very rudimentary and shaky, so I am looking for practical advice on how to utilise the plugins in core ImageJ.jar to best solve the problem of aligning two images. Many thanks — Michael Ellis -- ImageJ mailing list: http://imagej.nih.gov/ij/list.html |
Good day Michael!
"This means that the FFT(A) and FFT(B) are also exactly 512x512. Further, using the FFDMath to calculate the inverse transform of the Correlation between FFT(A) and FFT(B) results in an image also of 512x512." That's to the point and a problem with this implementation of the correlation function. "I have tried padding both my images to a size which is the next larger power of two, and for my test data this does resolve the larger shifts, however this hits performance dramatically and I suspect is not the best way to handle this." That's the way to go but could please tell us what "hits performance dramatically" exactly means for you. Of course, a factor 4 in area is a lot but CPUs are fast these days... Computation here of the correlation function of two 1024x1024 images is performed in the blink of an eye. Padding is done by "Image >> Adjust >> Canvas Size..." run("Canvas Size...", "width=1024 height=1024 position=Center zero"); Regards Herbie ::::::::::::::::::::::::::::::::::::::::::: Am 26.08.17 um 12:53 schrieb Michael Ellis: > I am trying to learn about image stitching using cross correlation. > > Let say that I have two images A and B that I am trying to stitch. > > (you may need to switch to fixed width font to make these little text pictures work for you) > > ------- ------- > | | | | > | A | and | B | > | | | | > ------- ------- > > Considering a shift just in the X dimension, the maximum shift which would still have a smallest overlap would be if B is shifted (slightly less than) the width of A to the right of A, > or if B is shifted (slightly less than) the width of B to the left of A. > > -------------- > | || | > | A || B | > | || | > -------------- > -------------- > | || | > | B || A | > | || | > -------------- > > Let A and B both be exactly 512 by 512 pixels, that is of equal size and a power of 2 in both X and Y, so the images will need no padding when their FHT transform is created. > > This means that the FFT(A) and FFT(B) are also exactly 512x512. Further, using the FFDMath to calculate the inverse transform of the Correlation between FFT(A) and FFT(B) results in an image also of 512x512. > > But, since in our case the shift could be anywhere in the range from -511 to + 511, how can this be represented by the location of maxima drawn from an image with an X and Y size of 512? > > I have tried padding both my images to a size which is the next larger power of two, and for my test data this does resolve the larger shifts, however this hits performance dramatically and I suspect is not the best way to handle this. > > Anyone able to point me in the right direction? I understand enough of the mathematics to know that my grasp on the underlying maths of Fourier transforms is very rudimentary and shaky, so I am looking for practical advice on how to utilise the plugins in core ImageJ.jar to best solve the problem of aligning two images. > > Many thanks — Michael Ellis > > -- > ImageJ mailing list: http://imagej.nih.gov/ij/list.html > -- ImageJ mailing list: http://imagej.nih.gov/ij/list.html |
Hi,
we also explain our approach in quite some detail here: https://fly.mpi-cbg.de/~saalfeld/Publications/download/imagejconf2008a.pdf and a bit more abstract here: https://academic.oup.com/bioinformatics/article/25/11/1463/332497/Globally-optimal-stitching-of-tiled-3D-microscopic It is implemented in the ImageJ plugin Image Stitching (http://imagej.net/Image_Stitching) There are some little tricks to really getting the right translation out. It is often a good idea to pad images anyways because of the periodicity of the fourier space. Hope this helps a bit, Stephan --- Dr. Stephan Preibisch Group Leader Berlin Institute for Medical Systems Biology (BIMSB), Max Delbrück Center for Molecular Medicine (MDC) Building 89, 1.08b email: [hidden email]<mailto:[hidden email]> web: http://preibischlab.mdc-berlin.de On 26. Aug 2017, at 13:17, Herbie <[hidden email]<mailto:[hidden email]>> wrote: Good day Michael! "This means that the FFT(A) and FFT(B) are also exactly 512x512. Further, using the FFDMath to calculate the inverse transform of the Correlation between FFT(A) and FFT(B) results in an image also of 512x512." That's to the point and a problem with this implementation of the correlation function. "I have tried padding both my images to a size which is the next larger power of two, and for my test data this does resolve the larger shifts, however this hits performance dramatically and I suspect is not the best way to handle this." That's the way to go but could please tell us what "hits performance dramatically" exactly means for you. Of course, a factor 4 in area is a lot but CPUs are fast these days... Computation here of the correlation function of two 1024x1024 images is performed in the blink of an eye. Padding is done by "Image >> Adjust >> Canvas Size..." run("Canvas Size...", "width=1024 height=1024 position=Center zero"); Regards Herbie ::::::::::::::::::::::::::::::::::::::::::: Am 26.08.17 um 12:53 schrieb Michael Ellis: I am trying to learn about image stitching using cross correlation. Let say that I have two images A and B that I am trying to stitch. (you may need to switch to fixed width font to make these little text pictures work for you) ------- ------- | | | | | A | and | B | | | | | ------- ------- Considering a shift just in the X dimension, the maximum shift which would still have a smallest overlap would be if B is shifted (slightly less than) the width of A to the right of A, or if B is shifted (slightly less than) the width of B to the left of A. -------------- | || | | A || B | | || | -------------- -------------- | || | | B || A | | || | -------------- Let A and B both be exactly 512 by 512 pixels, that is of equal size and a power of 2 in both X and Y, so the images will need no padding when their FHT transform is created. This means that the FFT(A) and FFT(B) are also exactly 512x512. Further, using the FFDMath to calculate the inverse transform of the Correlation between FFT(A) and FFT(B) results in an image also of 512x512. But, since in our case the shift could be anywhere in the range from -511 to + 511, how can this be represented by the location of maxima drawn from an image with an X and Y size of 512? I have tried padding both my images to a size which is the next larger power of two, and for my test data this does resolve the larger shifts, however this hits performance dramatically and I suspect is not the best way to handle this. Anyone able to point me in the right direction? I understand enough of the mathematics to know that my grasp on the underlying maths of Fourier transforms is very rudimentary and shaky, so I am looking for practical advice on how to utilise the plugins in core ImageJ.jar to best solve the problem of aligning two images. Many thanks — Michael Ellis -- ImageJ mailing list: http://imagej.nih.gov/ij/list.html -- ImageJ mailing list: http://imagej.nih.gov/ij/list.html -- ImageJ mailing list: http://imagej.nih.gov/ij/list.html |
Dear Stephan,
Many thanks - I will go and follow those links! Some learning to be done! — Michael > On 26 Aug 2017, at 12:29, [hidden email] <[hidden email]> wrote: > > Hi, > > we also explain our approach in quite some detail here: https://fly.mpi-cbg.de/~saalfeld/Publications/download/imagejconf2008a.pdf and a bit more abstract here: https://academic.oup.com/bioinformatics/article/25/11/1463/332497/Globally-optimal-stitching-of-tiled-3D-microscopic > > It is implemented in the ImageJ plugin Image Stitching (http://imagej.net/Image_Stitching) > > There are some little tricks to really getting the right translation out. It is often a good idea to pad images anyways because of the periodicity of the fourier space. > > Hope this helps a bit, > Stephan > --- > > Dr. Stephan Preibisch > Group Leader > > Berlin Institute for Medical Systems Biology (BIMSB), Max Delbrück Center for Molecular Medicine (MDC) > Building 89, 1.08b > > email: [hidden email]<mailto:[hidden email]> > web: http://preibischlab.mdc-berlin.de > > On 26. Aug 2017, at 13:17, Herbie <[hidden email]<mailto:[hidden email]>> wrote: > > Good day Michael! > > "This means that the FFT(A) and FFT(B) are also exactly 512x512. Further, using the FFDMath to calculate the inverse transform of the Correlation between FFT(A) and FFT(B) results in an image also of 512x512." > > That's to the point and a problem with this implementation of the correlation function. > > "I have tried padding both my images to a size which is the next larger power of two, and for my test data this does resolve the larger shifts, however this hits performance dramatically and I suspect is not the best way to handle this." > > That's the way to go but could please tell us what "hits performance dramatically" exactly means for you. Of course, a factor 4 in area is a lot but CPUs are fast these days... > > Computation here of the correlation function of two 1024x1024 images is performed in the blink of an eye. Padding is done by > "Image >> Adjust >> Canvas Size..." > > run("Canvas Size...", "width=1024 height=1024 position=Center zero"); > > Regards > > Herbie > > > ::::::::::::::::::::::::::::::::::::::::::: > Am 26.08.17 um 12:53 schrieb Michael Ellis: > I am trying to learn about image stitching using cross correlation. > Let say that I have two images A and B that I am trying to stitch. > (you may need to switch to fixed width font to make these little text pictures work for you) > ------- ------- > | | | | > | A | and | B | > | | | | > ------- ------- > Considering a shift just in the X dimension, the maximum shift which would still have a smallest overlap would be if B is shifted (slightly less than) the width of A to the right of A, > or if B is shifted (slightly less than) the width of B to the left of A. > -------------- > | || | > | A || B | > | || | > -------------- > -------------- > | || | > | B || A | > | || | > -------------- > Let A and B both be exactly 512 by 512 pixels, that is of equal size and a power of 2 in both X and Y, so the images will need no padding when their FHT transform is created. > This means that the FFT(A) and FFT(B) are also exactly 512x512. Further, using the FFDMath to calculate the inverse transform of the Correlation between FFT(A) and FFT(B) results in an image also of 512x512. > But, since in our case the shift could be anywhere in the range from -511 to + 511, how can this be represented by the location of maxima drawn from an image with an X and Y size of 512? > I have tried padding both my images to a size which is the next larger power of two, and for my test data this does resolve the larger shifts, however this hits performance dramatically and I suspect is not the best way to handle this. > Anyone able to point me in the right direction? I understand enough of the mathematics to know that my grasp on the underlying maths of Fourier transforms is very rudimentary and shaky, so I am looking for practical advice on how to utilise the plugins in core ImageJ.jar to best solve the problem of aligning two images. > Many thanks — Michael Ellis > -- > ImageJ mailing list: http://imagej.nih.gov/ij/list.html > > -- > ImageJ mailing list: http://imagej.nih.gov/ij/list.html > > > -- > ImageJ mailing list: http://imagej.nih.gov/ij/list.html -- ImageJ mailing list: http://imagej.nih.gov/ij/list.html |
In reply to this post by Herbie
Michael,
for shift-detection, the computation of cross-correlation functions is a brute force approach and not very clever! In fact, you extensively check the similarity of two signals for _all_ possible relative shifts. If you compute the correlation function via the FFT, then you need to enlarge (double) the signal supports in order to be able to detect large shifts. However, that's not the whole story because in general you will get a center peak indicating zero shift. You can avoid this and make the result more stringent if you subtract the signal-mean before enlarging the signal supports with zeros. You can check this by trying the following macro: (watch for mailer-introduced line-breaks) ///////////////////////////////////////////////////////////////////////// r = getBoolean( "Correlate without mean?" ); setBatchMode( true ); saveSettings(); run( "Set Measurements...", "mean redirect=None decimal=4" ); newImage( "Img-1", "8-bit random", 768, 768, 1 ); run( "Duplicate...", " " ); rename( "Img-2" ); run( "Canvas Size...", "width=512 height=512 position=Center zero" ); run( "32-bit" ); if ( r ) { getRawStatistics( N, mn ); run( "Subtract...", "value=" + d2s( mn, 9 ) ); } run( "Canvas Size...", "width=1024 height=1024 position=Center zero" ); selectWindow( "Img-1" ); run( "Canvas Size...", "width=512 height=512 position=Top-Left zero" ); run( "32-bit" ); if ( r ) { getRawStatistics( N, mn ); run( "Subtract...", "value=" + d2s( mn, 9 ) ); } run( "Canvas Size...", "width=1024 height=1024 position=Center zero" ); run( "FD Math...", "image1=Img-1 operation=Correlate image2=Img-2 result=Result do" ); getRawStatistics( N, mn, mi, mx ); run( "Divide...", "value=" + d2s( mx, 9 ) ); if ( r ) { getRawStatistics( N, mn, mi ); setMinAndMax( mi, -mi ); } else { resetMinAndMax; } run( "Find Maxima...", "noise=1 output=[Point Selection]" ); run( "Measure" ); restoreSettings; setBatchMode( false ); exit(); ///////////////////////////////////////////////////////////////////////// "I understand the concept of converting spatial info in to frequency domain, but the correlation aspect is a bit of a black box to me!" In fact, things may not appear evident and require some knowledge of properties of the Fourier-transformation. The multiplication of two (complex-valued) Fourier-spectra corresponds to the convolution-operation applied to the corresponding two signals. The convolution-operation (an integral expression) is quite similar to the correlation integral. The difference is just that the coordinates of one of the two real-valued signals are inverted. This means in the Fourier-domain that one of the two Fourier-spectra must be taken conjugate-complex, i.e. its imaginary part receives a negative sign. "Also, I am finding in my investigations that sometimes the reported maxima is reported differently (size/2 -cx) or size/2+cx) if the shift is more or less than size/2 (where size is the padded image size)?" I'm not sure whether I understand this correctly but be aware that the correlation operation is _not_ commutative, i.e. (a corr b) != (b corr a). Best Herbie ::::::::::::::::::::::::::::::::::::::::::: Am 26.08.17 um 13:29 schrieb Michael Ellis: > Herbie, > > Thanks for fast reply, especially on a weekend! > > Dramatically is indeed a very relative term here, but my image tiles > are likely to be 2048 by 2048. If padding is the right way to go then > I’ll go with that, it just seems surprising as intuitively, by eye, I > can see the relative translation required from the directly from the > images so presenting the algorithm with a whole load of padded 0’s > seems less than obvious. > > That said, my grasp of the underlying maths, as said is shaky. I > understand the concept of converting spatial info in to frequency > domain, but the correlation aspect is a bit of a black box to me! > > Also, I am finding in my investigations that sometimes the reported > maxima is reported differently (size/2 -cx) or size/2+cx) if the > shift is more or less than size/2 (where size is the padded image > size)? > > Obviously this is well understood by people brighter than me, since > the FiJi pairwise stitching algorithm gets it right every time! > > — Michael Ellis > > >> On 26 Aug 2017, at 12:17, Herbie <[hidden email]> wrote: >> >> Good day Michael! >> >> "This means that the FFT(A) and FFT(B) are also exactly 512x512. >> Further, using the FFDMath to calculate the inverse transform of >> the Correlation between FFT(A) and FFT(B) results in an image also >> of 512x512." >> >> That's to the point and a problem with this implementation of the >> correlation function. >> >> "I have tried padding both my images to a size which is the next >> larger power of two, and for my test data this does resolve the >> larger shifts, however this hits performance dramatically and I >> suspect is not the best way to handle this." >> >> That's the way to go but could please tell us what "hits >> performance dramatically" exactly means for you. Of course, a >> factor 4 in area is a lot but CPUs are fast these days... >> >> Computation here of the correlation function of two 1024x1024 >> images is performed in the blink of an eye. Padding is done by >> "Image >> Adjust >> Canvas Size..." >> >> run("Canvas Size...", "width=1024 height=1024 position=Center >> zero"); >> >> Regards >> >> Herbie >> >> >> ::::::::::::::::::::::::::::::::::::::::::: Am 26.08.17 um 12:53 >> schrieb Michael Ellis: >>> I am trying to learn about image stitching using cross >>> correlation. Let say that I have two images A and B that I am >>> trying to stitch. (you may need to switch to fixed width font to >>> make these little text pictures work for you) ------- >>> ------- | | | | | A | and | B | | | | >>> | ------- ------- Considering a shift just in the X >>> dimension, the maximum shift which would still have a smallest >>> overlap would be if B is shifted (slightly less than) the width >>> of A to the right of A, or if B is shifted (slightly less than) >>> the width of B to the left of A. -------------- | || | | >>> A || B | | || | -------------- -------------- | || >>> | | B || A | | || | -------------- Let A and B both >>> be exactly 512 by 512 pixels, that is of equal size and a power >>> of 2 in both X and Y, so the images will need no padding when >>> their FHT transform is created. This means that the FFT(A) and >>> FFT(B) are also exactly 512x512. Further, using the FFDMath to >>> calculate the inverse transform of the Correlation between FFT(A) >>> and FFT(B) results in an image also of 512x512. But, since in our >>> case the shift could be anywhere in the range from -511 to + 511, >>> how can this be represented by the location of maxima drawn from >>> an image with an X and Y size of 512? I have tried padding both >>> my images to a size which is the next larger power of two, and >>> for my test data this does resolve the larger shifts, however >>> this hits performance dramatically and I suspect is not the best >>> way to handle this. Anyone able to point me in the right >>> direction? I understand enough of the mathematics to know that my >>> grasp on the underlying maths of Fourier transforms is very >>> rudimentary and shaky, so I am looking for practical advice on >>> how to utilise the plugins in core ImageJ.jar to best solve the >>> problem of aligning two images. Many thanks — Michael Ellis -- >>> ImageJ mailing list: http://imagej.nih.gov/ij/list.html >> >> -- ImageJ mailing list: http://imagej.nih.gov/ij/list.html > > -- ImageJ mailing list: http://imagej.nih.gov/ij/list.html |
In reply to this post by Michael P Ellis
Hi Michael,
here is another way to see it: When working with Fourier transforms, you have to consider your image periodically repeated in x and y, up to inifinity: (again ASCII painting, to be viewed with fixed font) ... ------------------- | | | | ... | A | A | A | ... | | | | ------------------- | | | | ... | A | A | A | ... | | | | ------------------- ... So when calculating the correlation between A and B, both 512x512 pxl, you cannot distinguish between a shift by -10 and +502 pixels; for the periodically repeated images it is the same. In a correlation calculated without FFT, the area where the images do not overlap does not contribute. In a correlation calculated with the FFT, there is no region where the images do not overlap; there is always overlap with the periodic continuation. If the images have less than 1/2 overlap, it is therefore better to use smaller images (e.g. 256x256 on the right side of A and 256x265 on the left side of B) and do the correlation with them. This makes it faster and also makes it less likely that you get some spurious maximum from the periodic repetitions where you do not want to check for overlap. Of course, you have to roughly know how the images are displaced with respect to each other, or you try with different subimages and see where you get the best correlation. If the overlap is smaller, you can try smaller subimages than half the image size. For best results, one should do it iteratively, and in the second iteration use only these parts of the image that actually overlap according to the first result. To fill up the 2^n size needed for the FFT, it would be best to have some smooth extrapolation taking the periodic boundary conditions into account. Concerning the implementation in ImageJ: If you do Process>FFT>FD Math and correlate A with B (and do the inverse transform), the maximum is in the center (x = y = 256 for 512-pxl images) if the images are the same. The maximum shifted to the left of center if image A is shifted with respect to B to the left by up to 256 pixels. If A is shifted to the left by more than 256 pixels, the maximum jumps to the right border of the correlation output -- also the autocorrelation calculated by the FFT has periodic boundary conditions. You can use 'swap quadrants' to get the maximum for non-shifted images from the center to (0,0), but still mind the periodic boundary conditions: Now a displacement by one pixel to the left will put the maximum to (511, 0). Michael ________________________________________________________________ On 2017-08-26 12:53, Michael Ellis wrote: > I am trying to learn about image stitching using cross correlation. > > Let say that I have two images A and B that I am trying to stitch. > > (you may need to switch to fixed width font to make these little text > pictures work for you) > > ------- ------- > | | | | > | A | and | B | > | | | | > ------- ------- > > Considering a shift just in the X dimension, the maximum shift which > would still have a smallest overlap would be if B is shifted (slightly > less than) the width of A to the right of A, > or if B is shifted (slightly less than) the width of B to the left of > A. > > -------------- > | || | > | A || B | > | || | > -------------- > -------------- > | || | > | B || A | > | || | > -------------- > > Let A and B both be exactly 512 by 512 pixels, that is of equal size > and a power of 2 in both X and Y, so the images will need no padding > when their FHT transform is created. > > This means that the FFT(A) and FFT(B) are also exactly 512x512. > Further, using the FFDMath to calculate the inverse transform of the > Correlation between FFT(A) and FFT(B) results in an image also of > 512x512. > > But, since in our case the shift could be anywhere in the range from > -511 to + 511, how can this be represented by the location of maxima > drawn from an image with an X and Y size of 512? > > I have tried padding both my images to a size which is the next larger > power of two, and for my test data this does resolve the larger > shifts, however this hits performance dramatically and I suspect is > not the best way to handle this. > > Anyone able to point me in the right direction? I understand enough of > the mathematics to know that my grasp on the underlying maths of > Fourier transforms is very rudimentary and shaky, so I am looking for > practical advice on how to utilise the plugins in core ImageJ.jar to > best solve the problem of aligning two images. > > Many thanks — Michael Ellis > > -- > ImageJ mailing list: http://imagej.nih.gov/ij/list.html -- ImageJ mailing list: http://imagej.nih.gov/ij/list.html |
Free forum by Nabble | Edit this page |