speed and ROIs

Jeremy Adler-2

I am trying to speed up a macro that includes working repeatedly with a large stack.
I assumed that if I operated on a selection and ran a filter, the filter would only operate on the selection and would
therefore be faster than using the whole stack.
However, when I checked the timings they were almost identical.

While it is reasonable that a selection only limits the volume operated on, it would be helpful if this translated into an increase in speed.

newImage("play", "8-bit random", 256, 256, 64);
makeRectangle(103, 77, 31, 31);
t0 = getTime();
run("Gaussian Blur...", "sigma=2 stack");
t1 = getTime();
print("time with selection", t1-t0, "msecs");
run("Select None");
t2 = getTime();
run("Gaussian Blur...", "sigma=2 stack");
t3 = getTime();
print("time without selection", t3-t2, "msecs");

It occurred to me that duplicating the volume of the selection and then filtering that would be faster, since Duplicate is fast:

newImage("play", "8-bit random", 256, 256, 64);
t0 = getTime();
run("Gaussian Blur...", "sigma=2 stack");
t1 = getTime();
print("Gau whole stack", t1-t0, "msecs");

newImage("play2", "8-bit random", 256, 256, 64);
t2 = getTime();
makeRectangle(103, 77, 11, 11);
run("Duplicate...", "duplicate");
t3 = getTime();
run("Gaussian Blur...", "sigma=2 stack");
t4 = getTime();
print("Duplicate", t3-t2, "msecs");
print("Gau with duplicated selection", t4-t3, "msecs");

But while Duplicate is fast, the time taken for the Gaussian was only marginally reduced,
which seems strange given that the duplicated image is tiny compared to the original.
What am I missing?





Jeremy Adler

--
ImageJ mailing list: http://imagej.nih.gov/ij/list.html

Re: speed and ROIs

Gabriel Landini
On Friday, 7 July 2017 08:05:19 BST you wrote:
> I am trying to speed up a macro that includes working repeatedly with a
> large stack. I assumed that if I operated on a selection and ran a filter,
> the filter would only operate on the selection and would therefore be
> faster than using the whole stack.
> However, when I checked the timings they were almost identical.
>
> While it is reasonable that a selection only limits the volume operated on,
> it would be helpful if this translated into an increase in speed.

It is twice as fast on my machine when the image is larger (e.g. 1256x1256).
I seem to remember that some filtering in IJ is done in multiple threads. Could
that make a difference?
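
This is easy to check by re-running the same timing test with a larger image; a sketch (Jeremy's first macro, with only the image size changed to 1256x1256):

newImage("play", "8-bit random", 1256, 1256, 64);
makeRectangle(103, 77, 31, 31);
t0 = getTime();
run("Gaussian Blur...", "sigma=2 stack");
t1 = getTime();
print("time with selection", t1-t0, "msecs");
run("Select None");
t2 = getTime();
run("Gaussian Blur...", "sigma=2 stack");
t3 = getTime();
print("time without selection", t3-t2, "msecs");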
Cheers

Gabriel


Re: speed and ROIs

Michael Schmid
Hi Jeremy,

part of the overhead you see in the computing times is related to the
GUI, so one should run the test in batch mode.
One should also run the macro at least twice in succession; the first
time, the Java Runtime Environment will still spend a lot of time on
optimization.
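
A minimal sketch of such a measurement, using the standard setBatchMode() macro function and one warm-up pass:

// run in batch mode, with a warm-up pass, so GUI updates and
// JIT compilation do not dominate the timings
setBatchMode(true);
newImage("play", "8-bit random", 256, 256, 64);
for (i = 0; i < 2; i++) {            // pass 0 only warms up the JIT
    t0 = getTime();
    run("Gaussian Blur...", "sigma=2 stack");
    t1 = getTime();
    if (i > 0)
        print("Gaussian Blur, pass " + i, t1 - t0, "msecs");
}
setBatchMode(false);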

Then, on my computer (Linux, 4-core i7, 8 threads working in parallel on
the stack slices) I get this for the GaussianBlur:
    60 millisec for 256x256
    21 millisec for 11x11

For the Mean filter (radius = 2) I get the following:
    32 millisec for 256x256
     2 millisec for 11x11

With one thread only (Edit>Options>Memory&Threads) I get the following
for the GaussianBlur:
   113 millisec for 256x256
     6 millisec for 11x11 -- much faster!

and for the Mean filter (r=2):
    88 millisec for 256x256
     3 millisec for 11x11

So this tells us that the multithreading (parallelization) in the
GaussianBlur is the culprit.

For stacks, the Mean Filter does multithreading by processing different
stack slices in different threads (for single images, it processes
different image parts in different threads).
The Gaussian blur does not care whether a stack is processed or not; it
always creates threads for processing image parts, and it does that even
twice (once for the x direction, once for y). Obviously the overhead of
creating 2*8*64 = 1024 threads is much more than the time required to
process the image. [In contrast to most other parts of ImageJ, the
GaussianBlur does no processing in the main thread; most other ImageJ
code would create 7 extra threads to have 8 threads working.]

This means that the issue could be solved by modifying the GaussianBlur
to use 'PARALLELIZE_STACKS' when processing stacks and in that case not
to apply any parallelization within the image.
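
In plugin terms, that would mean including PARALLELIZE_STACKS in the flags returned by setup(), so the PlugInFilterRunner calls run() for different slices from different threads; roughly like this (a hypothetical minimal filter, not the actual GaussianBlur code):

import ij.ImagePlus;
import ij.plugin.filter.PlugInFilter;
import ij.process.ImageProcessor;

// Hypothetical filter: with PARALLELIZE_STACKS set, the runner invokes
// run() on different slices in parallel, so the filter itself should
// not spawn additional worker threads within each image.
public class ParallelSliceFilter implements PlugInFilter {
    public int setup(String arg, ImagePlus imp) {
        return DOES_ALL | PARALLELIZE_STACKS;
    }
    public void run(ImageProcessor ip) {
        ip.blurGaussian(2.0);   // single-threaded work on one slice
    }
}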

One problem with this: if I have an 8-core CPU and a stack with only two
slices, but a large image size (say 4096x4096), it would be much faster to
apply parallelization within the image (all 8 cores will be used) than
running each slice in a separate thread (only two cores working). Of
course one could have some complex logic to decide on the
parallelization strategy, but this would depend on the CPU type,
operating system, etc.; also the breakeven point will change in a
complex way depending on the 'sigma' value of the Gaussian Blur.

I guess that most ImageJ users have large images (at least not something
as small as 11x11 pixels).  So any change of the parallelization
strategy should be at least something simple, and not adversely affect
the performance with large images.

Best,

Michael