Re: Questions regarding multithreaded processing

Posted by George Patterson on Jul 18, 2016; 3:15pm
URL: http://imagej.273.s1.nabble.com/Questions-regarding-multithreaded-processing-tp5016878p5016905.html

Hi Michael,
Thanks for the feedback.

>
> For comparing the speed with one or two Minimizer threads, this means that
> you have to compare like the following:
>
> new, one minimizer thread                                previous
>  threaded (2) ~108.9 sec                                 158 sec
>  threaded (4) ~64.5 sec                                   85.1 sec
>  threaded (8) ~35.2 sec                                   46   sec
>  threaded (16) ~17.7 sec                                  22.9 sec
>  threaded (32) ~10 sec                                    15.8 sec
>  threaded (64)                                            16 sec
>

Of course.  Sorry for the mix-up.


> So it clearly helps to use one Minimizer thread; possibly the main reason
> is avoiding the overhead of creating a Minimizer thread for each pixel and
> the accompanying synchronization between the two Minimizer threads.
>
> The table also shows that the gain with parallelization is not so bad:
> A factor of 20 from 1 to 32 threads, so the total time is not dominated by
> the 'Stop the world' events for Garbage collection or memory bandwidth.
>
>
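To illustrate the point about per-pixel thread overhead, here is a minimal sketch of the alternative: a fixed pool of long-lived worker threads, each processing a contiguous block of pixels with a single-threaded per-pixel computation. This is not the plugin's actual code; the class name PixelPool, the method processAll, and the perPixel callback are hypothetical stand-ins for whatever the plugin does per pixel.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.IntToDoubleFunction;

// Hypothetical helper: threads are created once, not once per pixel.
class PixelPool {
    /** Applies perPixel to indices 0..nPixels-1 using nThreads worker threads. */
    static double[] processAll(int nPixels, int nThreads, IntToDoubleFunction perPixel) {
        double[] result = new double[nPixels];
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            // One contiguous block of pixels per worker thread.
            int chunk = (nPixels + nThreads - 1) / nThreads;
            for (int start = 0; start < nPixels; start += chunk) {
                final int from = start;
                final int to = Math.min(nPixels, start + chunk);
                pending.add(pool.submit(() -> {
                    // Each pixel is handled sequentially inside its worker.
                    for (int p = from; p < to; p++) {
                        result[p] = perPixel.applyAsDouble(p);
                    }
                }));
            }
            // Wait for all blocks; get() also makes the workers' writes visible here.
            for (Future<?> f : pending) {
                f.get();
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
        return result;
    }
}
```

In a real plugin the perPixel callback would run the curve fit for that pixel, with the Minimizer itself limited to one thread.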


> --
>
> Concerning the curve fitting problem, "Exponential with offset", y =
> a*exp(-bx) + c:
>
> The CurveFitter eliminates two parameters (a, c) by linear regression, so
> it actually performs a one-dimensional minimization.  I guess that this
> problem is well-behaved and the Minimizer always finds the correct result
> in the first attempt. Then it is not necessary to try a second run.
>   So you can try:
>   myCurveFitter.getMinimizer().setMaxRestarts(0);
> This makes the Minimizer run only once, with no second try to make sure
> the result is correct. It also avoids a second thread.
> I would suggest that you try it and compare whether the result is the same
> (there might be tiny differences since minimization is stochastic and the
> accuracy is finite).
>
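To make the parameter elimination concrete, here is a rough sketch of the idea, not the CurveFitter's actual code. For a fixed b, the model y = a*exp(-bx) + c is linear in a and c, so they follow from an ordinary linear regression of y on exp(-bx), and only b remains to be searched. The names ExpOffsetFit, linearPart and fit are hypothetical, and a simple golden-section search stands in for ImageJ's Minimizer; it assumes the residual has a single minimum inside the bracket [lo, hi].

```java
// Hypothetical sketch: fit y = a*exp(-b*x) + c by a one-dimensional search over b.
class ExpOffsetFit {
    /**
     * For a fixed b, finds a and c by linear regression of y on exp(-b*x).
     * Returns {a, c, sum of squared residuals}.
     */
    static double[] linearPart(double[] x, double[] y, double b) {
        int n = x.length;
        double se = 0, sy = 0, see = 0, sey = 0;
        for (int i = 0; i < n; i++) {
            double e = Math.exp(-b * x[i]);
            se += e;
            sy += y[i];
            see += e * e;
            sey += e * y[i];
        }
        // Standard least-squares slope (a) and intercept (c).
        double a = (n * sey - se * sy) / (n * see - se * se);
        double c = (sy - a * se) / n;
        double ssr = 0;
        for (int i = 0; i < n; i++) {
            double r = y[i] - (a * Math.exp(-b * x[i]) + c);
            ssr += r * r;
        }
        return new double[] {a, c, ssr};
    }

    /** Golden-section search for b in [lo, hi]; returns {a, b, c}. */
    static double[] fit(double[] x, double[] y, double lo, double hi) {
        final double g = (Math.sqrt(5.0) - 1.0) / 2.0; // about 0.618
        double b1 = hi - g * (hi - lo);
        double b2 = lo + g * (hi - lo);
        double f1 = linearPart(x, y, b1)[2];
        double f2 = linearPart(x, y, b2)[2];
        while (hi - lo > 1e-10) {
            if (f1 < f2) {
                // Minimum lies in [lo, b2]; reuse b1 as the new upper interior point.
                hi = b2;
                b2 = b1;
                f2 = f1;
                b1 = hi - g * (hi - lo);
                f1 = linearPart(x, y, b1)[2];
            } else {
                // Minimum lies in [b1, hi]; reuse b2 as the new lower interior point.
                lo = b1;
                b1 = b2;
                f1 = f2;
                b2 = lo + g * (hi - lo);
                f2 = linearPart(x, y, b2)[2];
            }
        }
        double b = 0.5 * (lo + hi);
        double[] ac = linearPart(x, y, b);
        return new double[] {ac[0], b, ac[1]};
    }
}
```

The bracket plays the role of the initial guess and its uncertainty: a good guess for b means a narrow [lo, hi] and fewer evaluations.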

Some new comparisons are below to see if it is behaving as you expect.

4. Windows 7 Xi MTower 2P64 Workstation         2 x 2.1 GHz  AMD Opteron 6272
number of cpus shown in resource monitor: 32

myCurveFitter.getMinimizer().setMaxRestarts(0);
threads (1) 103.5 sec
threads (2)   55 sec
threads (4)   33.3 sec
threads (8)  18.2 sec
threads (16)  9.7 sec
threads (32)  5.6 sec

Seems to make it even faster.  I think this is what you predicted.



4. Windows 7 Xi MTower 2P64 Workstation         2 x 2.1 GHz  AMD Opteron 6272
number of cpus shown in resource monitor: 32

myCurveFitter.getMinimizer().setMaximumThreads(1);
myCurveFitter.getMinimizer().setMaxRestarts(0);
threads (1)   205.3 sec
threads (2)    107.6 sec
threads (4)     64.9 sec
threads (8)    34.9 sec
threads (16)   17.9 sec
threads (32)   10.2 sec

There is likely no reason to use both of these commands together, but it seems
to give the same as using only
myCurveFitter.getMinimizer().setMaximumThreads(1);
I included this just to see if this is expected behavior.
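
For reference, the scaling in the two tables above can be summarized as speedup = t(1 thread) / t(N threads) and efficiency = speedup / N. The small sketch below just does that arithmetic on the times I listed; the class and method names are made up for illustration.

```java
// Speedup and parallel efficiency from the wall-clock times listed above.
class SpeedupTable {
    /** Speedup relative to the single-thread run: t1 / tN. */
    static double speedup(double t1, double tN) {
        return t1 / tN;
    }

    /** Parallel efficiency: speedup divided by the number of threads. */
    static double efficiency(double t1, double tN, int threads) {
        return speedup(t1, tN) / threads;
    }

    public static void main(String[] args) {
        int[] threads = {1, 2, 4, 8, 16, 32};
        // Times in seconds, copied from the two tables above.
        double[] restartsOnly = {103.5, 55.0, 33.3, 18.2, 9.7, 5.6};
        double[] oneThreadAndRestarts = {205.3, 107.6, 64.9, 34.9, 17.9, 10.2};
        System.out.println("threads  speedup(A)  eff(A)  speedup(B)  eff(B)");
        for (int i = 0; i < threads.length; i++) {
            System.out.printf("%7d  %10.1f  %6.2f  %10.1f  %6.2f%n",
                    threads[i],
                    speedup(restartsOnly[0], restartsOnly[i]),
                    efficiency(restartsOnly[0], restartsOnly[i], threads[i]),
                    speedup(oneThreadAndRestarts[0], oneThreadAndRestarts[i]),
                    efficiency(oneThreadAndRestarts[0], oneThreadAndRestarts[i], threads[i]));
        }
    }
}
```

From 1 to 32 threads this gives a speedup of about 18.5 with setMaxRestarts(0) alone and about 20.1 with both settings, i.e. roughly 58% and 63% efficiency.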


> If it works as I expect, it should cut the time for minimization to 1/2.
> If the decrease in processing time is comparable, it would mean that
> computing time is still dominated by the Minimizer, not garbage collection
> (and the rest of processing each pixel, including memory access). If
> the speed gain is only marginal, it would indicate that optimization
> should focus on garbage collection and the non-minimizer operations per
> pixel.
>
> What you might also do to speed up the process: If you have a good guess
> for the 'b' parameter and the typical uncertainty of this guess for 'b',
> specify them in the initialParams and initialParamVariations.  E.g. if 'b'
> does not change much between neighboring pixels, use the previous value
> for initialization. The default value of the initialParamVariations for
> 'b' is 10% of the specified 'b' value.
> Don't worry about the initial 'a' and 'c' parameters and their range;
> these values will be ignored.
>

Thanks for the suggestions.  They've given me some ideas to incorporate into
the plugin.

I wasn't really expecting any miracle speed up for my plugin.  Just the
suggestions you've made have vastly improved it.

I do notice small differences in the final results between the different
versions (2 threads versus setMaximumThreads versus setMaxRestarts), but
these seem to be at the 8th and 9th decimal places.

And thanks again for all your help.
George

--
ImageJ mailing list: http://imagej.nih.gov/ij/list.html