http://imagej.273.s1.nabble.com/Questions-regarding-multithreaded-processing-tp5016878p5016879.html
> Dear all,
> I’ve assembled a plugin to analyze a time series on a pixel-by-pixel basis.
> It works fine but is slow.
> There are likely still plenty of optimizations that could improve the
> speed, and thanks to Albert Cardona and Stephan Preibisch sharing code
> and tutorials (http://albert.rierol.net/imagej_programming_tutorials.html),
> I even have a version that runs multi-threaded.
> When run on multi-core machines the speed is improved, but I’m not sure what
> sort of improvement I should expect. Moreover, the machines I expected to
> be the fastest are not. This likely stems from my misunderstanding of
> parallel processing and Java programming in general so I’m hoping some of
> you with more experience can provide some feedback.
> I list below some observations and questions along with test runs on the
> same data set using the same plugin on a few different machines.
> Thanks for any suggestions.
> George
>
>
> Since the processing speeds differ, I realize the time each machine needs to
> complete the analysis will differ. I’m more interested in the improvement
> from multiple threads on an individual machine.
> In running these tests, I altered the code to use a different number of
> threads in each run.
> Is setting the number of threads in the code and determining the time to
> finish the analysis a valid approach to testing improvement?
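Timing the same job at several fixed thread counts is a reasonable way to measure scaling, provided each count is run more than once and the first run is discarded, since the JIT compiler is still warming up at that point. Below is a minimal sketch of such a harness. The class name, `analyzePixel`, and the image dimensions are hypothetical stand-ins for the actual plugin code, not anything taken from it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ThreadScalingBenchmark {

    // Hypothetical stand-in for the per-pixel time-series analysis.
    static double analyzePixel(int pixelIndex, int nFrames) {
        double sum = 0;
        for (int t = 0; t < nFrames; t++) {
            sum += Math.sin(pixelIndex * 0.001 + t);
        }
        return sum;
    }

    // Splits the pixels into one contiguous chunk per thread, waits for all
    // chunks to finish, and returns the elapsed wall-clock time in milliseconds.
    static double timeRun(int nThreads, int nPixels, int nFrames) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        double[] result = new double[nPixels];
        long start = System.nanoTime();
        try {
            List<Future<?>> futures = new ArrayList<>();
            int chunk = (nPixels + nThreads - 1) / nThreads;
            for (int i = 0; i < nThreads; i++) {
                final int from = Math.min(nPixels, i * chunk);
                final int to = Math.min(nPixels, from + chunk);
                futures.add(pool.submit(() -> {
                    for (int p = from; p < to; p++) {
                        result[p] = analyzePixel(p, nFrames);
                    }
                }));
            }
            for (Future<?> f : futures) {
                f.get(); // blocks until the chunk is done, rethrows failures
            }
        } finally {
            pool.shutdown();
        }
        return (System.nanoTime() - start) / 1e6;
    }

    public static void main(String[] args) throws Exception {
        int nPixels = 200_000; // assumed image size, for illustration only
        int nFrames = 100;     // assumed series length, for illustration only
        timeRun(1, nPixels, nFrames); // warm-up run, result discarded
        int[] threadCounts = {1, 2, 4, 8};
        for (int n : threadCounts) {
            double best = Double.MAX_VALUE;
            for (int rep = 0; rep < 3; rep++) {
                best = Math.min(best, timeRun(n, nPixels, nFrames));
            }
            System.out.printf("threads=%d  best of 3: %.1f ms%n", n, best);
        }
    }
}
```

Reporting the best (or median) of several repeats makes the numbers less sensitive to whatever else the machine happens to be doing during a single run.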
>
> Machine 5 is producing some odd behavior which I’ll discuss and ask for
> suggestions below.
>
> For machines 1-4, the speed improves with the number of threads up to about
> half the number of available processors.
> Do the improvements with the number of threads listed below seem reasonable?
> Is the improvement up to only about half the number of available processors
> due to “hyperthreading”? My limited (and probably wrong) understanding is
> that hyperthreading makes a single core appear to be two which share
> resources and thus a machine with 2 cores will return 4 when queried for
> number of cpus. Yes, I know that is too simplistic, but it’s the best I can
> do.
> Could it simply be that my code is not written properly to take advantage of
> hyperthreading? Could anyone point me to a source and/or example code
> explaining how I could change it to take advantage of hyperthreading if this
> is the problem?
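For reference, `Runtime.getRuntime().availableProcessors()` reports the processors available to the JVM, which on a hyperthreaded machine is normally the logical count, not the physical core count. The standard Java API has no call that returns physical cores, so the practical approach is to time a few thread counts up to the logical count and see where the curve flattens. A small sketch (the class and method names are made up for illustration):

```java
import java.util.ArrayList;
import java.util.List;

public class ThreadCountCandidates {

    // Thread counts worth timing: powers of two below the logical count,
    // followed by the logical count itself.
    static List<Integer> candidates(int logical) {
        List<Integer> counts = new ArrayList<>();
        for (int n = 1; n < logical; n *= 2) {
            counts.add(n);
        }
        counts.add(logical);
        return counts;
    }

    public static void main(String[] args) {
        int logical = Runtime.getRuntime().availableProcessors();
        System.out.println("Logical processors reported to the JVM: " + logical);
        System.out.println("Thread counts to time: " + candidates(logical));
    }
}
```

If the timings stop improving at roughly half the reported count, that is consistent with two hardware threads sharing one core's execution units, though memory bandwidth can produce a similar plateau.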
>
> Number of threads used are shown in parentheses where applicable.
> 1. MacBook Pro 2.66 GHz Intel Core i7
> number of processors: 1
> Number of cores: 2
> non-threaded plugin version ~59 sec
> threaded (1) ~51 sec
> threaded (2) ~36 sec
> threaded (3) ~34 sec
> threaded (4) ~35 sec
>
> 2. Mac Pro 2 x 2.26 GHz Quad-Core Intel Xeon
> number of processors: 2
> Number of cores: 8
> non-threaded plugin version ~60 sec
> threaded (1) ~59 sec
> threaded (2) ~28.9 sec
> threaded (4) ~15.6 sec
> threaded (6) ~13.2 sec
> threaded (8) ~11.3 sec
> threaded (10) ~11.1 sec
> threaded (12) ~11.1 sec
> threaded (16) ~11.5 sec
>
> 3. Windows 7 DELL 3.2 GHz Intel Core i5
> number of cpus shown in resource monitor: 4
> non-threaded plugin version ~45.3 sec
> threaded (1) ~48.3 sec
> threaded (2) ~21.7 sec
> threaded (3) ~20.4 sec
> threaded (4) ~21.8 sec
>
> 4. Windows 7 Xi MTower 2P64 Workstation 2 x 2.1 GHz AMD Opteron 6272
> number of cpus shown in resource monitor: 32
> non-threaded plugin version ~162 sec
> threaded (1) ~158 sec
> threaded (2) ~85.1 sec
> threaded (4) ~46 sec
> threaded (8) ~22.9 sec
> threaded (10) ~18.6 sec
> threaded (12) ~16.4 sec
> threaded (16) ~15.8 sec
> threaded (20) ~15.7 sec
> threaded (24) ~15.9 sec
> threaded (32) ~16 sec
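One way to judge whether these numbers are reasonable is to convert them to speedup (one-thread time divided by n-thread time) and parallel efficiency (speedup divided by n). The sketch below does that for the machine 2 timings quoted above. It also inverts Amdahl's law, S = 1 / ((1 - p) + p / n), to estimate the parallel fraction p. That model is an assumption brought in for illustration; it ignores hyperthreading and memory effects, so treat the estimate as a rough guide only.

```java
public class SpeedupTable {

    // Speedup relative to the single-threaded run.
    static double speedup(double t1, double tn) {
        return t1 / tn;
    }

    // Fraction of ideal linear scaling achieved with n threads.
    static double efficiency(double t1, double tn, int n) {
        return speedup(t1, tn) / n;
    }

    // Amdahl's law solved for the parallel fraction p; requires n > 1.
    static double parallelFraction(double t1, double tn, int n) {
        double s = speedup(t1, tn);
        return (1.0 - 1.0 / s) / (1.0 - 1.0 / n);
    }

    public static void main(String[] args) {
        // Machine 2 (Mac Pro) timings from the post, in seconds.
        int[] threads = {1, 2, 4, 6, 8, 10, 12, 16};
        double[] seconds = {59, 28.9, 15.6, 13.2, 11.3, 11.1, 11.1, 11.5};
        double t1 = seconds[0];
        for (int i = 0; i < threads.length; i++) {
            System.out.printf("threads=%2d  speedup=%.2f  efficiency=%.2f%n",
                    threads[i],
                    speedup(t1, seconds[i]),
                    efficiency(t1, seconds[i], threads[i]));
        }
        System.out.printf("Amdahl estimate of p from the 8-thread run: %.2f%n",
                parallelFraction(t1, seconds[4], threads[4]));
    }
}
```

Efficiency close to 1 at low thread counts followed by a flat speedup curve suggests the parallel code itself is fine and that some shared resource becomes the limit.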
>
> For machines 1-4, the cpu usage can be observed in the Activity Monitor
> (Mac) or Resource Monitor (Windows) and during the execution of the plugin
> all of the cpus were active. For machine 5 shown below, only 22 of the 64
> show activity. And it is not always the same 22. From the example runs
> below you can see it really isn’t performing very well considering the
> number of available cores. I originally thought this machine should be the
> best, but it barely outperforms my laptop. This is probably a question for
> another forum, but I am wondering if anyone else has encountered anything
> similar.
>
> 5. Windows Server 2012 Xi MTower 2P64 Workstation 4 x 2.4 GHz AMD Opteron 6378
> number of cpus shown in resource monitor: 64
> non-threaded plugin version ~140 sec
> threaded (1) ~137 sec
> threaded (4) ~60.3 sec
> threaded (8) ~29.3 sec
> threaded (12) ~22.9 sec
> threaded (16) ~23.8 sec
> threaded (24) ~24.1 sec
> threaded (32) ~24.5 sec
> threaded (40) ~24.8 sec
> threaded (48) ~23.8 sec
> threaded (64) ~24.8 sec
>