Hi,
We acquired two very powerful computers to run Imaris software: Dell T7910 with 1 TB of DDR4 RAM, 2 Xeon E5-2699 v4 (2.2 GHz, 3.6 GHz Turbo, 22 cores, 55 MB cache memory, 145 W), 2 SSD drives of 1 TB, NVIDIA P6000.

It seems that with Java applications these computers are 10 times slower than a Dell T7910 with 64 GB of RAM and 2 Xeon E5-2630 (2.4 GHz). A 2D median filter, size=2, on a 2048*2048*6 image takes 55 s on the "very fast computer" compared to 3 s on the "slower computer"!!! I tried to reduce the memory in the Fiji options, without any change. I get the same results with Icy, which is also based on Java.

With NovaBench I got:

               E5-2699   E5-2630
Global score   3723      2982
CPU score      2498      1939

What's wrong?

Philippe

--
*Philippe Mailly*
/PhD, Research Engineer/
Imaging Core Facility
CIRB, CNRS, UMR 7241/ INSERM 1050, Collège de France
11 place Marcelin Berthelot, 75005, PARIS, FRANCE

--
ImageJ mailing list: http://imagej.nih.gov/ij/list.html
Are you sure you are comparing the same setup and the same image type?
Same version of Java?
Same version of IJ?
Same video drivers?
What OS?
How much memory and how many parallel threads have you set in the options?
Options>Memory & Threads...

What results do you get with this:
https://imagej.nih.gov/ij/source/ij/plugin/filter/Benchmark.java

I get:
ImageJ: 1.51u
OS: Linux 4.4.104-39-default
Java: 1.8.0_121, vm: 25.121-b13 Oracle Corporation
Benchmark best: 0.227
Benchmark worst: 0.26

Cheers

Gabriel
In reply to this post by Philippe Mailly
Hi Philippe,
there is definitely something wrong with a median radius=2 taking 55 seconds on 2048*2048*6 pixels. It takes about 3-4 seconds on my 2011-vintage Core i5 notebook (32-bit image, 4 GB for ImageJ).

Please check in Edit>Options>Memory&Threads whether the number of threads for parallelization is reasonable.

Is this a purely Fiji problem or does it also occur with plain ImageJ? (Maybe try with the Java that comes bundled with ImageJ.)

Which Java version and operating system do you use?

Michael

On 2018-02-09 17:16, Philippe Mailly wrote:
> [...]
In reply to this post by Gabriel Landini
With the same 2048*2048*6 image; for all tests Memory = 785748 MB, CPU = 44

ImageJ 1.51j8, Java x64 1.8.0_112
  Benchmark: 1.844 s, 1.7 M pixel/s
  Median: 1.47 s

Fiji 1.51s, Java x64 1.8.0_162
  Benchmark: 2.4 s
  Median: 45.58 s

Fiji 1.51s, bundled Java x64 1.8.0_66
  Benchmark: 2.6 s
  Median: 42.44 s

It seems to be a Fiji problem?

Philippe

On 09/02/2018 at 17:27, Gabriel Landini wrote:
> [...]
Hi,
Some new tests.

If I run Fiji in 32-bit mode, the median filter speed is similar to that on other computers.
In 64-bit mode, all 2D filters (median, mean, variance, ...) on an image stack are very slow (40 s).
The same types of filters in 3D (median 3D, mean 3D, variance 3D, ...) take only 10 s.

Philippe
Hi everyone,
after some off-list discussion and tests run by Philippe:

The problem of unexpectedly poor performance on a Xeon 44-core machine only appears in Fiji, not in plain ImageJ.
The problem is not limited to the "RankFilters" (Process>Filters>Mean, Median, Minimum, etc., everything that uses the "Circular Masks" for the neighborhood); one can also see parallelization problems in other functions, though not as severe.

From all the evidence, it seems that sometimes one core or a few cores are not available for processing for a rather long time (at least milliseconds; probably they do something else during that period). This has especially bad consequences for the current RankFilters because they expect all threads to work continuously, and eventually the other threads have to wait if one thread (core) is inactive.
(Due to optimized memory access, the RankFilters parallelization strategy has been very fast on machines like the Core i5 that were popular at the time when I developed it, but it performs poorly under such circumstances.)

So the question: What is different in Fiji vs. plain ImageJ, concerning threads?

The problems could be explained, e.g., by slower switching between threads in case one core has to handle several threads, or by some background activity that does not happen in plain ImageJ.
I suspect different Java options in the launcher of Fiji vs. plain ImageJ, but I know nothing about Fiji.

Michael

On 13/02/2018 12:01, Philippe Mailly wrote:
> [...]
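The effect Michael describes can be illustrated with a toy model. This is only a sketch under stated assumptions, not the RankFilters code: the class `LockStepModel`, the step counts and the stall lengths are hypothetical. It assumes all threads advance in lock-step, so each step lasts as long as its slowest thread; a single core that is briefly unavailable then delays every other thread.

```java
public class LockStepModel {
    /**
     * Total run time when all threads advance in lock-step:
     * a step is finished only when its slowest thread is done.
     * stepTimes[s][t] = time thread t needs for step s (arbitrary units).
     */
    static long lockStepTime(long[][] stepTimes) {
        long total = 0;
        for (long[] step : stepTimes) {
            long slowest = 0;
            for (long t : step) {
                slowest = Math.max(slowest, t);
            }
            total += slowest;
        }
        return total;
    }

    public static void main(String[] args) {
        int nThreads = 44;   // core count mentioned in the thread
        int nSteps = 100;    // hypothetical number of synchronized steps
        long[][] smooth = new long[nSteps][nThreads];
        long[][] stalled = new long[nSteps][nThreads];
        for (int s = 0; s < nSteps; s++) {
            java.util.Arrays.fill(smooth[s], 1);
            java.util.Arrays.fill(stalled[s], 1);
        }
        // Hypothetical disturbance: in every 10th step, one thread is held up for 50 units.
        for (int s = 0; s < nSteps; s += 10) {
            stalled[s][s % nThreads] = 50;
        }
        System.out.println("undisturbed: " + lockStepTime(smooth));
        System.out.println("with stalls: " + lockStepTime(stalled));
    }
}
```

In this model the cost of a stall does not shrink as threads are added, because every thread waits for the stalled one. More cores simply offer more chances that one of them is busy elsewhere.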
A thought from the peanut gallery ...
This issue is very interesting, and I think it will be worth at least a technical note, depending on what you find to be the true cause. I think it would be very cool to see a publication come directly out of the email list.

Best - Don

-----Original Message-----
From: ImageJ Interest Group [mailto:[hidden email]] On Behalf Of Michael Schmid
Sent: Thursday, February 15, 2018 9:26 AM
To: [hidden email]
Subject: Re: very very very slow process (what is different in Fiji?)

[...]
In reply to this post by Michael Schmid
Prime candidates:
* parallel garbage collection
* other JVM

Have you tried to limit the number of threads to 1 or 2 less than the maximum number of processors if the number of available processors is very large?

Thanks,
Stephan

On Thu, 2018-02-15 at 15:25 +0100, Michael Schmid wrote:
> [...]
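Stephan's suggestion of leaving one or two processors free can be sketched in a few lines of Java. This is a minimal sketch, not part of ImageJ or Fiji: the class `ThreadLimit` and the method `limitedThreads` are hypothetical names. Only `Runtime.availableProcessors()` is a standard JVM call.

```java
public class ThreadLimit {
    /**
     * Number of worker threads when 'reserve' processors are left free
     * for the JVM and the operating system; never less than one.
     */
    static int limitedThreads(int availableProcessors, int reserve) {
        return Math.max(1, availableProcessors - reserve);
    }

    public static void main(String[] args) {
        // Logical processors as seen by this JVM (may include hyper-threads).
        int processors = Runtime.getRuntime().availableProcessors();
        int threads = limitedThreads(processors, 2);
        System.out.println("processors=" + processors + ", suggested threads=" + threads);
    }
}
```

The resulting number would then be entered by hand in Edit>Options>Memory & Threads.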
Yes, as you can see on the plots.

Philippe

On 15/02/2018 at 15:43, Saalfeld, Stephan wrote:
> [...]

Attachments:
2DMedian timesFiji1.51u_44CPU.tif (293K)
2DMedian timesImageJ1.51u_44CPU.tif (293K)
3DMedian timesFiji1.51u_44CPU.tif (293K)
3DMedian timesImageJ1.51u_44CPU.tif (293K)
Thanks! Have you tried other garbage collectors?
./fiji -Xmx8g -Xincgc --
./fiji -Xmx8g -XX:+UseParallelGC --
./fiji -Xmx8g -XX:+UseConcMarkSweepGC --

The last one is my favorite for working with BDV and lazy caches, but it utilizes parallel threads; I am not sure what Fiji's defaults are at this time.

Thanks,
Stephan

On Thu, 2018-02-15 at 16:50 +0100, Philippe Mailly wrote:
> [...]
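When comparing launches with different garbage collector flags, it can help to confirm which collectors the running JVM actually uses. The following is a minimal sketch using the standard `java.lang.management` API; the class name `GcReport` is hypothetical. Run standalone, it reports on its own JVM only. To inspect Fiji, the same calls would have to run inside the Fiji process, for example from a plugin or script.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class GcReport {
    /** Names of the garbage collectors active in the current JVM. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Collector names plus how often and how long each has run so far.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
        // The options the JVM was started with (e.g. -Xmx and -XX flags).
        System.out.println("JVM arguments: "
                + ManagementFactory.getRuntimeMXBean().getInputArguments());
    }
}
```

Comparing this output between plain ImageJ and Fiji would show whether the two launchers end up with different collectors or options.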
In reply to this post by Philippe Mailly
Hi Greg,
Sorry, I forgot the legend. The macro from Michael:

//test macro
setBatchMode(true);
nTries=40;
maxThreads=30;
threads=newArray(nTries);
times=newArray(nTries);
for (i=0; i<nTries; i++) {
  nThreads=1+round((i/(nTries-1))*(i/(nTries-1))*(maxThreads-1)); // more attempts with lower n
  run("Memory & Threads...", "parallel="+nThreads+" keep run");
  newImage("Untitled", "32-bit random", 4096, 4096, 6);
  t1=getTime();
  run("Median...", "radius=2 stack");
  t2=getTime();
  threads[i] = nThreads;
  times[i] = t2-t1;
  close();
}
setBatchMode(false);
Plot.create("Median times", "Number of threads", "time(ms)", threads, times);
showMessage("Please reset Edit>Options>Memory&Threads to your favorite settings");

Plots:
2DMedian times Fiji1.51u_44CPU: tested with Fiji (ij.jar 1.51u)
2DMedian times ImageJ1.51u_44CPU: tested with ImageJ (ij.jar 1.51u)
3DMedian times Fiji1.51u_44CPU: tested with Fiji (ij.jar 1.51u)
3DMedian times ImageJ1.51u_44CPU: tested with ImageJ (ij.jar 1.51u)

Philippe

On 15/02/2018 at 16:58, Gregory Jefferis wrote:
> @Philippe
>
> What do the different plots correspond to? I see that Plot #3 takes
> off at 44 threads (the number of cores on your machine IIUC; physical
> or logical?).
>
> With Fiji have you tried starting from command line with:
>
> --default-gc
>
> Stephan also seemed to think the GC was suspicious.
>
> Best,
>
> Greg.
>
> --
> Gregory Jefferis, PhD
> Division of Neurobiology
> MRC Laboratory of Molecular Biology
> Francis Crick Avenue
> Cambridge Biomedical Campus
> Cambridge, CB2 OQH, UK
>
> http://www2.mrc-lmb.cam.ac.uk/group-leaders/h-to-m/g-jefferis
> http://jefferislab.org
> http://www.zoo.cam.ac.uk/departments/connectomics
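The thread counts tested by the macro are not evenly spaced: the formula 1+round((i/(nTries-1))^2*(maxThreads-1)) places more of the tries at low thread counts. The following Java sketch reproduces only that schedule so the spacing can be inspected. The class `ThreadSchedule` is a hypothetical name, and the sketch assumes nTries is at least 2.

```java
public class ThreadSchedule {
    /**
     * Thread counts for each try, following the macro's formula:
     * 1 + round((i/(nTries-1))^2 * (maxThreads-1)).
     * The quadratic term samples low thread counts more densely.
     */
    static int[] schedule(int nTries, int maxThreads) {
        int[] threads = new int[nTries];
        for (int i = 0; i < nTries; i++) {
            double x = (double) i / (nTries - 1);   // runs from 0 to 1
            threads[i] = 1 + (int) Math.round(x * x * (maxThreads - 1));
        }
        return threads;
    }

    public static void main(String[] args) {
        // Values used in the macro: 40 tries, up to 30 threads.
        int[] threads = schedule(40, 30);
        StringBuilder sb = new StringBuilder();
        for (int n : threads) {
            sb.append(n).append(' ');
        }
        System.out.println(sb.toString().trim());
    }
}
```

With the macro's values, the first try uses 1 thread and the last uses 30; raising maxThreads to the core count would extend the plot to the full machine.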
In reply to this post by Saalfeld, Stephan
Hi Stephan,
The computer on which I run the test has 1 TB of RAM. Should I test with -Xmx8g or with -Xmx785g, as configured in the Fiji Memory & Threads options?

Philippe

On 15/02/2018 at 17:02, Saalfeld, Stephan wrote:
> [...]
-Xmx785g
On Thu, 2018-02-15 at 17:12 +0100, Philippe Mailly wrote:
> [...]
Here the plots. However, the option -XX:+UseParallelGC -- doesn't work it can't create a java virtual machine ?
Philippe

----- Original message -----
From: "Saalfeld, Stephan" <[hidden email]>
To: [hidden email]
Sent: Thursday, 15 February 2018 17:29:03
Subject: Re: very very very slow process (what is different in Fiji?)

-Xmx785g

On Thu, 2018-02-15 at 17:12 +0100, Philippe Mailly wrote:
> Hi Stephan,
>
> The computer on which I run the test has 1 TB of RAM. Should I test with -Xmx8g or with -Xmx785g, as it is configured in the Fiji Memory & Threads options?
>
> Philippe
>
> On 15/02/2018 at 17:02, Saalfeld, Stephan wrote:
> > Thanks! Have you tried other garbage collectors?
> >
> > ./fiji -Xmx8g -Xincgc --
> > ./fiji -Xmx8g -XX:+UseParallelGC --
> > ./fiji -Xmx8g -XX:+UseConcMarkSweepGC --
> >
> > The last one is my favorite for working with BDV and lazy caches, but utilizes parallel threads, not sure what Fiji's defaults are at this time.
> >
> > Thanks,
> > Stephan
> >
> > On Thu, 2018-02-15 at 16:50 +0100, Philippe Mailly wrote:
> > > Yes as you can see on the plots.
> > >
> > > Philippe
> > >
> > > On 15/02/2018 at 15:43, Saalfeld, Stephan wrote:
> > > > Prime candidates:
> > > >
> > > > * parallel garbage collection
> > > > * other JVM
> > > >
> > > > Have you tried to limit the number of threads to 1 or 2 less than the maximum number of processors if the number of available processors is very large?
> > > >
> > > > Thanks,
> > > > Stephan
> > > >
> > > > On Thu, 2018-02-15 at 15:25 +0100, Michael Schmid wrote:
> > > > > Hi everyone,
> > > > >
> > > > > after some off-list discussion and tests run by Philippe:
> > > > >
> > > > > The problem of unexpectedly poor performance on a Xeon 44-core machine only appears in Fiji, not in plain ImageJ.
> > > > > The problem is not limited to the "RankFilters" (Process>Filters>Mean, Median, Minimum, etc., everything that uses the "Circular Masks" for the neighborhood) but one can see parallelization problems also in other functions, though not as severe.
> > > > >
> > > > > From all the evidence, it seems that sometimes one core or a few cores are not available for processing for a rather long time (at least milliseconds, probably they do something else during that period). This has especially bad consequences for the current RankFilters because they expect all threads to work continuously and eventually the other threads have to wait if one thread (core) is inactive.
> > > > > (Due to optimized memory access, the RankFilters parallelization strategy has been very fast on machines like the Core i5 that were popular at the time when I developed it, but it gets poor under such circumstances).
> > > > >
> > > > > So the question: What is different on Fiji vs. plain ImageJ, concerning threads?
> > > > >
> > > > > The problems could be explained e.g. by slower switching between threads in case one core has to handle several threads, or by some background activity that does not happen in plain ImageJ.
> > > > > I suspect different java options in the Launcher of Fiji vs. plain ImageJ, but I know nothing about Fiji.
> > > > >
> > > > > Michael
> > > > >
> > > > > On 13/02/2018 12:01, Philippe Mailly wrote:
> > > > > > Hi,
> > > > > > Some new tests.
> > > > > >
> > > > > > If I run Fiji in 32 bits the median filter speed is similar to other computers.
> > > > > > In 64 bits, all 2D filters (median, mean, variance ....) on an image stack are very slow (40s)
> > > > > > Same type of filters in 3D (median3D, mean3D, variance 3D ...) take only 10s
> > > > > >
> > > > > > Philippe

--
Philippe Mailly
PhD, CNRS Research Engineer
CIRB Imaging Facility, CNRS UMR 7241 / INSERM U 1050
Collège de France
11 Place Marcelin Berthelot, 75005, Paris, France

-- ImageJ mailing list: http://imagej.nih.gov/ij/list.html

Attachments: 2DMedian timesFiji1.51u_44CPU-Xmx780g-Xincgc.tif (293K), 2DMedian timesFiji1.51u_44CPU-Xmx780g-XXUseConcMarkSweepGC.tif (293K) |
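A side note on the garbage-collector options discussed in the post above: whichever option is passed, it can be checked from inside the running JVM which collectors are actually active and whether they ran at all. The following is a minimal sketch using the standard java.lang.management API; it was not posted in the thread, and the class name GcReport and its method describe() are made up for this illustration.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of ImageJ or Fiji): lists the active garbage collectors. */
final class GcReport {

    /** Returns one line per collector: name, collections so far, accumulated time. */
    static List<String> describe() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both getters return -1 when the JVM does not define the value.
            lines.add(gc.getName() + ": collections=" + gc.getCollectionCount()
                    + ", time=" + gc.getCollectionTime() + " ms");
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describe()) {
            System.out.println(line);
        }
    }
}
```

Calling describe() once before and once after a filter run shows whether any collection happened in between. The collector names depend on the JVM and its options, so no particular output is assumed here.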
In reply to this post by Saalfeld, Stephan
Hi everyone,
after a bit of further analysis of the Fiji vs. ImageJ performance by Philippe (supplemented by some ideas of mine):

Parallelization performance is sometimes very different between plain ImageJ and Fiji (this differs a lot between different filters/functions, and strongly depends on the size of the data set). Many times, plain ImageJ is better, sometimes Fiji. Performance often differs between ImageJ and Fiji by factors of five or more!

The garbage collector seems to make little or no difference. It seems that it usually does nothing, which is no wonder when processing half-GB image stacks on a machine with a TB of memory.
Startup options in Fiji's jvm.cfg only specify 'server' mode, but plain ImageJ also runs a 'server' JVM. So the jvm.cfg isn't responsible for the differences either.

While the Java version is the same for both, and both are Oracle Java, there is a difference between plain ImageJ and Fiji:
The java.home property is
  for ImageJ: C:\Program Files\Java\jre1.8.0_162
  for Fiji: C:\Program Files\Java\jdk1.8.0_162\jre

Also, the library paths etc. are different (pointing to somewhere in the respective java.home folder):
  java.ext.dirs
  sun.boot.library.path
  java.endorsed.dirs

Java experts out there, do you have any idea whether this could make a difference in how Java handles multithreading performance (e.g. things like scheduling different threads, etc.)?
Are there Java options to tweak it?

Michael

-- ImageJ mailing list: http://imagej.nih.gov/ij/list.html |
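For anyone wanting to repeat the java.home and library-path comparison from the post above, the properties can be dumped with a few lines of Java and the output of the two installations compared side by side. This is only a sketch: the class name JvmLocation and the "(not set)" placeholder are inventions for this example, and some of the properties may not be set on Java versions newer than the 1.8 used in the thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: collects the JVM location properties named in the thread. */
final class JvmLocation {

    static final String[] KEYS = {
        "java.home", "java.version", "java.vm.name",
        "java.ext.dirs", "sun.boot.library.path", "java.endorsed.dirs"
    };

    /** Returns the properties in a fixed order, plus the processor count seen by the JVM. */
    static Map<String, String> collect() {
        Map<String, String> props = new LinkedHashMap<>();
        for (String key : KEYS) {
            // Fallback text for properties this JVM does not define.
            props.put(key, System.getProperty(key, "(not set)"));
        }
        props.put("availableProcessors",
                  String.valueOf(Runtime.getRuntime().availableProcessors()));
        return props;
    }

    public static void main(String[] args) {
        collect().forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

Running the same class once with the java binary used by plain ImageJ and once with the one used by Fiji gives two listings that can be diffed directly.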
Hi everyone (especially Fiji experts!),
the attachment shows an example of Fiji vs. plain ImageJ parallelization performance. For this, I have already changed the RankFilters (which do the median) to be less sensitive to cases where one thread gets stuck; otherwise the Fiji performance with a large number of processors would be even worse.

In both cases, it runs at the default priority=4.

Maybe the following information provides a few more clues:
- The problem is found on a Windows 10 machine (two Xeon E5-2699 v4, 2*22 cores + hyperthreading, 1 TB)
- No such problem on a slightly slower Windows 7 machine with Xeon E5-2630v3 processors (max. 32 threads), 64 GB RAM.

It can't be a problem of the computer or Windows 10, however, because plain ImageJ works well; the poor parallelization performance occurs only under Fiji.

So again, what could be different between plain ImageJ and Fiji?
(At least in theory, thread scheduling, time slices, etc. should be done by the operating system, so the Win 10 vs. Win 7 difference should affect plain ImageJ and Fiji the same way).

Michael

-- ImageJ mailing list: http://imagej.nih.gov/ij/list.html

Attachment: 2DMedian_times_FIji_vs_ImageJ.png (27K) |
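The post above mentions making the RankFilters less sensitive to one thread getting stuck. The actual change is not shown in the thread, so the following is only an illustration of one common way to get that property, not the ImageJ code: instead of giving each thread a fixed share of the rows, every worker claims the next unprocessed row from a shared counter, so a stalled thread delays only the row it currently holds. The class DynamicRows, the method doubleRows and the trivial "double the value" operation are all made up for this sketch.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

/** Illustration only: dynamic row assignment via a shared counter. */
final class DynamicRows {

    /** Applies a stand-in per-row operation (doubling) using nThreads workers. */
    static int[] doubleRows(int[] rowValues, int nThreads) {
        int[] out = new int[rowValues.length];
        AtomicInteger nextRow = new AtomicInteger(0);    // shared work counter
        Thread[] workers = new Thread[nThreads];
        for (int t = 0; t < nThreads; t++) {
            workers[t] = new Thread(() -> {
                int row;
                // Claim rows one at a time; a slow worker never blocks the others.
                while ((row = nextRow.getAndIncrement()) < rowValues.length) {
                    out[row] = 2 * rowValues[row];       // stand-in for the real filter
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();   // join makes the workers' writes to 'out' visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [2, 4, 6, 8, 10]
        System.out.println(Arrays.toString(doubleRows(new int[] {1, 2, 3, 4, 5}, 3)));
    }
}
```

The price of this scheme is one atomic operation per row and less predictable memory access than a fixed partition, which may be why a fixed scheme was attractive on the smaller machines mentioned earlier in the thread.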
In reply to this post by Michael Schmid
Hi Michael,
I'm sorry, but I don't have much insight into the problem, nor any time to investigate it.

I agree with Stephan that the parameters used to launch the JVM are paramount. I did not see any details posted about the exact invocations of java that were used.

If I were investigating this, I would definitely want to do the benchmarks after invoking java directly, rather than via the ImageJ launcher. We want to compare apples to apples.

It would also likely be valuable to profile the code to see where the bottlenecks are, comparing plain IJ1 with IJ2. See https://imagej.net/Profiling for some approaches.

Regards,
Curtis

--
Curtis Rueden
LOCI software architect - https://loci.wisc.edu/software
ImageJ2 lead, Fiji maintainer - https://imagej.net/User:Rueden
Did you know ImageJ has a forum? http://forum.imagej.net/

-- ImageJ mailing list: http://imagej.nih.gov/ij/list.html |
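Regarding the "exact invocations" asked for in the post above: the options a JVM was really started with can be read back from inside it, which makes it possible to post them alongside the benchmark numbers. A minimal sketch with the standard RuntimeMXBean follows; the class name JvmInvocation and the layout of summary() are assumptions made for this example, not something from ImageJ or Fiji.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.RuntimeMXBean;
import java.util.List;

/** Hypothetical helper: reports how the current JVM was started. */
final class JvmInvocation {

    /** JVM options such as -Xmx or -XX flags (not the arguments of the main class). */
    static List<String> jvmArguments() {
        RuntimeMXBean runtime = ManagementFactory.getRuntimeMXBean();
        return runtime.getInputArguments();
    }

    /** Short text block suitable for pasting into a bug report or mailing-list post. */
    static String summary() {
        StringBuilder sb = new StringBuilder();
        sb.append("JVM arguments: ").append(jvmArguments()).append('\n');
        // maxMemory reflects the effective heap limit, e.g. the result of -Xmx.
        sb.append("maxMemory (bytes): ").append(Runtime.getRuntime().maxMemory());
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(summary());
    }
}
```

If the two listings (one from plain ImageJ, one from Fiji) differ, the differing options are natural first candidates to test by invoking java directly with each set.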