Re: Parallel image processing using ImageJ in the Cloud

Posted by Vitali Khvatkov-2 on Apr 07, 2011; 4:15pm
URL: http://imagej.273.s1.nabble.com/Parallel-image-processing-using-ImageJ-in-the-Cloud-tp3685113p3685120.html

In our tests of parallel image processing with ImageJ on the Amazon cloud (as
well as on dedicated servers), the major performance hurdles were:
- Overloading HDD I/O with memory swapping (so we had to do some trickery
with RAID0 structures using Amazon EBS drives)
- High process overhead (say 10% of the load from plugin code and 90% from
ImageJ starting up and loading all its goodies, even if we don't need any)
- Lack of control over code execution within ImageJ, as mentioned by other
posters. To do the "Reduce" step we need to know when a process is done
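
A minimal sketch of the last two points, written in Python rather than
ImageJ/Java and not taken from any of the systems discussed in this thread.
It assumes the workers can be started once and reused for many images (so
startup cost is paid per worker, not per image) and that each task returns a
handle that reports completion. The names process_image and map_then_reduce
are hypothetical, and the thread pool only stands in for whatever long-lived
worker processes are actually used.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def process_image(pixels):
    """Stand-in for the per-image plugin work (hypothetical): sum the pixels."""
    return sum(pixels)


def map_then_reduce(images, workers=4):
    """Run the map step on a pool of reusable workers, then reduce.

    The pool is created once, so startup cost is paid per worker rather
    than per image. Each submit() returns a Future, which is what tells
    us when a task has finished or failed.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(process_image, px): name
                   for name, px in images.items()}
        for fut in as_completed(futures):
            # result() re-raises any exception raised in the worker,
            # so a failed map task cannot go unnoticed.
            results[futures[fut]] = fut.result()
    # Every future has completed here, so the reduce step is safe to run.
    total = sum(results.values())
    return results, total


if __name__ == "__main__":
    images = {"a.tif": [1, 2, 3], "b.tif": [10, 20], "c.tif": []}
    per_image, total = map_then_reduce(images)
    print(per_image, total)
```

The design point is only that the reduce step runs after every map task has
explicitly reported completion, instead of guessing when the processing has
finished.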

NOTE: Amazon (and the other clouds we have seen) has high-latency storage
(even EBS drives, not to mention buckets), so this has to be taken into account.

Vitali.
Smart Imaging Technologies
713-589-3500
live.simagis.com

On Thu, Apr 7, 2011 at 10:30 AM, Jimmy Su <[hidden email]> wrote:

> On Wed, Apr 6, 2011 at 2:00 PM, Gary Sellani <[hidden email]> wrote:
> > I should point out here that 8-core CPUs (AMD Bulldozer) are due in June.
> > A dual-CPU Bulldozer would be 16 cores and not depend on the cloud.
> >
> > Basically what is powering the speed increase is the parallel processing,
> > not the cloud itself.
> >
>
> Gary's point is right on.  The Cloud is just another parallel
> computing platform.  Hadoop implements the MapReduce parallel
> processing framework, which has been used successfully in several
> domains.
>
> Sometimes a problem is not CPU bound.  It may be limited by the amount
> of memory or disk on a single machine.  In those cases, having more
> cores on a single processor is not going to help.  We ran into this
> problem when we were parallelizing a decision tree training algorithm,
> which requires the training samples to be in memory.
>
> Jimmy
>
> > -----Original Message-----
> > From: Dean Kossives <[hidden email]>
> > Sender: ImageJ Interest Group <[hidden email]>
> > Date: Wed, 6 Apr 2011 16:18:23
> > To: <[hidden email]>
> > Reply-To: ImageJ Interest Group <[hidden email]>
> > Subject: Re: Parallel image processing using ImageJ in the Cloud
> >
> > Every now and then something comes along that catches my eye. A
> > reduction in processing time from 5 hours to 15 min is great.
> >
> > 1) can you get the processing time down to under 1 minute?
> >
> > Dean P. Kossives
> > Principal Engineer
> > Clear Align
> > 2550 Boulevard of the Generals, Suite 280
> > Eagleville, PA 19403
> > phone 484 956 0510 X185
> > fax  484 956 0511
> > mailto:[hidden email]
> > www.clearalign.com
> >
> > Clear Align is certified as a SB, 8(a) SDB & WOSB.
> > See us at SPIE Defense, Security, and Sensing (Booth 1017), April 25-29,
> > 2011
> >
> > -----Original Message-----
> > From: ImageJ Interest Group [mailto:[hidden email]] On Behalf Of
> > Johannes Schindelin
> > Sent: Wednesday, April 06, 2011 6:25 AM
> > To: [hidden email]
> > Subject: Re: Parallel image processing using ImageJ in the Cloud
> >
> > Hi Jimmy,
> >
> > On Mon, 4 Apr 2011, Jimmy Su wrote:
> >
> >> We recently completed a Phase 1 SBIR project with OSD on the topic of
> >> analytic tools in the Cloud.
> >
> > I only understood half the words in that sentence, but I'm quite used to
> > that :-)
> >
> >> To demonstrate our tool's ability to construct image processing workflow
> >> and deploy the generated code to the Cloud, we took ImageJ and added
> >> some Cloud processing capabilities by using the MapReduce framework.
> >
> > What exactly did you do in terms of image processing? Some Gaussian Blur,
> > or Find Edges, or some advanced plug-in? From a technical point of view,
> > there are huge differences there.
> >
> >> We added Cloud processing capability to ImageJ by adding a Hadoop
> >> InputFormat to handle image types in HDFS (Hadoop Distributed File
> >> System) and encapsulating ImageJ operations in map and reduce methods.
> >
> > That is _very_ interesting. For a long time now I have been wanting to
> > play with Hadoop.
> >
> >> This significantly increases ImageJ's throughput in processing
> >> images.  Attached is the running time chart showing processing time
> >> decreases from over 5 hours on two nodes to 15 minutes on 64 nodes on
> >> Amazon EC2.  Is there any interest in the ImageJ community in
> >> parallel processing in the Cloud?  What kind of applications are you
> >> developing that need ImageJ processing in the Cloud?  We would love
> >> to hear your feedback.
> >
> > Two feedbacks from my side:
> >
> > 1) fantastic!
> > 2) where can I get it?
> >
> > Ciao,
> > Johannes
> >
>



--
Vitali Khvatkov
Smart Imaging Technologies Co.