imageJ + distributed computing


imageJ + distributed computing

Sean, Founder CloudSpree
Does anyone have experience distributing ImageJ across multiple computers or
a cluster to process a very large number of images?

Thanks,
Sean

Re: imageJ + distributed computing

Albert Cardona-2
2010/9/26 Sean, Founder CloudSpree <[hidden email]>:
> Does anyone have experience distributing imageJ across multiple computers or
> a cluster to process a very large number of images?

Yes: run ImageJ jobs from shell scripts that set up Xvfb (a virtual frame buffer).
At the end, run a closing job that pools the results of the individual jobs.

Fiji, by the way, is able to launch ImageJ without requiring a virtual
frame buffer (use the --headless option).
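
Albert's two-line recipe can be sketched as a small driver script. This is only an illustrative sketch: the `fiji` and `imagej` binary names, the macro file, and the image path are placeholders (the exact flags vary between ImageJ and Fiji versions), and actually launching the jobs would be left to the cluster's scheduler.

```python
# Sketch: wrap each per-image ImageJ job in a command line that either
# provides a virtual frame buffer (xvfb-run) or uses Fiji's --headless mode.
# Binary names, the macro file, and paths are placeholders, not real defaults.

def build_job_cmd(image_path, macro_path, use_fiji_headless=False):
    """Return the argv list for one per-image ImageJ/Fiji job."""
    if use_fiji_headless:
        # Fiji can skip the frame buffer entirely.
        return ["fiji", "--headless", "-macro", macro_path, image_path]
    # Plain ImageJ needs a display; xvfb-run supplies a virtual one.
    return ["xvfb-run", "-a", "imagej", "-batch", macro_path, image_path]

if __name__ == "__main__":
    # Print the command one job would run; a scheduler script would pass
    # each such command to subprocess.run or to the cluster's queue.
    print(" ".join(build_job_cmd("/data/img_0001.tif", "measure.ijm",
                                 use_fiji_headless=True)))
```

A "closing job", as Albert describes, would then run once all per-image jobs have finished and merge their individual result files.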

Albert
--
http://albert.rierol.net

Re: imageJ + distributed computing

suendermann
Hello,

as far as I know, there is no way for ImageJ itself to distribute
processing over more than one node/computer, so you have to start
ImageJ/Fiji on every node/computer separately. Is that right?

Greetings
   Fred


Re: imageJ + distributed computing

Sean, Founder CloudSpree
Is this something many ImageJ users would be interested in?

I ask as I am contemplating developing a solution but have no desire to
reinvent the wheel.

Are many people waiting hours/days/weeks to batch process images?

Thanks,
Sean


Re: imageJ + distributed computing

Guha, Rajarshi (NIH/NHGRI) [C]
This sounds like a task for a Hadoop or Terracotta solution.

Personally, I'd like to see something along these lines (over time we will need
to batch process large quantities of images).



--
Rajarshi Guha
NIH Chemical Genomics Center

Re: imageJ + distributed computing

dscho
In reply to this post by Sean, Founder CloudSpree
Hi,

On Wed, 29 Sep 2010, Sean, Founder CloudSpree wrote:

> Is this something many imageJ users would be interested in?

I would be interested.

> I ask as I am contemplating developing a solution but have no desire to
> reinvent the wheel.
>
> Are many people waiting hours/days/weeks to batch process images?

The big problem is formulating a simple yet powerful interface for this
that is actually feasible.

For example, you will never be able to parallelize an arbitrary macro in
the general case. But you could provide a way to run Process>Batch>Macro... in
a truly parallel manner, say, on a cluster.
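
One hedged sketch of what a "truly parallel" batch macro could mean in practice: split the image list into one chunk per cluster node and let every node run the same macro over its own chunk. Only the chunking is shown here; job submission is left to whatever scheduler the cluster uses, and the file names are made up.

```python
# Sketch: deal the images out to n_jobs chunks, round-robin, so each
# cluster node gets roughly the same amount of work. Each chunk would then
# be handed to one node, e.g. via a scheduler's job-array mechanism.

def split_round_robin(images, n_jobs):
    """Distribute the image list over n_jobs chunks, round-robin."""
    chunks = [[] for _ in range(n_jobs)]
    for i, img in enumerate(images):
        chunks[i % n_jobs].append(img)
    return chunks

if __name__ == "__main__":
    images = ["img_%04d.tif" % i for i in range(10)]
    for job_id, chunk in enumerate(split_round_robin(images, 4)):
        print("job", job_id, "->", chunk)
```

This only works because each image is processed independently, which is exactly the restriction that Process>Batch>Macro... already imposes.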

However. The big problem is that ImageJ was designed as a desktop
application, and therefore many plugins require a graphical desktop even
if they do not display anything. For example, if you instantiate any
java.awt.Dialog on Linux, you need a graphical desktop.

We tried to work around this issue with the 'headless' hack in Fiji
(basically providing fake Dialog and Menus classes), but the fact is: many
plugins will fail to run with that hack, because they access more
functionality than we provide, e.g. registering callbacks such as
constraining the aspect ratio when downsampling. And our fake Dialog
class does not provide all the required fake methods for that.

Oh, and by the way, if anybody wants to throw around buzzwords such as
map/reduce, I would be interested in _concrete_ suggestions for how that
should work, rather than just names of implementations of the paradigm. I
am interested in a concrete suggestion because the map/reduce
paradigm and ImageJ's plugin infrastructure appear to be completely
incompatible to me. But I have been wrong before. So: prove me wrong.
Concretely, not with PR speak.

Ciao,
Johannes

Re: imageJ + distributed computing

suendermann
Hi,

I would be interested, too!

> For example, you will never be able to parallelize in general a given
> macro. But you could provide a way to do the Process>Batch>Macro... in a
> truly parallel manner, say, on a cluster.

Maybe I don't get the point, but what do you mean by "cluster"? Just one
PC with several CPUs, or a bigger machine with many nodes, each with its
own CPUs?
We tried to get it to run on a machine with 8 nodes of 8 CPUs each, and
found out that ImageJ is not able to communicate across nodes. It
stays within one node and its CPUs.
So our only solution was to start it on every node separately and use a
script to distribute the "jobs" to the nodes.

> However. The big problem is that ImageJ was designed as a desktop
> application, and therefore many plugins require a graphical desktop even
> if they do not display anything. For example, if you instantiate any
> java.awt.Dialog on Linux, you need a graphical desktop.
>
> We tried to work around this issue in the 'headless' hack in Fiji
> (basically providing fake Dialog and Menus classes), but fact is: many
> plugins will fail to run with that hack, because they access more
> functionality than we provide, e.g. registering callbacks such as
> constraining the aspect ratio when downsampling. And our fake Dialog
> class does not provide all required fake methods for that.

Do you plan to improve your fake classes, or are you looking for another
solution?

Greetings
   Fred

Re: imageJ + distributed computing

dscho
Hi,

On Thu, 30 Sep 2010, suendermann wrote:

> > For example, you will never be able to parallelize in general a given
> > macro. But you could provide a way to do the Process>Batch>Macro... in
> > a truly parallel manner, say, on a cluster.
>
> Maybe I don't get the point, but what do you mean with cluster?

Except for some enterprisey marketing types' new definitions, a cluster
is a well-defined entity: a couple of computers with identical
architecture and operating system (so-called nodes), controlled by a
master node, to which jobs are submitted via a so-called scheduler.

> Just one PC with several cpus or a bigger thing with many nodes each
> with its own cpus? We tried to get it run on a machine with 8 nodes with
> 8 CPUs each and had to find out, that ImageJ is not able to communicate
> over nodes. It stuck within the node and its cpus.

I tried to get the point across in my previous mail: it is _not_ ImageJ! It
is the _plugin_! ImageJ would be happy to run on as many cores as you let
it. But ImageJ is inherently sequential, as it is designed as a desktop
application (which I probably mentioned before).

> So our only solution was to start it on every node separatly and use a
> script to distribute to "jobs" to the nodes.

If you thought there would be an automatic way to magically distribute
the workload of a sequential macro onto a number of cores, then
I have to disappoint you. It is easy to construct pathological examples
where each step of the macro depends on the previous step, and where each
individual step is not parallelizable. There is simply no way to
parallelize such a thing automatically.
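
The distinction Johannes draws can be made concrete with a toy sketch (plain Python, nothing ImageJ-specific): work that is independent per item maps cleanly onto many cores, while a chain where step i needs the result of step i-1 cannot be split up no matter how many cores are available.

```python
# Toy illustration of parallelizable vs. inherently sequential work.
from multiprocessing import Pool

def per_image(x):
    """Stands in for 'process one image' -- independent of all others."""
    return x * x

def chained(values):
    """Each step consumes the previous result; no step can start early."""
    acc = 0
    for v in values:
        acc = acc * 2 + v
    return acc

if __name__ == "__main__":
    with Pool(4) as pool:
        print(pool.map(per_image, range(8)))  # parallel-friendly
    print(chained(range(8)))                  # inherently sequential
```

Batch processing of many images is the first shape; a macro whose commands build on each other is the second, and that is the case no scheduler can help with.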

> > However. The big problem is that ImageJ was designed as a desktop
> > application, and therefore many plugins require a graphical desktop
> > even if they do not display anything. For example, if you instantiate
> > any java.awt.Dialog on Linux, you need a graphical desktop.
> >
> > We tried to work around this issue in the 'headless' hack in Fiji
> > (basically providing fake Dialog and Menus classes), but fact is: many
> > plugins will fail to run with that hack, because they access more
> > functionality than we provide, e.g. registering callbacks such as
> > constraining the aspect ratio when downsampling. And our fake Dialog
> > class does not provide all required fake methods for that.
>
> Do you plan to improve your fake class or do you search for another
> solution?

I will improve the fake Dialog and Menus classes as I need to. I will also
happily accept improvements others needed to do. But it is a hack, and
every hack shows its limitations if you use it too hard.

The proper solution is a redesign of the plugin mechanism to allow
headless execution (this is underway as far as I understand), but of
course this will only work with plugins written using the new interface.
The old plugins will continue to have the problem.

Hth,
Johannes

Re: imageJ + distributed computing

suendermann
Hi,

> On Thu, 30 Sep 2010, suendermann wrote:
>
>>> For example, you will never be able to parallelize in general a given
>>> macro. But you could provide a way to do the Process>Batch>Macro... in
>>> a truly parallel manner, say, on a cluster.
>>
>> Maybe I don't get the point, but what do you mean with cluster?
>
> Except for some enterprisey marketing types's new definitions, a cluster
> is a well defined entity: a couple of computers with identical
> architecture and operating system (so-called nodes), controlled by a
> master node, which gets sent jobs using a so-called scheduler.

I'm aware of what a cluster is. I just wanted to be sure we are talking
about the same thing, and not about the cloud or something like the Sun
Grid Engine.

>> Just one PC with several cpus or a bigger thing with many nodes each
>> with its own cpus? We tried to get it run on a machine with 8 nodes with
>> 8 CPUs each and had to find out, that ImageJ is not able to communicate
>> over nodes. It stuck within the node and its cpus.
>
> I tried to get over the point in my previous mail: it is _not_ ImageJ! It
> is the _plugin_! ImageJ would be happy to run on as many cores as you let
> it. But ImageJ is inherently sequential, as it is designed as a desktop
> application (which I probably mentioned before).

Maybe I'm a little brain-blocked today, but perhaps we are not talking
about the same thing. Running ImageJ on many cores isn't a problem. I
don't even want to parallelize a macro. I'm interested in building a
plugin that uses as many cores as possible, like the
ExtendedPlugInFilter. So far I haven't gotten it to run on more than 8
cores (each of our nodes has 8 cores). If you have, could you give me
a hint?

>> So our only solution was to start it on every node separatly and use a
>> script to distribute to "jobs" to the nodes.
>
> If you thought that there would be an automatic way to distribute
> the work load of a sequential macro magically onto a number of cores, then
> I have to disappoint you. It is easy to construct pathological examples
> where each step of the macro depends on the previous step, and where each
> individual step is not parallelizable. There is simply no way to
> parallelize this thing automatically.

No, I don't like magic... I want to control such stuff. Maybe Fiji's
Jython interface is a starting point for that. It should be easy to
call the functions/plugins you need from a Python script and distribute
that across your cluster.
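
A minimal Jython sketch along the lines Fred suggests: call the plugins you need directly from a script, and let an outer cluster script decide which files each node gets. Note the `ij` package only exists inside ImageJ/Fiji's Jython interpreter, and the filter name and paths here are purely illustrative.

```python
# Jython sketch for per-node processing inside Fiji. The 'ij' import is
# deferred into the function so the pure path logic works anywhere.
import os

def output_path(src, out_dir):
    """Where the processed copy of 'src' should be written."""
    root, _ext = os.path.splitext(os.path.basename(src))
    return os.path.join(out_dir, root + "_proc.tif")

def process_one(src, out_dir):
    """Open, filter, and save one image (runs only inside Fiji's Jython)."""
    from ij import IJ  # not available in plain Python
    imp = IJ.openImage(src)
    IJ.run(imp, "Median...", "radius=2")  # any headless-safe plugin call
    IJ.saveAs(imp, "Tiff", output_path(src, out_dir))

if __name__ == "__main__":
    # This node's share of the work; a cluster script would generate this list.
    for src in ["/data/a.tif", "/data/b.tif"]:
        print(src, "->", output_path(src, "/data/out"))
        # process_one(src, "/data/out")  # uncomment when running under Fiji
```

Each node would run this script headlessly over its own file list, which is exactly the "start it on every node and distribute the jobs with a script" setup Fred described earlier.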

>>> However. The big problem is that ImageJ was designed as a desktop
>>> application, and therefore many plugins require a graphical desktop
>>> even if they do not display anything. For example, if you instantiate
>>> any java.awt.Dialog on Linux, you need a graphical desktop.
>>>
>>> We tried to work around this issue in the 'headless' hack in Fiji
>>> (basically providing fake Dialog and Menus classes), but fact is: many
>>> plugins will fail to run with that hack, because they access more
>>> functionality than we provide, e.g. registering callbacks such as
>>> constraining the aspect ratio when downsampling. And our fake Dialog
>>> class does not provide all required fake methods for that.
>>
>> Do you plan to improve your fake class or do you search for another
>> solution?
>
> I will improve the fake Dialog and Menus classes as I need to. I will also
> happily accept improvements others needed to do. But it is a hack, and
> every hack shows its limitations if you use it too hard.
>
> The proper solution is a redesign of the plugin mechanism to allow
> headless execution (this is underway as far as I understand)

This sounds great!

> but of
> course this will only work with plugins written using the new interface.
> The old plugins will continue to have the problem.
>
> Hth,
> Johannes

Greetings
   Fred

Re: imageJ + distributed computing

Arne Seitz
In reply to this post by Sean, Founder CloudSpree
I know a lot of people who might be interested in something like this,
especially multi-user facilities that are involved in high-throughput imaging.

Two years ago I faced this problem while working at the EMBL. Due to the mass of data (20 GB) it took about three days to analyze it with ImageJ. In order to speed the process up and make it more stable, I developed the following workaround: I started ImageJ on up to six different computers and started a plugin on each. The plugin did the following: read a text file (listing which images still needed to be processed), opened an image, called a macro to process it, saved the results, and wrote back to the above-mentioned file which image it had finished processing. As I said, more of a workaround. But it worked, and the images could be processed in less than a day. I'm happy to provide the code of the plugin if you are interested.

Cheers
Arne



Re: imageJ + distributed computing

Gabriel Lapointe-2
On Fri, 2010-10-01 at 21:40 +0000, Seitz Arne wrote:

> This plugin was doing the following: Reading a textfile (which images
> still needs to be processed), opening the image, calling a makro to
> process the image, save the results and writing to the above mentioned
> file (which image it finished processing).


The problem with that approach, _if one is not careful_, is
synchronization. When two instances of ImageJ try to access, or worse,
save the file at the same time, files can be left unprocessed.
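
One way to avoid the race Gabriel describes: put an exclusive lock around every read-modify-write of the shared to-do file, so two ImageJ instances can never claim the same image. The one-path-per-line file format matches Arne's description; the lock itself (POSIX `fcntl.flock`, so Unix-only) is an added suggestion, not part of Arne's plugin.

```python
# Sketch: atomically claim the next pending image from a shared to-do file.
import fcntl

def claim_next(todo_path):
    """Remove and return the first pending image path, or None if done.

    The exclusive lock makes the read-modify-write atomic across workers.
    """
    with open(todo_path, "r+") as f:
        fcntl.flock(f, fcntl.LOCK_EX)  # block until we own the file
        lines = [l for l in f.read().splitlines() if l.strip()]
        if not lines:
            return None                # nothing left to do
        claimed, rest = lines[0], lines[1:]
        f.seek(0)
        f.truncate()
        f.write("\n".join(rest) + ("\n" if rest else ""))
        f.flush()
        return claimed                 # lock is released when 'f' closes
```

Each worker would then loop: claim a path, process it with ImageJ, and claim again until `claim_next` returns None. The "finished" log Arne mentions would need the same locking discipline.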


Gabriel Lapointe, MSc.
Laboratoire de Luc DesGroseillers, PhD.
Pavillon Roger-Gaudry Local A-538
Département de biochimie
Faculté de Médecine de l'Université de Montréal
2900 boul. Édouard-Montpetit,
Montréal, Qc, H3T 1J4
Tel : (514) 343-6111 postes 5187, 5152, 5162 ou 1048
Fax : (514) 343-2210
[hidden email]
http://gabriellapointe.ca 

Re: imageJ + distributed computing

Mather
In reply to this post by Sean, Founder CloudSpree
Hi! Could all these problems be solved by building a system where all the nodes of the cluster are seen as one computer? This would mean running ImageJ/Fiji on one "virtual supercomputer" with the combined resources of all the nodes. I ask this because the data I work with is 3x-5x bigger than my available memory. I don't care if it works for days on one project; all I want is to be able to load my whole stack and work with my whole stack. Thanks!

Re: imageJ + distributed computing

lechristophe
Hi,

You could use a virtual stack (File > Import > TIFF Virtual Stack)
rather than a regular stack to solve the RAM problem. If your
processing doesn't actually need all the slice data in memory
at once, that could be a simple alternative?

Christophe


Re: imageJ + distributed computing

Mather
Thanks. I found that I can't get the result I want using a virtual stack. For example, applying a bandpass filter to the whole stack doesn't give me the same result as chopping the stack up and running the filter on the parts. An image is about 4 MB, and with about 1000 of them it is pretty time-consuming to work through the parts. I get the same effect with other "commands", so I have to work without a virtual stack. I've got a few x86 P4 systems; that's why I was thinking about a "cluster".

Re: imageJ + distributed computing

flavio
Hi

My name is Flavio. I'm part of 4Quant. We do parallel, distributed, real-time image processing and quantitative image analysis.

Leveraging the large existing communities built around ImageJ and FIJI, Cloud Image Processing (CIP) lets you scale up these tools from one machine to hundreds in a distributed, fault-tolerant manner. Whether with powerful workstations, high-performance clusters, or cloud-based machines, our framework can handle your problem and make analysis and configuration easy using any device with our web-based tools.

Please feel free to contact me for a Skype demo: flavio.trolese@4quant.com

Regards

Flavio