Does anyone have experience distributing ImageJ across multiple computers or
a cluster to process a very large number of images?

Thanks,
Sean
2010/9/26 Sean, Founder CloudSpree <[hidden email]>:
> Does anyone have experience distributing ImageJ across multiple computers or
> a cluster to process a very large number of images?

Yes: run ImageJ jobs in shell scripts that set up Xvfb (a virtual frame
buffer). At the end, run a closing job that pools the results of all the
individual jobs.

Fiji, by the way, is able to launch ImageJ without requiring a virtual
frame buffer (use the --headless option).

Albert
--
http://albert.rierol.net
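A minimal sketch of one such job script (the display number, the ij.jar
location, the macro name, and the script name run_one.sh are all
illustrative assumptions; ImageJ's -batch option runs a macro and exits):

    #!/bin/sh
    # run_one.sh -- process one image under a private Xvfb display.
    Xvfb :1 -screen 0 1024x768x24 &
    XVFB_PID=$!
    DISPLAY=:1 java -jar ij.jar -batch measure.ijm "$1"
    kill "$XVFB_PID"

Invoked as "./run_one.sh image0001.tif", one such job can be launched per
image; the closing job then pools the per-image results.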
Hello,
as far as I know there is no way for ImageJ to distribute processes over
more than one node/computer, so you have to start ImageJ/Fiji on every
node/computer separately. Is that right?

Greetings
Fred

On 26.09.2010 17:21, Albert Cardona wrote:
> Yes: run ImageJ jobs in shell scripts that set up Xvfb (a virtual frame
> buffer). At the end, run a closing job that pools the results of all
> the individual jobs.
> [...]
Is this something many ImageJ users would be interested in?

I ask because I am contemplating developing a solution but have no desire
to reinvent the wheel. Are many people waiting hours/days/weeks to batch
process images?

Thanks,
Sean

On Wed, Sep 29, 2010 at 5:47 AM, suendermann <[hidden email]> wrote:
> as far as I know there is no way for ImageJ to distribute processes over
> more than one node/computer, so you have to start ImageJ/Fiji on every
> node/computer separately. Is that right?
> [...]
This sounds like a task for a Hadoop or Terracotta solution.

Personally, I'd like to see something along these lines (over time we will
need to batch process large quantities of images).

On 9/29/10 4:47 PM, "Sean, Founder CloudSpree" <[hidden email]> wrote:
> Is this something many ImageJ users would be interested in?
>
> I ask because I am contemplating developing a solution but have no
> desire to reinvent the wheel. Are many people waiting hours/days/weeks
> to batch process images?
> [...]

--
Rajarshi Guha
NIH Chemical Genomics Center
Hi,
On Wed, 29 Sep 2010, Sean, Founder CloudSpree wrote:
> Is this something many ImageJ users would be interested in?

I would be interested.

> I ask because I am contemplating developing a solution but have no
> desire to reinvent the wheel. Are many people waiting hours/days/weeks
> to batch process images?

The big problem is to formulate a simple yet powerful interface for this
that is actually feasible. For example, you will never be able to
parallelize an arbitrary macro automatically. But you could provide a way
to run Process>Batch>Macro... in a truly parallel manner, say, on a
cluster.

However, the big problem is that ImageJ was designed as a desktop
application, and therefore many plugins require a graphical desktop even
if they do not display anything. For example, if you instantiate any
java.awt.Dialog on Linux, you need a graphical desktop.

We tried to work around this issue with the 'headless' hack in Fiji
(basically providing fake Dialog and Menus classes), but the fact is:
many plugins will fail to run with that hack, because they access more
functionality than we provide, e.g. registering callbacks such as
constraining the aspect ratio when downsampling. Our fake Dialog class
does not provide all the fake methods required for that.

Oh, and by the way: if anybody wants to throw around buzzwords such as
map/reduce, I would be interested in _concrete_ suggestions for how that
should work, rather than just names of implementations of the paradigm. I
ask for concrete suggestions because the map/reduce paradigm and ImageJ's
plugin infrastructure appear to be completely incompatible to me. But I
have been wrong before. So: prove me wrong. Concretely, not with PR
speak.

Ciao,
Johannes
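As a rough illustration of the "truly parallel" batch idea: split the
image list round-robin into one work list per node before submitting the
jobs (the node count and file names are assumptions):

    #!/bin/sh
    # Split images.txt into one work list per node. Each node then runs a
    # headless or Xvfb-wrapped ImageJ over its own list, and a final job
    # pools the results.
    NODES=4
    k=0
    while [ "$k" -lt "$NODES" ]; do
        awk -v n="$NODES" -v k="$k" 'NR % n == k' images.txt > "list_$k.txt"
        k=$((k + 1))
    done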
Hi,
I would be interested too!

> For example, you will never be able to parallelize an arbitrary macro
> automatically. But you could provide a way to run Process>Batch>Macro...
> in a truly parallel manner, say, on a cluster.

Maybe I don't get the point, but what do you mean by cluster? Just one PC
with several CPUs, or a bigger thing with many nodes, each with its own
CPUs? We tried to get it to run on a machine with 8 nodes of 8 CPUs each
and found out that ImageJ is not able to communicate across nodes. It
stayed stuck within one node and its CPUs. So our only solution was to
start it on every node separately and use a script to distribute the
"jobs" to the nodes.

> However, the big problem is that ImageJ was designed as a desktop
> application, and therefore many plugins require a graphical desktop even
> if they do not display anything.
> [...]

Do you plan to improve your fake classes, or are you searching for another
solution?

Greetings
Fred
Hi,
On Thu, 30 Sep 2010, suendermann wrote:
> Maybe I don't get the point, but what do you mean by cluster?

Except for some enterprisey marketing types' new definitions, a cluster is
a well-defined entity: a number of computers with identical architecture
and operating system (so-called nodes), controlled by a master node, which
gets sent jobs using a so-called scheduler.

> Just one PC with several CPUs, or a bigger thing with many nodes, each
> with its own CPUs? We tried to get it to run on a machine with 8 nodes
> of 8 CPUs each and found out that ImageJ is not able to communicate
> across nodes. It stayed stuck within one node and its CPUs.

I tried to get this point across in my previous mail: it is _not_ ImageJ!
It is the _plugin_! ImageJ will happily run on as many cores as you let
it. But ImageJ is inherently sequential, as it was designed as a desktop
application (which I probably mentioned before).

> So our only solution was to start it on every node separately and use a
> script to distribute the "jobs" to the nodes.

If you thought that there would be an automatic way to magically
distribute the workload of a sequential macro onto a number of cores, then
I have to disappoint you. It is easy to construct pathological examples
where each step of the macro depends on the previous step, and where each
individual step is not parallelizable. There is simply no way to
parallelize such a thing automatically.

> Do you plan to improve your fake classes, or are you searching for
> another solution?

I will improve the fake Dialog and Menus classes as I need to. I will also
happily accept improvements others needed to make. But it is a hack, and
every hack shows its limitations if you push it too hard.

The proper solution is a redesign of the plugin mechanism to allow
headless execution (this is underway, as far as I understand), but of
course this will only work with plugins written against the new interface.
The old plugins will continue to have the problem.

Hth,
Johannes
Hi,
> Except for some enterprisey marketing types' new definitions, a cluster
> is a well-defined entity: a number of computers with identical
> architecture and operating system (so-called nodes), controlled by a
> master node, which gets sent jobs using a so-called scheduler.

I'm aware of what a cluster is. I just wanted to be sure we are speaking
about the same thing, and not about a cloud or something like the Sun
Grid Engine.

> I tried to get this point across in my previous mail: it is _not_
> ImageJ! It is the _plugin_! ImageJ will happily run on as many cores as
> you let it. But ImageJ is inherently sequential, as it was designed as a
> desktop application (which I probably mentioned before).

Maybe I'm a little brain-blocked today, but possibly we are not talking
about the same thing. Running ImageJ on many cores isn't a problem, and I
don't even want to parallelize a macro. I'm interested in building a
plugin that uses as many cores as possible, like the ExtendedPlugInFilter.
So far I have not managed to run it on more than 8 cores (each of our
nodes has 8 cores). If you did, could you give me a hint?

> If you thought that there would be an automatic way to magically
> distribute the workload of a sequential macro onto a number of cores,
> then I have to disappoint you.

No, I don't like magic... I want to control such stuff. Maybe Fiji's
Jython interface is a starting point for that: it should be easy to
include the functions/plugins you need in a Python script and distribute
that across your cluster (see the sketch below).

> I will improve the fake Dialog and Menus classes as I need to. I will
> also happily accept improvements others needed to make. But it is a
> hack, and every hack shows its limitations if you push it too hard.
>
> The proper solution is a redesign of the plugin mechanism to allow
> headless execution (this is underway, as far as I understand)

This sounds great!

> but of course this will only work with plugins written against the new
> interface. The old plugins will continue to have the problem.

Greetings
Fred
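A sketch of that distribution step, assuming the Fiji launcher accepts a
.py script as an argument together with --headless (hostnames, the script
name, and the pre-split work lists are illustrative):

    #!/bin/sh
    # Launch the same Jython script headless on every node, each node
    # working on its own list (list_0.txt, list_1.txt, ...).
    k=0
    for node in node1 node2 node3 node4; do
        ssh "$node" "cd $PWD && ./fiji --headless process.py list_$k.txt" &
        k=$((k + 1))
    done
    wait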
I know a lot of people who might be interested in something like this.
Especially multi-user facilities involved in high-throughput imaging.

Two years ago I faced this problem while working at the EMBL. Due to the
mass of data (20 GB) it took about three days to analyze it with ImageJ.
In order to speed the process up and make it more stable, I developed the
following workaround: I started ImageJ on up to six different computers
and launched a plugin on each. This plugin did the following: read a text
file listing which images still needed to be processed, open an image,
call a macro to process it, save the results, and record in the
above-mentioned file which image it had finished. As I said, more of a
workaround. But it worked, and the images could be processed in less than
a day.

I'm happy to provide the code of the plugin if you are interested.

Cheers
Arne

-----Original Message-----
From: ImageJ Interest Group [mailto:[hidden email]] On Behalf Of Sean,
Founder CloudSpree
Sent: mercredi 29 septembre 2010 22:47
To: [hidden email]
Subject: Re: imageJ + distributed computing

> Is this something many ImageJ users would be interested in?
> [...]
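A shell rendition of the same work-queue pattern, with an atomic mkdir
standing in for the plugin's bookkeeping so that two workers can never
claim the same image (all file names are assumptions):

    #!/bin/sh
    # todo.txt lists one image path per line; locks/ must live on storage
    # shared by all workers. Assumes a display (e.g. Xvfb) on :1.
    mkdir -p locks
    while read -r img; do
        name=$(basename "$img")
        # mkdir succeeds or fails atomically, so exactly one worker
        # claims each image.
        if mkdir "locks/$name" 2>/dev/null; then
            DISPLAY=:1 java -jar ij.jar -batch measure.ijm "$img"
            echo done > "locks/$name/status"
        fi
    done < todo.txt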
On Fri, 2010-10-01 at 21:40 +0000, Seitz Arne wrote:
> This plugin did the following: read a text file listing which images
> still needed to be processed, open an image, call a macro to process it,
> save the results, and record in the above-mentioned file which image it
> had finished.

The problem with that approach, _if one is not careful_, is
synchronization. When two instances of ImageJ try to access, or worse,
save the file at the same time, that can leave files unprocessed.

Gabriel Lapointe, MSc.
Laboratoire de Luc DesGroseillers, PhD.
Pavillon Roger-Gaudry, Local A-538
Département de biochimie
Faculté de Médecine de l'Université de Montréal
2900 boul. Édouard-Montpetit, Montréal, Qc, H3T 1J4
Tel : (514) 343-6111 postes 5187, 5152, 5162 ou 1048
Fax : (514) 343-2210
[hidden email]
http://gabriellapointe.ca
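One way to avoid that race on Linux is to serialize the writers with
flock(1) from util-linux, so the shared status file is only ever appended
to under an exclusive lock (file names are assumptions):

    # Append a completion record without interleaving writes from
    # concurrent ImageJ instances.
    (
        flock -x 9
        echo "$img" >> processed.txt
    ) 9> processed.lock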
Hi! Could all these problems be solved by building a system where all the
nodes of the cluster are seen as one computer? This would mean running
ImageJ/Fiji on one "virtual supercomputer" with all the resources of the
nodes. I ask because the data I work with is 3x-5x bigger than my
available memory. I don't care if it works for days on one project; all I
want is to be able to load my whole stack and work with my whole stack.
Thanks!
Hi,
You could use a virtual stack (File > Import > TIFF Virtual Stack) rather
than a regular stack to solve the RAM problem. If your processing doesn't
actually need all the slice data in memory at once, that could be a simple
alternative?

Christophe

On Fri, Mar 9, 2012 at 11:45, Mather <[hidden email]> wrote:
> Hi! Could all these problems be solved by building a system where all
> the nodes of the cluster are seen as one computer?
> [...]
Thanks. I found that using a virtual stack I can't get the result I want.
For example, applying a bandpass filter to the whole stack does not give
the same result as chopping the stack up and running the filter over the
parts. An image is about 4 MB, and with about 1000 of them it is pretty
time-consuming to work through the parts. I see the same effect with
other commands, so I have to work without a virtual stack. I've got a few
x86 P4 systems; that's why I was thinking about a "cluster".
Hi
My name is Flavio; I'm part of 4Quant. We do parallel, distributed,
real-time image processing and quantitative image analysis. Leveraging
the large existing communities built around ImageJ and Fiji, Cloud Image
Processing (CIP) lets you scale these tools up from one machine to
hundreds in a distributed, fault-tolerant manner. Whether on powerful
workstations, high-performance clusters, or cloud-based machines, our
framework can handle your problem and makes analysis and configuration
easy from any device with our web-based tools.

Please feel free to contact me for a Skype demo: flavio.trolese@4quant.com

Regards
Flavio