Stitcher for Large Files

Olivier Burri
Hello all,
On behalf of the EPFL PTBIOP, I am providing a link here to a stitcher we've assembled for stitching large image grids. It's not perfect and could definitely use the community's input.

The goal is to be able to stitch very large datasets using plugins already found in ImageJ / Fiji, such as TurboReg, Stephan Preibisch's Stitcher and LOCI Bio-Formats.

JAR File
http://documents.epfl.ch/users/o/ob/oburri/public/massive/Massive_Stitcher_.jar

Source
http://documents.epfl.ch/users/o/ob/oburri/public/massive/Massive_Stitcher_src.zip

Documentation
https://documents.epfl.ch/users/o/ob/oburri/public/massive/Massive%20Stitcher.pptx

We would very much appreciate feedback and bug reports so that we can improve this plugin. We know there is still work to do to make it work properly, and we look forward to your comments!

Best

Oli

Re: Stitcher for Large Files

vtali
Hi Olivier,

I think this is a great development, and I have a suggestion to consider:

The big issue in whole-slide processing is managing memory and other computer resources, since a single slide layer, even at 40x, can amount to 10-15 GB of data.
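
To put that figure in perspective, here is a rough back-of-the-envelope estimate. The inputs are assumptions for illustration only, not numbers from this thread: a scanned area of 15 mm x 15 mm, about 0.25 µm per pixel (a typical value for a 40x scan), and uncompressed 8-bit RGB.

```python
def uncompressed_size_bytes(width_mm, height_mm, um_per_pixel, bytes_per_pixel):
    """Return the raw size in bytes of one image layer of the given physical size."""
    width_px = round(width_mm * 1000 / um_per_pixel)    # mm -> um -> pixels
    height_px = round(height_mm * 1000 / um_per_pixel)
    return width_px * height_px * bytes_per_pixel


# Assumed values: 15 mm x 15 mm area, 0.25 um/pixel, 3 bytes per pixel (RGB)
size = uncompressed_size_bytes(15, 15, 0.25, 3)
print(f"{size / 1e9:.1f} GB")  # 60000 x 60000 pixels x 3 bytes = 10.8 GB
```

A larger scanned area or a 16-bit camera pushes the total up quickly, which is why holding a whole layer in RAM is often not an option.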

The solution here is virtualization. The full slide is never assembled in memory as a single object; instead it “lives” as a virtual array of tiles distributed across storage (a data cloud), and this data array can be processed by multiple CPUs.
We have a software server platform and web applications for such virtualization and processing of multi-layered images of arbitrary size on dedicated servers or the Amazon Compute Cloud: Simagis Live (www.live.simagis.com). Users can open a free account on the public application server and stream image tiles to the server for stitching, viewing, sharing, and processing, including running ImageJ macros on them.
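
A minimal sketch of the "virtual array of tiles" idea, in plain Python. It is only an illustration and does not describe how Simagis Live or any ImageJ plugin is implemented: the `Tile` and `VirtualSlide` names, the file names and the loader function are all hypothetical. The slide is kept as tile metadata, and pixel data is read only for tiles that intersect the requested region, with a small least-recently-used cache.

```python
from collections import OrderedDict
from dataclasses import dataclass


@dataclass(frozen=True)
class Tile:
    """Metadata for one tile: where it sits in the slide and where its pixels live."""
    path: str    # hypothetical file name; pixels are read only on demand
    x: int       # left edge in slide coordinates (pixels)
    y: int       # top edge in slide coordinates (pixels)
    width: int
    height: int

    def intersects(self, x0, y0, x1, y1):
        """True if this tile overlaps the half-open region [x0, x1) x [y0, y1)."""
        return (self.x < x1 and x0 < self.x + self.width
                and self.y < y1 and y0 < self.y + self.height)


class VirtualSlide:
    """A slide represented as tile metadata plus a small cache of loaded tiles."""

    def __init__(self, tiles, loader, max_cached=4):
        self.tiles = list(tiles)
        self.loader = loader           # function: path -> pixel data
        self.max_cached = max_cached
        self._cache = OrderedDict()    # path -> pixels, least recently used first

    def tiles_in_region(self, x0, y0, x1, y1):
        return [t for t in self.tiles if t.intersects(x0, y0, x1, y1)]

    def load(self, tile):
        if tile.path in self._cache:
            self._cache.move_to_end(tile.path)    # mark as recently used
            return self._cache[tile.path]
        pixels = self.loader(tile.path)
        self._cache[tile.path] = pixels
        if len(self._cache) > self.max_cached:
            self._cache.popitem(last=False)       # evict the least recently used
        return pixels


# A 4 x 4 grid of 1000 x 1000 tiles; the loader is a stand-in that records calls.
grid = [Tile(f"tile_{r}_{c}.tif", c * 1000, r * 1000, 1000, 1000)
        for r in range(4) for c in range(4)]
loaded = []


def fake_loader(path):
    loaded.append(path)
    return b""


slide = VirtualSlide(grid, fake_loader, max_cached=2)
for t in slide.tiles_in_region(900, 900, 1100, 1100):
    slide.load(t)
print(loaded)  # only the 4 tiles around the corner at (1000, 1000) are read
```

Because each tile is an independent unit of work, the same metadata list can also be split among several worker processes or machines.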

So, if you can make your software write its output in a compatible format – an array of image tiles + XYZPositions.txt (example: http://db.tt/nIy3UU2 ) – then anybody can get the Turbo Upload utility (example: http://live.simagis.com/wiki/-/wiki/Main/Upload+Objective+Imaging+Workspace ) and stream the final tiles to the server for visualization, sharing, processing, etc.

It seems from your PPT that your software architecture already supports virtualization (you handle individual tiles and update XYZ positions in the common coordinate system). So if you can separate tile processing from visualization and implement a virtualization option, [input tile structure]->[output tile structure], without building the stitched object in memory, you can make your app scalable and suitable for cloud deployment (make the stitching recipe locally, run it server-side or on a computing grid).

Also, we are looking for an organization to host a community server (we are running out of space for free accounts on our application servers). If PTBIOP is interested, we can discuss putting a server on-site for you, as long as you provide some level of free service to the ImageJ community.

Feel free to contact me privately to discuss details; we are open to collaboration.

Vitali Khvatkov
Smart Imaging Technologies Co.
+1-713-589-3500
www.live.simagis.com

Re: Stitcher for Large Files

dscho
Hi Vitali,

On Thu, 14 Jul 2011, Vitali wrote:

> The big issue in whole slide processing is managing memory and other
> computer resources, since single slide layer, even at 40x, can make
> 10-15GB of data.

It is great that Stephan Preibisch's Stitching plugins cope with this quite
well by reducing the amount of data to be processed before doing the
actual number crunching. This, plus the fact that you can modify and
enhance the code (although you should cite the original paper, of course)
is very useful.

I am also pretty excited that the OMERO database (fully Open Source)
supports large slides quite efficiently, too:

        http://www.openmicroscopy.org/

Ciao,
Johannes

Re: Stitcher for Large Files

Olivier Burri
In reply to this post by vtali
Hi Vitali,

>So, if you can make output of your software in a compatible format – array of image tiles + XYZPositions.txt (example: http://db.tt/nIy3UU2 ), then anybody can get Turbo Upload utility (example: http://live.simagis.com/wiki/-/wiki/Main/Upload+Objective+Imaging+Workspace ) and stream final tiles to server for visualization, sharing, processing etc.

Thanks for your feedback. I'll look into adding functionality that breaks the image planes into equally sized tiles, and I'll look at how this is commonly done (your example URL was inaccessible, btw).


I guess the way to go about it is either to use the registered coordinates from Stitch_Image_Collection directly, or to perform the stitching and then break each plane down into equally sized, non-overlapping tiles: assemble the original sub-tiles sequentially, extract the final non-overlapping tile, close the files that are no longer needed, and repeat until all is done.
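
The second approach could be sketched roughly as follows. This is a plain-Python illustration under simplifying assumptions, not code from the plugin: positions are already registered and rounded to integer pixels, pixels are plain nested lists, and where sub-tiles overlap the later one simply overwrites the earlier one (no blending). The `retile` function and its `load` callback are hypothetical names.

```python
def retile(sources, plane_w, plane_h, tile_size, load):
    """Yield (x, y, pixels) for each non-overlapping output tile of the plane.

    sources: list of (name, x, y, w, h) with registered integer positions.
    load:    function name -> 2D list of pixel values (h rows of length w).
    """
    for ty in range(0, plane_h, tile_size):
        for tx in range(0, plane_w, tile_size):
            # Output tiles at the right and bottom edges may be smaller.
            tw = min(tile_size, plane_w - tx)
            th = min(tile_size, plane_h - ty)
            out = [[0] * tw for _ in range(th)]
            for name, sx, sy, sw, sh in sources:
                # Overlap of this sub-tile with the output tile, in plane coordinates.
                x0, y0 = max(tx, sx), max(ty, sy)
                x1, y1 = min(tx + tw, sx + sw), min(ty + th, sy + sh)
                if x0 >= x1 or y0 >= y1:
                    continue               # no overlap: the sub-tile is never loaded
                src = load(name)           # loaded only when it contributes pixels
                for y in range(y0, y1):
                    out[y - ty][x0 - tx:x1 - tx] = src[y - sy][x0 - sx:x1 - sx]
            yield tx, ty, out              # the caller can save it and drop it


# Two 4 x 4 sub-tiles overlapping by two columns, re-cut into 3 x 3 output tiles.
pixels = {"A": [[1] * 4 for _ in range(4)], "B": [[2] * 4 for _ in range(4)]}
sources = [("A", 0, 0, 4, 4), ("B", 2, 0, 4, 4)]
tiles = list(retile(sources, 6, 4, 3, pixels.__getitem__))
for x, y, out in tiles:
    print(x, y, out)
```

Only one output tile and the sub-tiles that touch it are alive at any time. The price is that a sub-tile straddling several output tiles is loaded more than once, so a small cache in front of `load` would probably pay off.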

The thing is, using Stephan Preibisch's Stitcher involves the compromise of having the entire image plane loaded during stitching before it is saved, because it takes advantage of ImageJ's higher-level thread-safe functions to populate the large image with the pixel data. So I guess this means looking at a different way to assemble the tiles once the optimization of their positions, which is incredibly fast, has finished.


I'll look at Simagis Live, OMERO (thanks, Johannes), CATMAID and maybe others to see how they handle this and what kind of data they take as input.

Best,

Oli

Re: Stitcher for Large Files

vtali
In reply to this post by Olivier Burri
Hi,

" I guess the way to go at it, is to either directly use the registered coordinates from Stitch_Image_Collection, or to perform the stitching, then break down each plane into equally-sized, non-overlapping tiles by assembling the original sub-tiles sequentially, extracting the final non-overlapping tile, closing the unnecessary files and repeating the process until all is done. " - Correct.

If you can, try to avoid loading the full object (layer / slide) into memory. Instead, see if you can work with the tile array as a virtual image. You can process tiles (rotate, crop, change coordinates) to get correct registration in XYZ space, and then offer two options: visualize the slide on the client (if the client has enough resources: memory, a powerful graphics card, etc.) or send it to a server like Simagis Live or OMERO for server-side processing, visualization, distributed analysis, etc.

On Simagis Live we can parse any multi-dimensional tile structure as long as we have tiles + registration coordinates, so we will make a data connector for your structure once you have it ready.

Another word of caution: we have found that the current version of ImageJ leaks memory and is unstable when run on large sets of images. So when we run ImageJ macros on Simagis Live servers as a multi-threaded tile processor, we have to watch the ImageJ processes and kill them once they stall, hang, or use too much memory. We had to build a special ImageJ manager for that. I hope ImageJ2 will solve those issues, but for now beware: you may not be able to process, say, 1000 images in a single run of ImageJ. This is another argument for going tile by tile, so you can pick up where it crashed.
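
A minimal sketch of the "pick up where it crashed" part, in plain Python. This is not the ImageJ manager mentioned above; the function name, the log file and the worker callback are hypothetical. Each tile is recorded in a log only after it has been processed successfully, so a restarted run skips everything already listed there.

```python
import os
import tempfile


def process_resumable(tile_names, process, done_log):
    """Process tiles one at a time, appending each finished name to done_log."""
    done = set()
    if os.path.exists(done_log):
        with open(done_log) as f:
            done = {line.strip() for line in f if line.strip()}
    for name in tile_names:
        if name in done:
            continue                    # finished in an earlier run
        process(name)                   # may raise, or take the process down
        with open(done_log, "a") as f:
            f.write(name + "\n")        # recorded only after success
            f.flush()
            os.fsync(f.fileno())        # make sure the record reaches the disk


# Demo: the worker fails once on tile_2, then the run is simply restarted.
log = os.path.join(tempfile.mkdtemp(), "done.txt")
names = [f"tile_{i}" for i in range(5)]
processed = []
failed_once = set()


def flaky(name):
    if name == "tile_2" and name not in failed_once:
        failed_once.add(name)
        raise RuntimeError("simulated crash")
    processed.append(name)


try:
    process_resumable(names, flaky, log)
except RuntimeError:
    pass                                # first run stops at tile_2
process_resumable(names, flaky, log)    # second run resumes from tile_2
print(processed)
```

In a real setup the worker would more likely be a separate process that a supervisor can kill when it hangs; the log then tells the supervisor which tile to hand out next.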

Best,

Vitali Khvatkov
Smart Imaging Technologies Co.
+1 713-589-3500
www.live.simagis.com