ImageJ

WEKA trainable segmentation

Simon_Carr

Apr 29, 2015; 7:58pm

WEKA trainable segmentation

Dear All,

Sorry to post here, but I was not sure how best to reach the right community!

I have used WEKA trainable segmentation a lot over the last few years, mainly for segmenting CT volumes of messy environmental materials and, more recently, single-slice thin sections. It is an amazing piece of code, which has become central to my work.

The problem has emerged only recently: the classifier model file has become huge (~5 GB), even when the classifier is fairly simple (just one or two filters, and around 20 labelled traces in each of three or four classes). Previously, model files were at most a few hundred MB, despite far more complex mixtures of filters and much larger training sets (admittedly usually in only 2 or 3 classes).

This has really slowed down the segmentation process, even on single 8-bit TIFFs of only around 4 MB.

Can anyone explain why the classifier model files have become so enormous, and whether there is anything I can do to improve things?

Cheers,

Simon

Dr Simon Carr
School of Geography
Queen Mary University of London
Mile End, London, E1 4NS
United Kingdom
--
ImageJ mailing list: http://imagej.nih.gov/ij/list.html

Ignacio Arganda-Carreras-2

Apr 30, 2015; 10:38am

Re: WEKA trainable segmentation

Dear Simon,

That's very odd. Which classifier are you using? I've just run a small
test and my FastRandomForest classifier was saved to a file of only a few KB.

Can you send me a macro or script so I can reproduce the behavior you're
experiencing?

Also, which operating system are you working on?

Best,

ignacio

--
Ignacio Arganda-Carreras, Ph.D.
Institut Jean-Pierre Bourgin, UMR1318 INRA-AgroParisTech
Bâtiment 2
INRA Centre de Versailles-Grignon
Route de St-Cyr (RD10)
78026 Versailles Cedex France

Tel : +33 (0)1 30 83 30 00 - fax : +33 (0)1 30 83 33 19
Website: http://sites.google.com/site/iargandacarreras/
<http://biocomp.cnb.csic.es/~iarganda/index_EN.html>

Simon_Carr

Apr 30, 2015; 10:52am

Re: WEKA trainable segmentation

Hi Ignacio,

Sadly, I have no script or macro to send you (I'm not advanced enough to know where to start!), but the segmentation model is based on 5 classes, each with around 10-20 traces (usually polygons defining sediment particles etc.). I am using the FastRandomForest classifier, with just a median training filter, and my model file is 660 MB in this instance. This runs okay (if a little slowly), but if I edit the model by adding a few more training traces, the model file jumps to 5 GB. It may be that the images the classifier is being applied to are large (RGB TIFFs, often up to 35 MB in size), but any thoughts would be much appreciated!

I am running Weka Segmentation v2.2.0 on two Macs (both running OS X 10.10).

Simon
_________________________________
Dr Simon Carr
School of Geography,
Queen Mary University of London,
Mile End Road, London, E1 4NS, UK.
t: 00 44 20 7882 2780
f: 00 44 20 8981 6276
e: [hidden email]
twitter: @DrSimonCarr

Ignacio Arganda-Carreras-2

Apr 30, 2015; 12:16pm

Re: WEKA trainable segmentation

I see. How many samples are you using?

Can you send me an input image so I can play with it, and some screenshots
so I can see the classes you're using?

Thanks!
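
One rough way to approach the sample-count question is to estimate how many pixels the traces enclose. The sketch below assumes that every pixel inside a polygon trace becomes one training sample (an assumption, not something confirmed in this thread) and uses the shoelace formula for the polygon area. The names `polygon_area` and `estimate_samples` and the example coordinates are hypothetical, chosen only for illustration.

```python
def polygon_area(vertices):
    """Area enclosed by a simple polygon (shoelace formula), in square pixels."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]  # wrap around to close the polygon
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


def estimate_samples(traces_per_class):
    """Approximate labelled-pixel count per class.

    Assumes one training sample per enclosed pixel (an assumption).
    traces_per_class maps a class name to a list of polygons, each a
    list of (x, y) vertex tuples.
    """
    return {
        name: round(sum(polygon_area(trace) for trace in traces))
        for name, traces in traces_per_class.items()
    }


# Hypothetical example: 20 square traces of 100 x 100 pixels in one class.
square = [(0, 0), (100, 0), (100, 100), (0, 100)]
print(estimate_samples({"particles": [square] * 20}))
# prints {'particles': 200000}
```

Under that assumption, even a modest number of filled polygon traces can add up to hundreds of thousands of samples, which is why the sample count is worth checking before looking elsewhere.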

Simon_Carr

Apr 30, 2015; 12:41pm

Re: WEKA trainable segmentation

Hi Ignacio,

I will do this after the weekend, if I may: I will put all the relevant files on a NAS drive, and will contact you directly to give you access to the folder. However, I have a rush job to complete today and tomorrow before I can tinker with WEKA!

Regards,

Simon