Dear All,
Sorry to post here, but I was not sure how best to get out to the correct community!

I have used WEKA trainable segmentation a lot over the last few years, mainly for segmenting CT volumes of messy environmental materials, but recently, for single slice thin sections. It is an amazing piece of code, which has become central to my work.

The problem I have had has only emerged recently, where the classifier model file has become huge, i.e. ~5 GB in size, even when the classifier is only fairly simple (just one or two filters, and around 20 labelled traces in each of three or four classes). Previously, the model files were perhaps up to a few hundred MB in size, with far more complex mixtures of filters, and much larger training sets (admittedly usually in 2 or 3 classes only).

The effect has been to really slow down the segmentation process, even on single 8-bit TIFFs only around 4 MB in size.

Can anyone explain why the classifier model file sizes have become so enormous, and if there is anything I can do to improve things?

Cheers,

Simon

Dr Simon Carr
School of Geography
Queen Mary University of London
Mile End, London, E1 4NS
United Kingdom
--
ImageJ mailing list: http://imagej.nih.gov/ij/list.html
Dear Simon,
That's very odd. Which classifier are you using? I've just made a small test and my FastRandomForest classifier got saved in a file of a few KB.

Can you send me a macro or script so I can reproduce the behavior you're experiencing?

Also, which operating system are you working on?

Best,

ignacio

On Wed, Apr 29, 2015 at 9:58 PM, Simon Carr <[hidden email]> wrote:
> [...]

--
Ignacio Arganda-Carreras, Ph.D.
Institut Jean-Pierre Bourgin, UMR1318 INRA-AgroParisTech
Bâtiment 2
INRA Centre de Versailles-Grignon
Route de St-Cyr (RD10)
78026 Versailles Cedex France

Tel : +33 (0)1 30 83 30 00 - fax : +33 (0)1 30 83 33 19
Website: http://sites.google.com/site/iargandacarreras/
<http://biocomp.cnb.csic.es/~iarganda/index_EN.html>
Hi Ignacio,
Sadly, I have not got a script or macro to send you (I'm not advanced enough to know where to start!), but the segmentation model is based on 5 classes, each containing around 10-20 traces (usually polygons defining sediment particles etc.). I am using the FastRandomForest classifier, with just a median training filter, and my model file is 660 MB in this instance. This runs okay (if a little slowly), but if I edit or change the model by adding a few more training traces, the model jumps in size to 5 GB. It may be that the images the classifier is being applied to are large (RGB TIFFs, often up to 35 MB in size), but any thoughts would be much appreciated!

I am running Weka Segmentation v.2.2.0 on two Macs (both running OS X 10.10).

Simon
_________________________________
Dr Simon Carr
School of Geography,
Queen Mary University of London,
Mile End Road, London, E1 4NS, UK.
t: 00 44 20 7882 2780
f: 00 44 20 8981 6276
e: [hidden email]
twitter: @DrSimonCarr

On 30 Apr 2015, at 11:38, Ignacio Arganda-Carreras <[hidden email]> wrote:
> [...]
I see, how many samples are you using?
Can you send me an input image so I can play with it, and some screenshots so I can see the classes you're using?

Thanks!

On Thu, Apr 30, 2015 at 12:53 PM, Simon Carr <[hidden email]> wrote:
> [...]

--
Ignacio Arganda-Carreras, Ph.D.
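For readers who want a rough number for the "how many samples" question, the sketch below estimates the sample count from the traces themselves. It rests on an assumption that is not confirmed anywhere in this thread: that every pixel enclosed by a closed polygon trace is added as one training instance, so the number of samples is roughly the summed polygon area in pixels. The function names (`polygon_area`, `estimate_samples`) and the example coordinates are hypothetical, and the code does not call ImageJ or Weka at all; it only does the arithmetic on vertex lists you would copy out by hand.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]


def polygon_area(vertices: Sequence[Point]) -> float:
    """Area of a simple (non self-intersecting) polygon, shoelace formula.

    Vertices are (x, y) pixel coordinates in drawing order. The result is
    in square pixels and is used here as a stand-in for the pixel count.
    """
    n = len(vertices)
    if n < 3:
        return 0.0  # a point or a line encloses no area
    twice_area = 0.0
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]  # wrap around to close the polygon
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0


def estimate_samples(traces_by_class: Dict[str, List[Sequence[Point]]]) -> Dict[str, int]:
    """Approximate training instances per class.

    ASSUMPTION: one instance per enclosed pixel, so the estimate is the
    summed area of that class's traces, rounded to the nearest integer.
    """
    return {
        name: int(round(sum(polygon_area(trace) for trace in traces)))
        for name, traces in traces_by_class.items()
    }


if __name__ == "__main__":
    # Hypothetical traces: two polygons for one class, none for another.
    example = {
        "grain": [
            [(0, 0), (100, 0), (100, 100), (0, 100)],
            [(0, 0), (4, 0), (0, 3)],
        ],
        "matrix": [],
    }
    for class_name, count in estimate_samples(example).items():
        print(class_name, count)
```

If the assumption holds, a handful of large polygons on a 35 MB RGB image can easily amount to hundreds of thousands of instances, which is the kind of figure worth reporting alongside the model file size.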
Hi Ignacio,
I will do this after the weekend, if I may: I will put all the relevant files on a NAS drive, and will contact you directly to give you access to the folder. However, I have a rush job to complete today and tomorrow before I can tinker with WEKA!

Regards,

Simon

On 30 Apr 2015, at 13:18, Ignacio Arganda-Carreras <[hidden email]> wrote:
> [...]