
Re: Creating DataSet after building classification with Trainable Weka Segmentation

Posted by Ignacio Arganda-Carreras on Aug 26, 2014; 3:42pm
URL: http://imagej.273.s1.nabble.com/Creating-DataSet-after-building-classification-with-Trainable-Weka-Segmentation-tp5009283p5009355.html

Hello again, Marcelo,


> 4. trace some pixels for all classes you have (you must have exactly the
> same classes as before)
>
> As I have to extract the same features, I need to LOAD the
> data-train.arff again, right?
> I load the data-train.arff, then I start to trace some pixels in each
> image and classify them as A or B after tracing.
>

That's not correct. Loading the ARFF file does not determine the features:
it only contains the feature values of the training traces. The features
computed each time you call the plugin are the ones selected in the Settings
menu. If you don't change them at all, the default values are used (I think
you are in this situation).



>
>
> 5. "save data" as "data-test.arff"
>
> After trace all 10 images and make classification with the trace, I press
> save data and save as data-test.arff
>
> Then the log is :
> Creating feature stack...
> Updating features of slice number 1...
> Updating features of slice number 2...
> Updating features of slice number 3...
> Updating features of slice number 4...
> Updating features of slice number 5...
> Updating features of slice number 6...
> Updating features of slice number 7...
> Updating features of slice number 8...
> Updating features of slice number 9...
> Updating features of slice number 10...
> Feature stack is now updated.
> Training input:
> # of pixels selected as class 1: 1772
> # of pixels selected as class 2: 593
> Merging data...
> Finished: total number of instances = 7232
> Writing training data: 7232 instances...
>
> It created 7232 instances ... what I expected is 10 instances with 119
> attributes.
>
>
OK, I see why you're confused. The number of instances is the *number of
pixels you traced*, not the number of input images. In this case, it is the
number of training pixels (4867) plus the 2365 new test pixels (1772 + 593)
you traced on the 10 test images, which gives 7232.
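As a quick check of that count, here is a small Python sketch. The numbers are
copied from the log and the Weka output quoted in this thread; the variable
names are only illustrative.

```python
# Counts copied from the log and the Weka output quoted in this thread.
training_pixels = 4867   # instances already stored in data-train.arff
new_class_1 = 1772       # pixels traced as class 1 on the test images
new_class_2 = 593        # pixels traced as class 2 on the test images
num_images = 10          # images in the stack; NOT the number of instances

# Each traced pixel becomes one instance (one row in the ARFF file).
new_test_pixels = new_class_1 + new_class_2
total_instances = training_pixels + new_test_pixels

print(new_test_pixels)   # 2365
print(total_instances)   # 7232, the figure reported by "Merging data..."
```

The number of images never enters the sum; only the traced pixels do.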



>
> 6. click on the Weka button
>
> Anyway, I still continue and try. I click on the weka button to load weka.
>
> 7. launch the Weka explorer
>
> launch the explorer.
>
>
> 8. in the explorer, open the training arff file
>
> open my data-train.arff
>
> 9. click on classify and choose the classifier you want (the equivalent to
> the plugin is a hr.irb.fastRandomForest.FastRandomForest with numTrees =
> 200 trees and numFeatures = 2).
>
> done.
>
> 10. select as test the data-test.arff you saved and run the classification
>
> the "Supplied test set", i choose my data-test.arff, and the outcome is :
>
> === Run information ===
>
> Scheme:       weka.classifiers.trees.RandomForest -I 200 -K 2 -S 1 -num-slots 1
> Relation:     segment
> Instances:    4867
> Attributes:   119
>               [list of attributes omitted]
> Test mode:    user supplied test set:  size unknown (reading incrementally)
>
> === Classifier model (full training set) ===
>
> Random forest of 200 trees, each constructed while considering 2 random
> features.
> Out of bag error: 0.0643
>
>
>
> Time taken to build model: 4.98 seconds
>
> === Evaluation on test set ===
> === Summary ===
>
> Correctly Classified Instances        6707               92.7406 %
> Incorrectly Classified Instances       525                7.2594 %
> Kappa statistic                          0.8363
> Mean absolute error                      0.1242
> Root mean squared error                  0.2294
> Relative absolute error                 26.4056 %
> Root relative squared error             47.8019 %
> Coverage of cases (0.95 level)          99.8202 %
> Mean rel. region size (0.95 level)      73.4098 %
> Total Number of Instances             7232
>
> === Detailed Accuracy By Class ===
>
>                TP Rate   FP Rate   Precision   Recall  F-Measure   ROC Area  PRC Area  Class
>                  0.976     0.162      0.918     0.976     0.946      0.978     0.986    class 1
>                  0.838     0.024      0.949     0.838     0.89       0.978     0.967    class 2
> Weighted Avg.    0.927     0.114      0.929     0.927     0.926      0.978     0.98
>
> === Confusion Matrix ===
>
>     a    b   <-- classified as
>  4579  114 |    a = class 1
>   411 2128 |    b = class 2
>
>
By doing this, you're using 4867 samples for training and 7232 for testing,
but those 7232 include the 4867 training samples, which is not correct: the
test set should contain only the new pixels.
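One possible way to obtain a test file without the training samples is to
filter the merged ARFF afterwards. The following is only a sketch in plain
Python, not something the plugin provides: the function names are
hypothetical, and it assumes dense ARFF files with one instance per line and
training rows written identically (same text) in both files.

```python
from collections import Counter


def split_arff(text):
    """Split ARFF text into (header_lines, data_lines) at the @data marker."""
    lines = text.splitlines()
    for i, line in enumerate(lines):
        if line.strip().lower() == "@data":
            header = lines[: i + 1]
            # Keep only non-empty, non-comment rows after @data.
            data = [l for l in lines[i + 1:]
                    if l.strip() and not l.startswith("%")]
            return header, data
    raise ValueError("no @data section found")


def remove_training_rows(train_text, merged_text):
    """Return ARFF text with one copy of each training row removed."""
    _, train_rows = split_arff(train_text)
    header, merged_rows = split_arff(merged_text)
    remaining = Counter(row.strip() for row in train_rows)
    kept = []
    for row in merged_rows:
        key = row.strip()
        if remaining[key] > 0:
            remaining[key] -= 1   # row came from the training set: drop it
        else:
            kept.append(row)      # row is a new test pixel: keep it
    return "\n".join(header + kept) + "\n"


# Toy example (made-up values, two attributes only).
header = "@relation segment\n@attribute f1 numeric\n@attribute class {a,b}\n@data\n"
train = header + "1.0,a\n2.0,b\n"
merged = header + "1.0,a\n2.0,b\n1.0,a\n3.0,b\n"
test_only = remove_training_rows(train, merged)
print(test_only)
```

With the files from this thread you would read data-train.arff and
data-test.arff as text, pass them in that order, and write the result to a new
file, which should then hold only the newly traced pixels.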


>
>
> The question is: are the above steps correct? Because I want to have 10
> instances
> to be tested to see whether each is class 1 or class 2, but now I don't know
> how to interpret the result.
>

You need to understand that you're classifying pixels, not whole images. So
the number of test instances is not 10, but the number of pixels you traced
on them. Does it make sense now?
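To read the Weka summary in terms of pixels, the per-class figures can be
recomputed from the confusion matrix quoted above. This is just a sketch in
plain Python with illustrative variable names. Keep in mind that this
particular matrix still includes the training pixels, so the numbers are
optimistic.

```python
# Confusion matrix copied from the Weka output quoted above.
# Rows are the true class, columns are the predicted class.
matrix = [
    [4579, 114],    # pixels traced as class 1
    [411, 2128],    # pixels traced as class 2
]

total = sum(sum(row) for row in matrix)    # number of test pixels
correct = matrix[0][0] + matrix[1][1]      # pixels on the diagonal
accuracy = correct / total

for k, name in enumerate(["class 1", "class 2"]):
    actual = sum(matrix[k])                     # pixels truly in class k
    predicted = matrix[0][k] + matrix[1][k]     # pixels assigned to class k
    recall = matrix[k][k] / actual              # Weka's "TP Rate" / "Recall"
    precision = matrix[k][k] / predicted        # Weka's "Precision"
    print(f"{name}: recall={recall:.3f} precision={precision:.3f}")

print(f"accuracy={accuracy:.4%} over {total} pixels")
```

Every count here is a number of pixels: 6707 of the 7232 test pixels were
labelled correctly, and nothing in the output refers to the 10 images.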

ignacio


--
Ignacio Arganda-Carreras, Ph.D.
Seung's lab, 46-5065
Department of Brain and Cognitive Sciences
Massachusetts Institute of Technology
43 Vassar St.
Cambridge, MA 02139
USA

Phone: (001) 617-324-3747
Website: http://bioweb.cnb.csic.es/~iarganda/index_EN.html

--
ImageJ mailing list: http://imagej.nih.gov/ij/list.html