Re: Need advice about Principal Component Analysis (PCA)

Posted by Michael Schmid on Jun 26, 2007; 8:20pm
URL: http://imagej.273.s1.nabble.com/Need-advice-about-Principal-Component-Analysis-PCA-tp3698972p3698975.html

Hi Jim,

Here are answers to some of your questions:

Eigenvectors - you can multiply each of them individually by
any nonzero scalar you want. If A is a matrix, V an eigenvector
and e the corresponding eigenvalue,
   A V = e V
then A (cV) = e (cV)
where c is any nonzero constant.
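
As a quick numerical check of the relation above, here is a minimal
Python sketch. The matrix A, the vector V and the helper functions are
made up for illustration; they do not come from JAMA or BIJ.

```python
# Illustrative check that a scaled eigenvector is still an eigenvector.
# A, V, matvec and is_eigenpair are hypothetical names for this sketch.

def matvec(A, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def is_eigenpair(A, v, e, tol=1e-12):
    """True if A v equals e v within tol."""
    Av = matvec(A, v)
    return all(abs(a - e * x) <= tol for a, x in zip(Av, v))

A = [[2.0, 1.0],
     [1.0, 2.0]]   # symmetric, like a covariance matrix
V = [1.0, 1.0]     # an eigenvector of A
e = 3.0            # its eigenvalue: A V = 3 V

# Rescaling V (including a sign flip) leaves the eigenvalue unchanged.
for c in (1.0, -1.0, 0.5, 10.0):
    cV = [c * x for x in V]
    print(c, is_eigenpair(A, cV, e))
```

The same test fails if the eigenvalue is changed while the matrix
stays the same, which is the point of the next paragraph.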

Eigenvalues - these cannot be multiplied by an arbitrary value;
they are determined by the matrix A.
   http://en.wikipedia.org/wiki/Eigenvalue%2C_eigenvector_and_eigenspace

PCA packages usually find the largest eigenvalue first, then the
others in sequence of decreasing value. Different results for
different algorithms may arise from different normalization or
pre-processing of the data.
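
When comparing the output of two packages, it may help to bring both
results into a common form first. The sketch below is one possible
convention, not what any particular package does: sort by decreasing
eigenvalue, scale each eigenvector to unit length, and fix the sign so
that the component of largest magnitude is positive. The function
names and the two sample results pkg_a and pkg_b are invented for
illustration.

```python
import math

def canonical(pairs):
    """Sort (eigenvalue, eigenvector) pairs by decreasing eigenvalue,
    scale each vector to unit length, and flip its sign so that the
    component of largest magnitude is positive."""
    out = []
    for val, vec in pairs:
        norm = math.sqrt(sum(x * x for x in vec))
        unit = [x / norm for x in vec]
        pivot = max(unit, key=abs)
        if pivot < 0:
            unit = [-x for x in unit]
        out.append((val, unit))
    out.sort(key=lambda p: p[0], reverse=True)
    return out

def same(a, b, tol=1e-9):
    """Compare two canonical results within a tolerance."""
    return len(a) == len(b) and all(
        abs(va - vb) <= tol
        and all(abs(x - y) <= tol for x, y in zip(ua, ub))
        for (va, ua), (vb, ub) in zip(a, b))

# Hypothetical outputs of two packages: same eigenpairs, but with
# different ordering, scaling and sign.
pkg_a = [(1.0, [1.0, -2.0]), (4.0, [2.0, 1.0])]
pkg_b = [(4.0, [-4.0, -2.0]), (1.0, [-0.5, 1.0])]

print(same(canonical(pkg_a), canonical(pkg_b)))
```

If the eigenvalues themselves still differ after this step, the
difference is more likely in how the covariance matrix was computed.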

Calculations should be done after subtracting the mean (i.e.,
the average intensity of each channel).
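
A minimal sketch of that calculation in plain Python follows. The
function name, the sample data and the division by n - 1 (sample
covariance) are assumptions for illustration; some packages divide by
n instead, which rescales all eigenvalues by the same factor but does
not change the eigenvectors.

```python
def covariance_matrix(data):
    """Covariance of data given as a list of observations (rows),
    each a list of feature/channel values (columns).
    The mean is taken separately for each feature."""
    n = len(data)
    k = len(data[0])
    # Mean of each feature (column), not one mean over everything.
    means = [sum(row[j] for row in data) / n for j in range(k)]
    centered = [[row[j] - means[j] for j in range(k)] for row in data]
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1)
            for j in range(k)] for i in range(k)]
    return means, cov

# Three made-up RGB pixels (one row per pixel).
pixels = [[10.0, 20.0, 30.0],
          [12.0, 24.0, 29.0],
          [14.0, 28.0, 31.0]]

means, cov = covariance_matrix(pixels)
print(means)
for row in cov:
    print(row)
```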

For RGB images, you can spot the cases where principal component
analysis works well by rotating the 3D histogram: all data then
lie essentially in one plane (2 principal components) or along
a line (1 principal component).
   http://rsb.info.nih.gov/ij/plugins/color-inspector.html

Michael
________________________________________________________________

On 26 Jun 2007, at 18:26, Jim Cant wrote:

> Hi,
>
> I have a few questions (below) about Principal Component Analysis (PCA)
> which I am hoping someone will help me with.  I ask this because I'm
> trying two PCA packages* and they give different results.  My fear is
> that I know just enough to be dangerous.
>
> I haven't been able to find answers either on-line or in any of the
> linear algebra books in our library.
>
> Thanks for your help; it is greatly appreciated.
>
> Jim Cant
>
>
> I apologize for the length of the questions; I opted for clarity
> rather than brevity.
>
> 1.  Under what conditions are 2 sets of eigenvectors and associated
>     eigenvalues considered equal?
>
>     My hunch is that
>         1. If all corresponding eigenvectors are the same scalar
>            multiple of each other.
>         AND
>         2. If the ratio of corresponding eigenvalues from each set is
>            the same, i.e. are scalar multiples of each other
>         THEN
>         The results are equivalent.
>
>     1b.  What if #1 is relaxed to say that each pair of corresponding
>     eigenvectors are scalar multiples but the multiplier differs
>     for each pair?
>
>     1c.  What if the multiplier is the same for all pairs but sometimes
>     differs in sign?
>
> 2.  When calculating the covariance matrix, does one use the
>     deviation of each observation from the mean of all
>     observations for the feature or the mean of all observations
>     over all features?  From what I read, the first is the correct
>     approach but these two packages seem to differ.
>
> 3.  Does the order of the calculated eigenvectors have any
>     significance?  It seems they are often returned sorted by
>     eigenvalue.  I ask because in my data, each feature is an image
>     taken at a particular time interval after a perturbation giving
>     the data an inherent ordering.  I'm concerned that if I consider
>     the data after sorting, that it may be difficult to 'attribute' an
>     eigenvector to a particular underlying cause (if the sort order
>     changes).
>
> 4.  Can anyone point me to some data with the results of eigenvalue
>     analysis for the data?  This would help a lot in testing.  Even
>     better, is there a way to programmatically generate test data where
>     the eigenvectors/values are known?
>
> 5.  Are there any other packages to do PCA that you'd recommend?
>
>
> *  The two packages are
>        JAMA from NIST (http://math.nist.gov/javanumerics/jama/)
>    and
>        BIJ, Bio-medical Imaging in Java (http://bij.isi.uu.nl/)
>
>    I can only get the BIJ to agree with the JAMA if the raw data has
>    only 2 features and the mean of the observations is 0 (before
>    analysis.)  Looking at the BIJ source code, it appears that when
>    calculating the covariance matrix, the deviations are taken with
>    respect to the mean of all observations.  (Also, the calculation of
>    the mean itself appears suspect.)