Need advice about Principal Component Analysis (PCA)

Posted by Jim Cant on Jun 26, 2007; 5:26pm
URL: http://imagej.273.s1.nabble.com/Need-advice-about-Principal-Component-Analysis-PCA-tp3698972.html

Hi,

I have a few questions (below) about Principal Component Analysis (PCA) that
I am hoping someone can help me with.  I ask because I'm trying two
PCA packages* and they give different results.  My fear is that I
know just enough to be dangerous.

I haven't been able to find answers either online or in any of the
linear algebra books in our library.

Thanks for your help; it is greatly appreciated.

Jim Cant


I apologize for the length of the questions; I opted for clarity rather than brevity.

1.  Under what conditions are 2 sets of eigenvectors and associated
    eigenvalues considered equal?

    My hunch is that
        1. If all corresponding eigenvectors are scalar multiples of
           each other, with the same multiplier for every pair,
        AND
        2. If the ratio of corresponding eigenvalues from each set is
           the same, i.e. the eigenvalues are scalar multiples of each other,
        THEN
        the results are equivalent.

    1b.  What if #1 is relaxed to say that each pair of corresponding
    eigenvectors are scalar multiples but the multiplier differs
    for each pair?

    1c.  What if the multiplier is the same for all pairs but sometimes
    differs in sign?

2.  When calculating the covariance matrix, does one use the
    deviation of each observation from the mean of all
    observations for that feature, or from the mean of all observations
    over all features?  From what I have read, the first is the correct
    approach, but these two packages seem to differ.

3.  Does the order of the calculated eigenvectors have any
    significance?  It seems they are often returned sorted by
    eigenvalue.  I ask because in my data, each feature is an image
    taken at a particular time interval after a perturbation, giving
    the data an inherent ordering.  I'm concerned that if I consider
    the data after sorting, it may be difficult to 'attribute' an
    eigenvector to a particular underlying cause (if the sort order
    changes).

4.  Can anyone point me to some data with the results of eigenvalue
    analysis for the data?  This would help a lot in testing.  Even
    better, is there a way to programmatically generate test data where
    the eigenvectors/values are known?

5.  Are there any other packages to do PCA that you'd recommend?
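
To make questions 1 and 4 concrete, here is a minimal sketch (plain Python,
standard library only) of one possible way to build test data whose answer is
known in advance.  It is not taken from JAMA or BIJ.  The names
make_test_data, covariance and same_direction, the example eigenvectors and
eigenvalues, and the offset are all made up for illustration.  It assumes
the covariance uses per-feature means and an N-1 divisor.  For each chosen
orthonormal eigenvector v with eigenvalue lam, it adds the two observations
offset + c*v and offset - c*v, with c picked so that the sample covariance
comes out as the sum of lam * v v^T.

```python
import math

def make_test_data(vectors, values, offset):
    """Build 2*m observations whose sample covariance is sum(values[k] * v_k v_k^T).

    `vectors` must be orthonormal; `offset` is added to every observation so
    the per-feature means are non-zero and the centring step is exercised.
    """
    n_obs = 2 * len(vectors)
    data = []
    for v, lam in zip(vectors, values):
        c = math.sqrt(lam * (n_obs - 1) / 2.0)  # scale so the N-1 divisor yields lam
        data.append([o + c * x for o, x in zip(offset, v)])
        data.append([o - c * x for o, x in zip(offset, v)])
    return data

def covariance(data):
    """Sample covariance (divisor N-1), centring each feature on its own mean."""
    n, m = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(m)]
    return [[sum((row[i] - means[i]) * (row[j] - means[j]) for row in data) / (n - 1)
             for j in range(m)] for i in range(m)]

def same_direction(u, v, tol=1e-9):
    """True if u and v are non-zero scalar multiples of each other (either sign)."""
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    if nu == 0.0 or nv == 0.0:
        return False
    # Normalise both vectors, flip one if they point in opposite directions.
    sign = 1.0 if sum(x * y for x, y in zip(u, v)) >= 0 else -1.0
    return all(abs(x / nu - sign * y / nv) <= tol for x, y in zip(u, v))

# Hypothetical known answer: three orthonormal eigenvectors, eigenvalues 5, 2, 1.
VECTORS = [
    [1 / math.sqrt(3), 1 / math.sqrt(3), 1 / math.sqrt(3)],
    [1 / math.sqrt(2), -1 / math.sqrt(2), 0.0],
    [1 / math.sqrt(6), 1 / math.sqrt(6), -2 / math.sqrt(6)],
]
VALUES = [5.0, 2.0, 1.0]

data = make_test_data(VECTORS, VALUES, offset=[10.0, -4.0, 7.0])
cov = covariance(data)

# Check the defining property cov * v == lam * v for each known pair.
for v, lam in zip(VECTORS, VALUES):
    cov_v = [sum(cov[i][j] * v[j] for j in range(3)) for i in range(3)]
    assert all(abs(a - lam * b) < 1e-9 for a, b in zip(cov_v, v))
```

The rows of data can be fed to a PCA package and each returned eigenvector
compared with the known one using same_direction.  That test ignores length
and sign, because any non-zero multiple of an eigenvector is still an
eigenvector for the same eigenvalue.  If a package divides by N instead of
N-1, its eigenvalues should all come out scaled by the same factor (N-1)/N
while the eigenvector directions stay the same.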


*  The two packages are
       JAMA from NIST (http://math.nist.gov/javanumerics/jama/)
   and
       BIJ, Bio-medical Imaging in Java (http://bij.isi.uu.nl/)

   I can only get the BIJ to agree with the JAMA if the raw data has
   only 2 features and the mean of the observations is 0 (before
   analysis.)  Looking at the BIJ source code, it appears that when
   calculating the covariance matrix, the deviations are taken with
   respect to the mean of all observations.  (Also, the calculation of the
   mean itself appears suspect.)
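
To illustrate the difference between the two centring choices in question 2,
here is a small sketch in plain Python.  The data values and the helper name
scatter_about are invented for this example, and it is not meant to reproduce
what BIJ or JAMA actually do internally.  It computes the same
covariance-style matrix twice: once with deviations taken from each feature's
own mean, and once with deviations taken from the single mean over all
observations and all features.

```python
def scatter_about(data, centre):
    """Covariance-style matrix (divisor N-1) of deviations from `centre`."""
    n, m = len(data), len(data[0])
    return [[sum((row[i] - centre[i]) * (row[j] - centre[j]) for row in data) / (n - 1)
             for j in range(m)] for i in range(m)]

# Hypothetical data: 4 observations (rows) of 2 features (columns).
data = [[1.0, 10.0], [2.0, 12.0], [3.0, 14.0], [4.0, 16.0]]
n, m = len(data), len(data[0])

# One mean per feature (column).
feature_means = [sum(row[j] for row in data) / n for j in range(m)]
# A single mean over every value in the data set.
grand_mean = sum(sum(row) for row in data) / (n * m)

cov_per_feature = scatter_about(data, feature_means)
cov_grand = scatter_about(data, [grand_mean] * m)

print("feature means:", feature_means, "grand mean:", grand_mean)
print("per-feature centring:", cov_per_feature)
print("grand-mean centring: ", cov_grand)
```

The two matrices differ whenever the feature means are not all equal to the
grand mean.  They coincide when every feature mean equals the grand mean, for
instance when every feature already has mean 0, which would be consistent
with the two packages agreeing only on zero-mean data.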