Posted by
Jim Cant on
Jun 26, 2007; 5:26pm
URL: http://imagej.273.s1.nabble.com/Need-advice-about-Principal-Component-Analysis-PCA-tp3698972.html
Hi,
I have a few questions (below) about Principal Component Analysis (PCA) which
I am hoping someone will help me with. I ask because I'm trying two
PCA packages* and they give different results. My fear is that I
know just enough to be dangerous.
I haven't been able to find answers either on-line or in any of the
linear algebra books in our library.
Thanks for your help; it is greatly appreciated.
Jim Cant
I apologize for the length of the questions; I opted for clarity rather than brevity.
1. Under what conditions are 2 sets of eigenvectors and associated
eigenvalues considered equal?
My hunch is that
1. If each eigenvector in one set is the same scalar multiple of
the corresponding eigenvector in the other set.
AND
2. If the ratio of corresponding eigenvalues from each set is
the same, i.e. the eigenvalues in one set are a common scalar
multiple of those in the other
THEN
The results are equivalent.
1b. What if #1 is relaxed to say that each pair of corresponding
eigenvectors are scalar multiples but the multiplier differs
for each pair?
1c. What if the multiplier is the same for all pairs but sometimes
differs in sign?
2. When calculating the covariance matrix, does one use the
deviation of each observation from the mean of all
observations for that feature, or from the mean of all observations
over all features? From what I read, the first is the correct
approach but these two packages seem to differ.
3. Does the order of the calculated eigenvectors have any
significance? It seems they are often returned sorted by
eigenvalue. I ask because in my data, each feature is an image
taken at a particular time interval after a perturbation, giving
the data an inherent ordering. I'm concerned that if I consider
the data after sorting, it may be difficult to 'attribute' an
eigenvector to a particular underlying cause (if the sort order
changes).
4. Can anyone point me to some data with the results of eigenvalue
analysis for the data? This would help a lot in testing. Even
better, is there a way to programmatically generate test data where
the eigenvectors/values are known?
5. Are there any other packages to do PCA that you'd recommend?
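To make questions 1 and 4 concrete, here is a minimal sketch in plain Python. It is not JAMA or BIJ code, and every function name in it (covariance, make_test_data, mat_vec, parallel) is made up for illustration. It builds a tiny 2-feature data set whose covariance eigenvectors and eigenvalues are known by construction, and it compares two vectors while ignoring scale and sign, on the assumption that an eigenvector is only determined up to a nonzero scalar factor.

```python
import math

def covariance(data):
    """Sample covariance (divides by n - 1); rows are observations,
    columns are features, deviations are taken from each feature's mean."""
    n, p = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(p)]
    return [[sum((row[i] - means[i]) * (row[j] - means[j]) for row in data) / (n - 1)
             for j in range(p)] for i in range(p)]

def make_test_data(v1, v2, a, b):
    """Four observations at +/- a*v1 and +/- b*v2, with v1, v2 orthonormal.
    The mean is zero and the sample covariance has eigenvectors v1, v2
    with eigenvalues 2*a*a/3 and 2*b*b/3."""
    return [[a * v1[0], a * v1[1]], [-a * v1[0], -a * v1[1]],
            [b * v2[0], b * v2[1]], [-b * v2[0], -b * v2[1]]]

def mat_vec(m, v):
    """Matrix-vector product for small list-of-lists matrices."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def parallel(u, v, tol=1e-9):
    """True if u and v are nonzero scalar multiples (any scale, either sign)."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return nu > 0 and nv > 0 and abs(abs(dot) - nu * nv) <= tol * nu * nv

# Known directions: an orthonormal pair rotated by 30 degrees.
theta = math.pi / 6
v1 = [math.cos(theta), math.sin(theta)]
v2 = [-math.sin(theta), math.cos(theta)]

data = make_test_data(v1, v2, a=3.0, b=1.0)
cov = covariance(data)

# cov * v1 should be parallel to v1, and likewise for v2.
print(parallel(mat_vec(cov, v1), v1), parallel(mat_vec(cov, v2), v2))
```

A package's output could then be checked by calling parallel() on each returned eigenvector against v1 or v2. A common factor between two packages' eigenvalues would, I assume, point to a differently scaled covariance matrix (for example division by n instead of n - 1) rather than to different eigenvectors.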
* The two packages are
JAMA from NIST (http://math.nist.gov/javanumerics/jama/)
and
BIJ, Bio-medical Imaging in Java (http://bij.isi.uu.nl/)
I can only get BIJ to agree with JAMA if the raw data has
only 2 features and the mean of the observations is 0 (before
analysis). Looking at the BIJ source code, it appears that when
calculating the covariance matrix, the deviations are taken with
respect to the mean of all observations over all features. (Also, the
calculation of the mean itself appears suspect.)
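
The two centerings from question 2 can be sketched as follows. This is plain Python written for illustration, not code taken from BIJ or JAMA; the function names are hypothetical, and the "grand mean" variant only mirrors my reading of the BIJ source, which may be wrong.

```python
def feature_means(data):
    """Mean of each column (feature) over all observations (rows)."""
    n = len(data)
    return [sum(row[j] for row in data) / n for j in range(len(data[0]))]

def grand_mean(data):
    """Single mean over every value in the data set."""
    values = [x for row in data for x in row]
    return sum(values) / len(values)

def covariance_about(data, centers):
    """Covariance-style matrix with deviations taken from centers[j]
    for feature j, divided by n - 1."""
    n, p = len(data), len(data[0])
    return [[sum((row[i] - centers[i]) * (row[j] - centers[j]) for row in data) / (n - 1)
             for j in range(p)] for i in range(p)]

# Three observations of two features with very different means.
data = [[1.0, 10.0], [2.0, 12.0], [3.0, 14.0]]

per_feature = covariance_about(data, feature_means(data))
grand = covariance_about(data, [grand_mean(data)] * 2)

# Same data shifted so that each feature has mean 0.
centered = [[-1.0, -2.0], [0.0, 0.0], [1.0, 2.0]]
per_feature_0 = covariance_about(centered, feature_means(centered))
grand_0 = covariance_about(centered, [grand_mean(centered)] * 2)

print(per_feature)  # [[1.0, 2.0], [2.0, 4.0]]
print(grand)        # [[38.5, -35.5], [-35.5, 41.5]]
print(per_feature_0 == grand_0)  # True
```

With the raw data the two matrices differ, even in the sign of the off-diagonal term. Once every feature has mean 0 the two centerings coincide, which would be consistent with the agreement I see for zero-mean data.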