ImageJ

Costes thresholding and Manders Colocalization Coefficients in Fiji/JaCoP

_‹ Previous Topic Next Topic _›

Classic

List

Threaded

4 messages Options

Birgit Möller

Costes thresholding and Manders Colocalization Coefficients in Fiji/JaCoP

Hi all,
I have a question regarding colocalization analysis. I was playing around a bit with Fiji's Coloc_2 and the JaCoP plugin. Amongst others I analyzed Manders Colocalization Coefficients for a pair of my images combined with automatic thresholding according to Costes. I noticed that in both plugins it makes a difference whether I calculate colocalization for image A against B or vice versa. I guess this is due to the fact that Costes thresholds are different depending on which image is chosen as the first. However, I am lacking an intuitive explanation why Manders colocalization, or to be more precise Costes thresholds, should behave like this. Of course there is a mathematical explanation why this happens (which is linked to the linear regression being part of Costes algorithm), but intuitively I would have expected that thresholds should be the same and colocalization results be commutative, i.e. that Manders coefficients are just swapped by changing the order of the images. Does anyone have an explanation why this idea is wrong and colocalization should/could be non-symmetric? This is of course linked to the question of which image to select as the first one if order matters. Is there any rule for that?
Cheers,

Birgit

--
ImageJ mailing list: http://imagej.nih.gov/ij/list.html

Tom Kazimiers

Re: Costes thresholding and Manders Colocalization Coefficients in Fiji/JaCoP

Hi Birgit,

On Thu, Aug 15, 2013 at 03:18:18AM -0400, Birgit Möller wrote:

> [...] it makes a difference whether I calculate colocalization for
> image A against B or vice versa. I guess this is due to the fact that
> Costes thresholds are different depending on which image is chosen as
> the first. However, I am lacking an intuitive explanation why Manders
> colocalization, or to be more precise Costes thresholds, should behave
> like this. Of course there is a mathematical explanation why this
> happens (which is linked to the linear regression being part of Costes
> algorithm), but intuitively I would have expected that thresholds
> should be the same and colocalization results be commutative, i.e.
> that Manders coefficients are just swapped by changing the order of
> the images. Does anyone have an explanation why this idea is wrong and
> colocalization should/could be non-symmetric? This is of course linked
> to the question of which image to select as the first one if order
> matters. Is there any rule for that?

The *unthresholded* Manders coefficients M1(A,B) and M2(A,B) should indeed
be the same as M2(B,A) and M1(B,A). Coloc 2 should do exactly that. This
is also what the mathematical formulation suggests: Let R_i and G_i be
the grey values of voxel i of the first and second image. Then the
Manders coefficients are:

M1 = Sum(R_i_coloc)/Sum(R_i),
M2 = Sum(G_i_coloc)/Sum(G_i),

where R_i_coloc = R_i if G_i > 0, R_i_coloc = 0 if G_i = 0,
where G_i_coloc = G_i if R_i > 0, G_i_coloc = 0 if R_i = 0.

However, just like you said, things are different for the thresholded
version of M1 and M2. There, a threshold for each channel exists, T_r
and T_g and the previous equations of R_i_coloc and G_i_coloc change:

R_i_coloc = R_i if G_i > 0 and R_i >= T_r,
G_i_coloc = G_i if R_i > 0 and G_i >= T_g.

So yes, the unsymmetric behaviour originates from T_r and T_g and
depends on how these are calculated. This is done in the following way,
described in more detail in Costes' paper:

In a coordinate system the X axis represents the threshold of channel 1
and the Y axis represents the threshold of channel 2. Meaningful as
threshold is therefore basically the first quadrant where both axes
reach from the image data type's minimum to its maximum.

Next, the slope as well as the y-intercept of a regression line is
calculated. This determines the ratio between both channel's thresholds.
This should be symmetric and mirrored at the x=y line (as you can see in
the scatter plot of Coloc 2's result window).

To find good thresholds, the algorithm starts at the "top" of the
regression line and goes down until some termination criteria is met,
with "top" being the half between channel one's minimum and maximum
value. Going done that line, the thresholds are used in Pearson's R
value calculation on the channel one and two images. The threshold for
channel two is is determined by the value of the regression line at
x=threshold one. If Pearson's R is below zero the regression went to
far and the threshold is increased half way to the last threshold. If
not, the next candidate point on the regression line is chosen by
lowering channel one's threshold by half the distance to the last
threshold candidate (for the first step, the last value is channel one's
maximum). This is done until the difference between the current
threshold and the last threshold candidate is below one (or a maximum
number of iteration is reached).

As an example, below are the regression steps for the clown sample
image's blue (A) and green (B) channels. The "ThM1" and "ThM2" variables
are the actual threshold values along the regression line. The "Th1" and
"Th2" values are the threshold values actually used. This distinction is
made because the image's data type might not support all threshold
calculated values. In fact, this distinction is currently missing in the
Coloc 2 version available in Fiji and I just fixed it---thanks pointing
me to it!In the currently available version, an overflow happens due to
that: When obtaining negative threshold values, the threshold values
used for comparing against the image data, got high again (comparable to
modulo...). Anyway, I'll commit this fix soon (and ask Johannes to
rebuild and publish a new version of Coloc 2).

A-B:
ThM1: 117.0 ThM2: 167.0
ThM1: 59.0 ThM2: 94.0
ThM1: 29.0 ThM2: 56.0
ThM1: 15.0 ThM2: 39.0
ThM1: 7.0 ThM2: 29.0
ThM1: 4.0 ThM2: 25.0
ThM1: 5.0 ThM2: 26.0
ThM1: 6.0 ThM2: 27.0

Th1: 117 Th2: 167 R: 0.799439570187088
Th1: 59 Th2: 94 R: 0.6632496911761505
Th1: 29 Th2: 56 R: 0.5938737496445203
Th1: 15 Th2: 39 R: 0.3223429912079611
Th1: 7 Th2: 29 R: 0.049602542237266276
Th1: 4 Th2: 25 R: -0.038579831443613474
Th1: 5 Th2: 26 R: -0.019124428435020106
Th1: 6 Th2: 27 R: 0.0052275864081140514

B-A:
ThM1: 128.0 ThM2: 86.0
ThM1: 64.0 ThM2: 35.0
ThM1: 32.0 ThM2: 10.0
ThM1: 16.0 ThM2: -3.0
ThM1: 8.0 ThM2: -9.0
ThM1: 4.0 ThM2: -12.0
ThM1: 2.0 ThM2: -14.0
ThM1: 1.0 ThM2: -15.0

Th1: 128 Th2: 86 R: 0.7228650368812359
Th1: 64 Th2: 35 R: 0.624966176923806
Th1: 32 Th2: 10 R: 0.1412482937813529
Th1: 16 Th2: 0 R: 0.4092932076472922
Th1: 8 Th2: 0 R: 0.2861263999905735
Th1: 4 Th2: 0 R: 0.30601527024083536
Th1: 2 Th2: 0 R: 0.26365092589158734

The fact that there a different starting thresholds for each combination
and the inversion regression line leads to a different pattern on how
different thresholds are tried. However, it should be possible to find a
pattern to go down the regression line, regardless of the channel
ordering. If I am not mistaken, this should then produce the same
thresholds for both combinations. The current pattern is: decrease the
current threshold candidate in each iteration by half the distance to
the last threshold candidate. If Person's R is below zero, go up again
half the distance. We could change that into using the distance on the
regression line itself, and not in the threshold dimension of image one
(like done now). Do you see any downsides of this?

An alternative might be, that we don't allow Person's R values for the
regression that go up again (if not coming from a negative value). For
instance, in the A-B log from above, we see:

Th1: 29 Th2: 56 R: 0.5938737496445203
Th1: 15 Th2: 39 R: 0.3223429912079611
Th1: 7 Th2: 29 R: 0.049602542237266276

On the B-A log this is a similar area:

Th1: 64 Th2: 35 R: 0.624966176923806
Th1: 32 Th2: 10 R: 0.1412482937813529
Th1: 16 Th2: 0 R: 0.4092932076472922

On the B-A side, Person's R is going up again. And therefore going over
the minima where A-B stops.

So for the current implementation, the thresholds are only independent
of order if the regression start value is the same for both ways and if
the pattern of moving down the regression line produces the same
threshold combinations for both, A-B and B-A. Otherwise, they will be
different for both combinations.

Now to not confuse terms, there is also another method used by Coloc 2
which is based on a paper of Costes: A statistical significance test. It
looks at image one in terms of a set of (hyper-)cubes. Those blocks get
shuffled around a couple of times (user setting) and then Person's R
value is calculated with the shuffled image one against the unshuffled
image two in each iteration. In the end Coloc 2 will show the mean and
variance of those R values as well as Costes P value. This is the
quantile of the original Pearson's R value (with unshuffled images) in
the distribution of the shuffle results. It tells you the likelihood
that the original R value is part of the shuffled distribution. And
therefore how meaningful the original value might be.

Since Costes shuffles only one image and collects Person's R values
against the other channel, it is also dependent on what is image one and
image two. The shuffled distribution might differ and therefore, Costes
P value could differ, too.

I hope this clears things up a bit.

Cheers,
Tom

--
ImageJ mailing list: http://imagej.nih.gov/ij/list.html

Birgit Möller

Re: Costes thresholding and Manders Colocalization Coefficients in Fiji/JaCoP

In reply to this post by Birgit Möller

Hi Tom,
thanks for this very detailed answer!

On Fri, 16 Aug 2013 15:09:57 +0200, Tom Kazimiers <[hidden email]> wrote:

>Hi Birgit,
>
>On Thu, Aug 15, 2013 at 03:18:18AM -0400, Birgit M�ller wrote:
>> [...] it makes a difference whether I calculate colocalization for
>> image A against B or vice versa. I guess this is due to the fact that
>> Costes thresholds are different depending on which image is chosen as
>> the first. However, I am lacking an intuitive explanation why Manders
>> colocalization, or to be more precise Costes thresholds, should behave
>> like this. Of course there is a mathematical explanation why this
>> happens (which is linked to the linear regression being part of Costes
>> algorithm), but intuitively I would have expected that thresholds
>> should be the same and colocalization results be commutative, i.e.
>> that Manders coefficients are just swapped by changing the order of
>> the images. Does anyone have an explanation why this idea is wrong and
>> colocalization should/could be non-symmetric? This is of course linked
>> to the question of which image to select as the first one if order
>> matters. Is there any rule for that?
>
>The *unthresholded* Manders coefficients M1(A,B) and M2(A,B) should indeed
>be the same as M2(B,A) and M1(B,A). Coloc 2 should do exactly that. This
>is also what the mathematical formulation suggests: Let R_i and G_i be
>the grey values of voxel i of the first and second image. Then the
>Manders coefficients are:
>
> M1 = Sum(R_i_coloc)/Sum(R_i),
> M2 = Sum(G_i_coloc)/Sum(G_i),
>
> where R_i_coloc = R_i if G_i > 0, R_i_coloc = 0 if G_i = 0,
> where G_i_coloc = G_i if R_i > 0, G_i_coloc = 0 if R_i = 0.
>
>However, just like you said, things are different for the thresholded
>version of M1 and M2. There, a threshold for each channel exists, T_r
>and T_g and the previous equations of R_i_coloc and G_i_coloc change:
>
> R_i_coloc = R_i if G_i > 0 and R_i >= T_r,
> G_i_coloc = G_i if R_i > 0 and G_i >= T_g.
>
>So yes, the unsymmetric behaviour originates from T_r and T_g and
>depends on how these are calculated. This is done in the following way,
>described in more detail in Costes' paper:
>
>In a coordinate system the X axis represents the threshold of channel 1
>and the Y axis represents the threshold of channel 2. Meaningful as
>threshold is therefore basically the first quadrant where both axes
>reach from the image data type's minimum to its maximum.

>Next, the slope as well as the y-intercept of a regression line is
>calculated. This determines the ratio between both channel's thresholds.
>This should be symmetric and mirrored at the x=y line (as you can see in
>the scatter plot of Coloc 2's result window).

You mean that swapping the images causes the regression line to be mirrored at the line x=y? Yes, I fully agree. Anyway - maybe not important for you, but for the JaCoP people - this behaviour cannot be reproduced in JaCoP. I think this is due to the fact that they do not apply an orthogonal regression, but an ordinary linear least squares approach with yields completely independent results for doing a regression A against B vs. B against A. So the differences in the results for A vs. B and B vs. A are much larger in JaCoP.

>To find good thresholds, the algorithm starts at the "top" of the
>regression line and goes down until some termination criteria is met,
>with "top" being the half between channel one's minimum and maximum
>value. Going done that line, the thresholds are used in Pearson's R
>value calculation on the channel one and two images. The threshold for
>channel two is is determined by the value of the regression line at
>x=threshold one. If Pearson's R is below zero the regression went to
>far and the threshold is increased half way to the last threshold. If
>not, the next candidate point on the regression line is chosen by
>lowering channel one's threshold by half the distance to the last
>threshold candidate (for the first step, the last value is channel one's
>maximum). This is done until the difference between the current
>threshold and the last threshold candidate is below one (or a maximum
>number of iteration is reached).
>
>As an example,
>[...]
>The fact that there a different starting thresholds for each combination
>and the inversion regression line leads to a different pattern on how
>different thresholds are tried. However, it should be possible to find a
>pattern to go down the regression line, regardless of the channel
>ordering. If I am not mistaken, this should then produce the same
>thresholds for both combinations.

Ok, I see. But, just to clarify, this means that you actually implemented some kind of speed-up heuristic for finding the thresholds which causes the differences in the thresholds, right? I think that the algorithm as originally described in Costes paper would more or less naively check all thresholds on the line starting with the channel maximum of the first image and decreasing the threshold by one in each step. And following this scheme, which is of course far more time-consuming, the resulting thresholds should be the same. Do you agree?

> The current pattern is: decrease the
>current threshold candidate in each iteration by half the distance to
>the last threshold candidate. If Person's R is below zero, go up again
>half the distance. We could change that into using the distance on the
>regression line itself, and not in the threshold dimension of image one
>(like done now). Do you see any downsides of this?

No, if in addition the corresponding starting point on the mirrored regression line is used this should yield equivalent results.

>An alternative might be, that we don't allow Person's R values for the
>regression that go up again (if not coming from a negative value). For
>instance, in the A-B log from above, we see:
>
> Th1: 29 Th2: 56 R: 0.5938737496445203
> Th1: 15 Th2: 39 R: 0.3223429912079611
> Th1: 7 Th2: 29 R: 0.049602542237266276
>
>On the B-A log this is a similar area:
>
> Th1: 64 Th2: 35 R: 0.624966176923806
> Th1: 32 Th2: 10 R: 0.1412482937813529
> Th1: 16 Th2: 0 R: 0.4092932076472922
>
>On the B-A side, Person's R is going up again. And therefore going over
>the minima where A-B stops.
>
>So for the current implementation, the thresholds are only independent
>of order if the regression start value is the same for both ways and if
>the pattern of moving down the regression line produces the same
>threshold combinations for both, A-B and B-A. Otherwise, they will be
>different for both combinations.
>[...]
>
>I hope this clears things up a bit.

Yes, it does, thanks again :-)

>Cheers,
>Tom

Cheers,

Birgit

--
ImageJ mailing list: http://imagej.nih.gov/ij/list.html

Tom Kazimiers

Re: Costes thresholding and Manders Colocalization Coefficients in Fiji/JaCoP

Hi Birgit,

On Fri, Aug 16, 2013 at 09:57:19AM -0400, Birgit Möller wrote:
> thanks for this very detailed answer!

you are welcome. :-)

> On Fri, 16 Aug 2013 15:09:57 +0200, Tom Kazimiers <[hidden email]> wrote:
> >On Thu, Aug 15, 2013 at 03:18:18AM -0400, Birgit Möller wrote:
> >> [...] it makes a difference whether I calculate colocalization for
> >> image A against B or vice versa. I guess this is due to the fact that
> >> Costes thresholds are different depending on which image is chosen as
> >> the first. However, I am lacking an intuitive explanation why Manders
> >> colocalization, or to be more precise Costes thresholds, should behave
> >> like this. Of course there is a mathematical explanation why this
> >> happens (which is linked to the linear regression being part of Costes
> >> algorithm), but intuitively I would have expected that thresholds
> >> should be the same and colocalization results be commutative, i.e.
> >> that Manders coefficients are just swapped by changing the order of
> >> the images. Does anyone have an explanation why this idea is wrong and
> >> colocalization should/could be non-symmetric? This is of course linked
> >> to the question of which image to select as the first one if order
> >> matters. Is there any rule for that?
> >
> >The *unthresholded* Manders coefficients M1(A,B) and M2(A,B) should indeed
> >be the same as M2(B,A) and M1(B,A). Coloc 2 should do exactly that. This
> >is also what the mathematical formulation suggests: Let R_i and G_i be
> >the grey values of voxel i of the first and second image. Then the
> >Manders coefficients are:
> >
> > M1 = Sum(R_i_coloc)/Sum(R_i),
> > M2 = Sum(G_i_coloc)/Sum(G_i),
> >
> > where R_i_coloc = R_i if G_i > 0, R_i_coloc = 0 if G_i = 0,
> > where G_i_coloc = G_i if R_i > 0, G_i_coloc = 0 if R_i = 0.
> >
> >However, just like you said, things are different for the thresholded
> >version of M1 and M2. There, a threshold for each channel exists, T_r
> >and T_g and the previous equations of R_i_coloc and G_i_coloc change:
> >
> > R_i_coloc = R_i if G_i > 0 and R_i >= T_r,
> > G_i_coloc = G_i if R_i > 0 and G_i >= T_g.
> >
> >So yes, the unsymmetric behaviour originates from T_r and T_g and
> >depends on how these are calculated. This is done in the following way,
> >described in more detail in Costes' paper:
> >
> >In a coordinate system the X axis represents the threshold of channel 1
> >and the Y axis represents the threshold of channel 2. Meaningful as
> >threshold is therefore basically the first quadrant where both axes
> >reach from the image data type's minimum to its maximum.
>
> >Next, the slope as well as the y-intercept of a regression line is
> >calculated. This determines the ratio between both channel's thresholds.
> >This should be symmetric and mirrored at the x=y line (as you can see in
> >the scatter plot of Coloc 2's result window).
>
> You mean that swapping the images causes the regression line to be
> mirrored at the line x=y? Yes, I fully agree. Anyway - maybe not
> important for you, but for the JaCoP people - this behaviour cannot be
> reproduced in JaCoP. I think this is due to the fact that they do not
> apply an orthogonal regression, but an ordinary linear least squares
> approach with yields completely independent results for doing a
> regression A against B vs. B against A. So the differences in the
> results for A vs. B and B vs. A are much larger in JaCoP.

I, too, think it would be good for auto the thresholding used with
colocalization to yield the same results for A-B and B-A.

> >To find good thresholds, the algorithm starts at the "top" of the
> >regression line and goes down until some termination criteria is met,
> >with "top" being the half between channel one's minimum and maximum
> >value. Going done that line, the thresholds are used in Pearson's R
> >value calculation on the channel one and two images. The threshold for
> >channel two is is determined by the value of the regression line at
> >x=threshold one. If Pearson's R is below zero the regression went to
> >far and the threshold is increased half way to the last threshold. If
> >not, the next candidate point on the regression line is chosen by
> >lowering channel one's threshold by half the distance to the last
> >threshold candidate (for the first step, the last value is channel one's
> >maximum). This is done until the difference between the current
> >threshold and the last threshold candidate is below one (or a maximum
> >number of iteration is reached).
> >
> >As an example,
> >[...]
> >The fact that there a different starting thresholds for each combination
> >and the inversion regression line leads to a different pattern on how
> >different thresholds are tried. However, it should be possible to find a
> >pattern to go down the regression line, regardless of the channel
> >ordering. If I am not mistaken, this should then produce the same
> >thresholds for both combinations.
>
> Ok, I see. But, just to clarify, this means that you actually
> implemented some kind of speed-up heuristic for finding the thresholds
> which causes the differences in the thresholds, right?

Yes, this correct.

> I think that the algorithm as originally described in Costes paper
> would more or less naively check all thresholds on the line starting
> with the channel maximum of the first image and decreasing the
> threshold by one in each step. And following this scheme, which is of
> course far more time-consuming, the resulting thresholds should be the
> same. Do you agree?

I agree. I could easily add a checkbox to the plugin's setup dialog to
ask the user if (s)he wants to use the speeded-up version or the naive
one. Do you think this would be a good addition?

> > The current pattern is: decrease the
> >current threshold candidate in each iteration by half the distance to
> >the last threshold candidate. If Person's R is below zero, go up again
> >half the distance. We could change that into using the distance on the
> >regression line itself, and not in the threshold dimension of image one
> >(like done now). Do you see any downsides of this?
>
> No, if in addition the corresponding starting point on the mirrored
> regression line is used this should yield equivalent results.

Yes. So in case we we would always need to start at the point where we would
catch the maximum values of each channel, at least when we start at the
"top". This is the point where the threshold of the channel with the
largest maximum value is at exactly at that point, i.e. it is mirrored
as well. More general (if we don't start at the "top"), it is the
corresponding starting point, just like you said.

And do you think this should be optional? I think it shouldn't, but we
should enforce this behavior. Do you agree? For this is wouldn't matter
if the naive or the speeded-up regression would be used. It would,
however, make the regression more robust.

Cheers,
Tom

--
ImageJ mailing list: http://imagej.nih.gov/ij/list.html