
Re: Curve Fitting issue/question

Posted by Michael Schmid on Jun 26, 2017; 10:01am
URL: http://imagej.273.s1.nabble.com/Curve-Fitting-issue-question-tp5018961p5018967.html

Hi Rainer,

 > I tend to use Gaussian equation, which works well so far. But it seems
 > that sometimes, I get undesired results, like the one from the
 > attached macro, where -995468 as curve point is calculated.

After analyzing your case, I am quite confident that this is not a bug.
The reason for the "strange" fit result is the following:
Your data set is essentially a flat curve with a few outliers at
significantly more negative values than the rest. There are many
equivalent solutions for a best fit; all of them are very narrow
Gaussians that are essentially constant everywhere except for one
outlier point, which is fitted very well by the Gaussian.
With just one outlier point defining the Gaussian, the curve fitter can
put the peak of the Gaussian essentially anywhere in between the
surrounding data points; the peak does not necessarily have to be
exactly at the outlier point. If the peak position is a bit off the
outlier point, that point will be somewhere on the slope of the
Gaussian. If the point happens to be rather far from the peak of the
Gaussian, the Gaussian has to be very high to accurately fit the outlier
point.

With your data set, the outlier is at 23, and the CurveFitter happens to
put the peak of the Gaussian at 23.65, which is still far from the
adjacent points. I have attached a high-resolution plot. It shows that
the outlier point at 23 is exactly on the fitted curve, and it makes no
difference for the adjacent points whether one shifts the Gaussian a bit
to the left or right (but this would make a huge difference for the
Gaussian's height).
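
To illustrate why the height explodes, here is a minimal Python sketch. It assumes the Gaussian has the form y = a + (b-a)*exp(-(x-c)^2/(2*d^2)); the baseline, the outlier value and the widths d are made-up numbers, not taken from the attached macro. Only the outlier position (23) and the peak position (23.65) come from the case above.

```python
import math

def gaussian(x, a, b, c, d):
    """Gaussian in the form assumed here: a + (b - a) * exp(-(x - c)^2 / (2 d^2))."""
    return a + (b - a) * math.exp(-((x - c) ** 2) / (2.0 * d * d))

def peak_needed(x_out, y_out, a, c, d):
    """Peak value b needed so that the curve passes exactly through (x_out, y_out)."""
    attenuation = math.exp(-((x_out - c) ** 2) / (2.0 * d * d))
    return a + (y_out - a) / attenuation

baseline = 0.0     # assumed flat level of the data (hypothetical)
outlier_y = -10.0  # assumed value of the outlier at x = 23 (hypothetical)

for d in (0.5, 0.2, 0.15):  # assumed widths, all narrower than the point spacing
    b = peak_needed(23.0, outlier_y, baseline, 23.65, d)
    # The curve still passes through the outlier, whatever the width.
    assert abs(gaussian(23.0, baseline, b, 23.65, d) - outlier_y) < 1e-6
    print("d =", d, " required peak value b =", b)
```

The narrower the Gaussian, the further down its slope the outlier sits, so the required peak value grows very quickly while the fit quality at the data points stays the same.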

By the way, there is also another local solution to your fitting
problem, which is a rather broad peak at c=29.5, with a width of d=14.5,
but that one is worse than fitting one of the outliers by putting a very
sharp Gaussian there (sum of residuals squared = 1763.3 vs. 1559.8,
correlation coefficient 0.043 vs. 0.117).

To avoid such problems, one could think of doing a fit with a 'penalty
function' for such high values, e.g., something in the sense of maximum
entropy methods, but this is not implemented in ImageJ and it would be
rather difficult to do so.
   https://en.wikipedia.org/wiki/Principle_of_maximum_entropy

By the way, the fitting process is based on minimizing the least-squares
deviation with the Nelder-Mead method, which is rather robust even for
badly conditioned minimization problems
   https://en.wikipedia.org/wiki/Nelder%E2%80%93Mead_method
In addition, for more robust fitting, it eliminates up to two parameters
by linear regression (in case of the Gaussian, the baseline and height
of the Gaussian, so only two fit parameters remain to be determined by
the Nelder-Mead method).
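
The idea of eliminating two parameters can be sketched as follows. This is not ImageJ's actual code, just an illustration in Python: for fixed position c and width d, the model y = a + h*g(x) with g(x) = exp(-(x-c)^2/(2*d^2)) is linear in the baseline a and the height h, so both follow from ordinary linear regression. The function name and the test data are hypothetical.

```python
import math

def fit_offset_and_height(xs, ys, c, d):
    """For fixed c and d, return (a, h, ssr) for the model y = a + h * g(x),
    where g(x) = exp(-(x - c)^2 / (2 d^2)) and ssr is the sum of squared residuals."""
    g = [math.exp(-((x - c) ** 2) / (2.0 * d * d)) for x in xs]
    n = len(xs)
    g_mean = sum(g) / n
    y_mean = sum(ys) / n
    sgg = sum((gi - g_mean) ** 2 for gi in g)
    sgy = sum((gi - g_mean) * (yi - y_mean) for gi, yi in zip(g, ys))
    h = sgy / sgg if sgg > 0 else 0.0   # slope of y versus g
    a = y_mean - h * g_mean             # intercept, i.e. the baseline
    ssr = sum((a + h * gi - yi) ** 2 for gi, yi in zip(g, ys))
    return a, h, ssr

# Synthetic, noise-free data from a known Gaussian (a = 1, h = 5, c = 10, d = 2).
xs = [float(i) for i in range(21)]
ys = [1.0 + 5.0 * math.exp(-((x - 10.0) ** 2) / (2.0 * 2.0 * 2.0)) for x in xs]
a, h, ssr = fit_offset_and_height(xs, ys, 10.0, 2.0)
print(a, h, ssr)
```

An outer minimizer (Nelder-Mead in ImageJ's case) then only has to search over c and d, calling such a routine for each trial pair and using the returned sum of squared residuals as the value to minimize.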
---

In your case, maybe you could detect whether the r-squared of the
Gaussian fit is too bad (say, below 0.4), or if the width ('d'
parameter) of the Gaussian is too narrow, and if so, use a different fit
function such as a fourth-order polynomial? It won't give you a much
better fit, but it can't run into the problem of a Gaussian trying to
catch a single outlier point.
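
A minimal sketch of such a check, written in Python rather than the macro language. The 0.4 limit is the one suggested above; requiring the width to be at least one point spacing is only an assumption, and both function names are hypothetical.

```python
def r_squared(ys, fitted):
    """Coefficient of determination of fitted values against the data."""
    mean = sum(ys) / len(ys)
    ss_tot = sum((y - mean) ** 2 for y in ys)
    ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    return 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0

def gaussian_fit_is_usable(r2, d, x_spacing, min_r2=0.4, min_width_in_steps=1.0):
    """Accept the Gaussian fit only if it explains enough of the data and is
    not narrower than the assumed minimum width (in units of the x spacing)."""
    return r2 >= min_r2 and abs(d) >= min_width_in_steps * x_spacing

# Example decision for a poor, very narrow fit on data with x spacing 1:
if gaussian_fit_is_usable(r2=0.1, d=0.2, x_spacing=1.0):
    print("keep the Gaussian fit")
else:
    print("fall back to another fit function, e.g. a fourth-order polynomial")
```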

---

The second of your questions is easier to answer:
 > Just out of curiosity. Is there an equation or function to interpolate
 > only missing positions of a curve, where each given point is kept?

You can do linear interpolation
   https://en.wikipedia.org/wiki/Linear_interpolation
cubic interpolation, splines, etc.
   https://en.wikipedia.org/wiki/Spline_(mathematics)
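
For the linear case with equidistant points, filling only the missing positions could look like the Python sketch below. It assumes that invalid positions are marked with None and that gaps at the ends should simply take the nearest valid value; the function name is hypothetical.

```python
def fill_missing_linear(values):
    """Return a copy of values where None entries are linearly interpolated
    between the nearest valid neighbours. Valid entries are kept unchanged;
    gaps at either end are filled with the nearest valid value."""
    result = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        return result  # nothing to interpolate from
    for i in range(len(result)):
        if result[i] is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            result[i] = values[right]
        elif right is None:
            result[i] = values[left]
        else:
            t = (i - left) / (right - left)  # fractional position inside the gap
            result[i] = values[left] + t * (values[right] - values[left])
    return result

print(fill_missing_linear([1.0, None, 3.0, None, None, 6.0]))
```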

In ImageJ, if you have equidistant points simply create an image with a
height of one pixel and the data as pixel values, then you can use the
macro function
   getPixel(x, 0)
to get the interpolated value (linear by default; setOption("bicubic",
true) for cubic interpolation).

ImageJ uses splines for 'rounding' segmented line rois, but I am not
aware of a macro interface for its built-in SplineFitter. One could
probably access it via JavaScript.


Michael
________________________________________________________________


On 24/06/2017 16:37, Rainer M. Engel wrote:

> Hello everyone,
>
> I'm working with curve fitting, to smooth data but in the first place to
> fill invalid positions in a curve.
>
> I tend to use Gaussian equation, which works well so far. But it seems
> that sometimes, I get undesired results, like the one from the attached
> macro, where -995468 as curve point is calculated.
>
> Is this an error/bug?
>
> If this is correct, what would you suggest to avoid/remove such deflections?
>
> Just out of curiosity. Is there an equation or function to interpolate
> only missing positions of a curve, where each given point is kept?
>
> Any help/hint is much appreciated.
>
> Kind regards,
> Rainer
>
>
--
ImageJ mailing list: http://imagej.nih.gov/ij/list.html

Attachment: GaussianFit.png (12K)