Hello everyone,
I'm working with curve fitting, to smooth data but in the first place to fill invalid positions in a curve.

I tend to use Gaussian equation, which works well so far. But it seems that sometimes, I get undesired results, like the one from the attached macro, where -995468 as curve point is calculated.

Is this an error/bug?

If this is correct, what would you suggest to avoid/remove such deflections?

Just out of curiosity. Is there an equation or function to interpolate only missing positions of a curve, where each given point is kept?

Any help/hint is much appreciated.

Kind regards,
Rainer

--
Rainer M. Engel, Dipl. Digital Artist
endime|ENGEL DIGITAL MEDIA
Pichelsdorferstr. 143
D-13595 Berlin

--
ImageJ mailing list: http://imagej.nih.gov/ij/list.html

Attachment: IJ_CurveFitting-eq-12.ijm (5K)
Hi Rainer,
> I tend to use Gaussian equation, which works well so far. But it seems
> that sometimes, I get undesired results, like the one from the
> attached macro, where -995468 as curve point is calculated.

After analyzing your case, I am quite confident that this is not a bug. The reason for the "strange" fit result is the following:

Your data set is essentially a flat curve with a few outliers at significantly more negative values than the rest. There are many equivalent solutions for a best fit. All of them are very narrow Gaussians that are essentially constant everywhere except at one outlier point, which is fitted very well by the Gaussian.

With just one outlier point defining the Gaussian, the curve fitter can put the peak of the Gaussian essentially anywhere in between the surrounding data points; the peak does not necessarily have to be exactly at the outlier point. If the peak position is a bit off the outlier point, that point will be somewhere on the slope of the Gaussian. If the point happens to be rather far from the peak of the Gaussian, the Gaussian has to be very high to accurately fit the outlier point.

With your data set, the outlier is at 23, and the CurveFitter happens to put the peak of the Gaussian at 23.65, which is still far from the adjacent points. I have attached a high-resolution plot. It shows that the outlier point at 23 lies exactly on the fitted curve, and that it makes no difference for the adjacent points whether one shifts the Gaussian a bit to the left or right (but this would make a huge difference for the Gaussian's height).

By the way, there is also another local solution to your fitting problem, which is a rather broad peak at c=29.5 with a width of d=14.5, but that one is worse than fitting one of the outliers by putting a very sharp Gaussian there (sum of residuals squared = 1763.3 vs. 1559.8, correlation coefficient 0.043 vs. 0.117).
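To make the height argument concrete, here is a small Python sketch (not ImageJ code). It assumes the four-parameter Gaussian form y = a + (b - a) * exp(-(x - c)^2 / (2 d^2)), with baseline a, peak value b, center c and width d. The baseline, the outlier value and the width are made-up illustration values; only the outlier position 23 and the peak position 23.65 come from the discussion above. The helper name peak_value_through_point is hypothetical.

```python
import math


def gaussian(x, a, b, c, d):
    """Gaussian with baseline a, peak value b, center c and width d (assumed form)."""
    return a + (b - a) * math.exp(-((x - c) ** 2) / (2.0 * d * d))


def peak_value_through_point(x0, y0, a, c, d):
    """Peak value b needed so that the Gaussian passes exactly through (x0, y0)."""
    attenuation = math.exp(-((x0 - c) ** 2) / (2.0 * d * d))
    return a + (y0 - a) / attenuation


# Illustration values (assumptions, not the data from the attached macro)
baseline = 0.0             # flat part of the curve
x_out, y_out = 23.0, -20.0  # single outlier point
width = 0.2                # a very narrow Gaussian

# Shift the peak away from the outlier and see how the required height grows
for center in (23.0, 23.3, 23.65):
    b = peak_value_through_point(x_out, y_out, baseline, center, width)
    print(f"peak at c={center:5.2f}: required peak value b = {b:12.1f}")
```

With the peak exactly on the outlier, the peak value equals the outlier value. The further the peak is shifted away, the larger the peak value must become to still hit the outlier, while the fit at distant points barely changes.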
To avoid such problems, one could think of doing a fit with a 'penalty function' for such high values, e.g., something in the sense of maximum entropy methods, but this is not implemented in ImageJ and it would be rather difficult to do so.
https://en.wikipedia.org/wiki/Principle_of_maximum_entropy

By the way, the fitting process is based on minimizing the least-squares deviation with the Nelder-Mead method, which is rather robust even for badly conditioned minimization problems.
https://en.wikipedia.org/wiki/Nelder%E2%80%93Mead_method
In addition, for more robust fitting, it eliminates up to two parameters by linear regression (in the case of the Gaussian, the baseline and the height, so only two fit parameters remain to be determined by the Nelder-Mead method).

---

In your case, maybe you could detect whether the r-squared of the Gaussian fit is too bad (say, below 0.4), or whether the width ('d' parameter) of the Gaussian is too narrow, and if so, use a different fit function such as a fourth-order polynomial? It won't give you a much better fit, but it can't run into the problem of a Gaussian trying to catch a single outlier point.

---

The second of your questions is easier to answer:

> Just out of curiosity. Is there an equation or function to interpolate
> only missing positions of a curve, where each given point is kept?

You can do linear interpolation
https://en.wikipedia.org/wiki/Linear_interpolation
cubic interpolation, splines, etc.
https://en.wikipedia.org/wiki/Spline_(mathematics)

In ImageJ, if you have equidistant points, simply create an image with a height of one pixel and the data as pixel values. Then you can use the macro function
getPixel(x, 0)
to get the interpolated value (linear by default; call setOption("bicubic", true) for cubic interpolation).

ImageJ uses splines for 'rounding' segmented line ROIs, but I am not aware of a macro interface for its built-in SplineFitter. One could probably access it via JavaScript.
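As an illustration of the linear-interpolation option for filling only the missing positions, here is a minimal Python sketch (not ImageJ code). It assumes equidistant points and that invalid positions are marked with NaN, which is an assumption about the data layout. The function name fill_missing_linear is hypothetical. Every valid point is kept unchanged; gaps at the start or end are filled with the nearest valid value, which is one possible choice among several.

```python
import math


def fill_missing_linear(values):
    """Return a copy of values with NaN entries filled by linear interpolation.

    Valid entries are kept exactly as given. Leading and trailing gaps are
    filled with the nearest valid value (an arbitrary choice for this sketch).
    """
    valid = [i for i, v in enumerate(values) if not math.isnan(v)]
    if not valid:
        raise ValueError("no valid points to interpolate from")

    filled = list(values)

    # Gaps before the first and after the last valid point
    for i in range(valid[0]):
        filled[i] = values[valid[0]]
    for i in range(valid[-1] + 1, len(values)):
        filled[i] = values[valid[-1]]

    # Gaps between two neighbouring valid points
    for left, right in zip(valid, valid[1:]):
        for i in range(left + 1, right):
            t = (i - left) / (right - left)
            filled[i] = values[left] + t * (values[right] - values[left])

    return filled


nan = float("nan")
curve = [1.0, nan, 3.0, nan, nan, 6.0]  # made-up example data
print(fill_missing_linear(curve))
```

Unlike a least-squares fit, this cannot overshoot: every filled value lies between its two valid neighbours.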
Michael

Attachment: GaussianFit.png (12K)
________________________________________________________________

On 24/06/2017 16:37, Rainer M. Engel wrote:
> Hello everyone,
> [...]
Dear Michael,
thank you very much for your exceedingly informative reply. I do not have a strong math background, but I could understand most of it.

I tested your suggestions for both questions. The resulting data array is now checked against the min/max of the source data, so if there is a peak like the one you described, I switch to another equation.

The interpolation via an image is also useful. Maybe I'll incorporate that method too.

Thanks again,
Rainer

On 26.06.2017 at 12:01, Michael Schmid wrote:
> Hi Rainer,
> [...]

--
Rainer M. Engel, Dipl. Digital Artist
endime|ENGEL DIGITAL MEDIA
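The min/max check described in this reply could look roughly like the following Python sketch (not the actual macro, which is not shown in the thread). The function names, the tolerance parameter and the example arrays are hypothetical; only the runaway value -995468 is taken from the original question. The fitted curves are assumed to be already computed elsewhere and passed in order of preference.

```python
def fit_is_plausible(source, fitted, tolerance=0.0):
    """True if all fitted values stay within the min/max of the source data.

    tolerance widens the allowed range by that fraction of the data range.
    """
    lo, hi = min(source), max(source)
    margin = tolerance * (hi - lo)
    return all(lo - margin <= v <= hi + margin for v in fitted)


def choose_fit(source, candidates, tolerance=0.0):
    """Return the first (name, fitted) pair that passes the range check, else None."""
    for name, fitted in candidates:
        if fit_is_plausible(source, fitted, tolerance):
            return name, fitted
    return None


# Made-up example: a Gaussian fit that runs away at one position,
# and a polynomial fit that stays within the data range
source = [10.0, 11.0, -5.0, 10.5, 9.8]
gaussian_fit = [10.0, 10.2, -995468.0, 10.1, 10.0]
poly_fit = [9.0, 8.5, 7.0, 8.0, 9.5]

result = choose_fit(source, [("gaussian", gaussian_fit), ("poly4", poly_fit)])
print(result[0] if result else "no plausible fit")
```

A range check like this is cheap and catches the single-outlier spike directly; the r-squared or width criteria suggested earlier in the thread could be added as further conditions in the same place.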