
Re: fitting y=f(x) data to arbitrary functions in ImageJ?

Posted by Kenneth Sloan-2 on Mar 17, 2018; 9:09pm
URL: http://imagej.273.s1.nabble.com/fitting-y-f-x-data-to-arbitrary-functions-in-ImageJ-tp5020273p5020299.html

I agree - there are lots of possible models.  This was an attempt to directly replicate (with slightly different inputs) previous work in the literature, so the first task was to exactly match the model used there.  It looks like I can check off that box and move on to the next step.

DoG is a favorite for most folk who learned neuroscience or human vision in the 1970s.  So, it's a perfectly respectable starting point.  But...it's just a starting point.

One problem with this model is the implicit assumption that there ought to be some symmetry.  In this example, it's quite clear that the two Gaussians have different means (and CurveFitter confirms this).  One thing I've been thinking about is fitting various models to only HALF the curve.  Once this choice is made, notice that the possible models get dramatically simpler.  My first thought is simply a cubic, perhaps with the derivative constrained to be 0.0 at the center point - perhaps clipped at the point where the derivative once again becomes 0.0.  We're mostly interested in the central "pit".
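To make that concrete, here is what the constrained cubic might look like as a CurveFitter UserFunction (a sketch, untested; the class and method names below are mine).  Writing the cubic in powers of (x - x0) bakes the zero-derivative constraint into the model instead of imposing it during the fit:

   import ij.measure.CurveFitter;
   import ij.measure.UserFunction;

   /** Cubic with zero slope at a known center x0:
       f(x) = a + b*(x-x0)^2 + c*(x-x0)^3, so f'(x0) = 0 by construction. */
   public class HalfCurveCubic implements UserFunction {
      private final double x0;

      public HalfCurveCubic(double x0) { this.x0 = x0; }

      public double userFunction(double[] p, double x) {
         double d = x - x0;
         return p[0] + p[1]*d*d + p[2]*d*d*d;   // p = {a, b, c}
      }

      /** Fit one half of the curve; x and y are already clipped at x0. */
      public static double[] fitHalf(double[] x, double[] y, double x0) {
         CurveFitter cf = new CurveFitter(x, y);
         cf.doCustomFit(new HalfCurveCubic(x0), 3, "", null, null, false);
         return cf.getParams();
      }
   }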
 
But...and this is a big but...people sometimes confuse the measurement model with a model of causality.  The main point of this fitting is to find consistent measurements for the 5 points shown.  They lead to descriptors such as the DEPTH and WIDTH of the central "pit".  Depending on your taste, it might be argued that the plot demonstrates two different methods that produce highly similar results.  The question might become - in less "normal" examples, where the two methods (the GREEN and the RED) differ, which one should you believe?  And also - how significant are the differences in this one example, compared to the distribution of results in a large population?
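To be concrete about the descriptors, here is one plausible definition (my guess - not necessarily the one used in the previous literature), with (ax, ay) .. (ex, ey) the coordinates of points A..E:

   double depth = 0.5*(ay + ey) - cy;   // pit floor below the mean shoulder height
   double width = ex - ax;              // shoulder-to-shoulder extent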

On the other hand, the negative Gaussian may turn out to be what we really want to measure, perhaps considering the broader positive Gaussian as "noise".  That would make the DoG model very relevant.

Finally, there are biological justifications for using a model like the DoG, based on mechanisms that might have given rise to the original data.  Other models might fit the raw data better, but they might not have as much explanatory power.

As you say - more curves!  I'm about to have a fairly large dataset to play with, so that's coming.
Again, one motivation for today is to be prepared to produce answers for this new dataset that are
compatible with previously reported literature.

Given my druthers, I'm pretty happy with simply smoothing the raw data and directly measuring the 5 key points.  There are minor difficulties having to do with the fact that the answers are not necessarily unique - but these are tractable.  My primary motivation in fitting the DoG was to match previous methods.
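For what it's worth, the direct search is essentially the following (a sketch: it assumes a single central pit, and it omits the tie-breaking needed when the extrema are not unique):

   /** Indices of the five key points {A, B, C, D, E} in sampled data t.
       C (global min) is found first; it restricts the search ranges for
       B and D (extrema of the forward difference), which in turn restrict
       the ranges for A and E (the flanking maxima). */
   public static int[] keyPoints(double[] t) {
      int c = 0;
      for (int i = 1; i < t.length; i++)
         if (t[i] < t[c]) c = i;
      int b = 1, d = c + 1;
      for (int i = 1; i < c; i++)
         if (t[i] - t[i-1] < t[b] - t[b-1]) b = i;
      for (int i = c + 1; i < t.length; i++)
         if (t[i] - t[i-1] > t[d] - t[d-1]) d = i;
      int a = 0, e = t.length - 1;
      for (int i = 0; i < b; i++)
         if (t[i] > t[a]) a = i;
      for (int i = d; i < t.length; i++)
         if (t[i] > t[e]) e = i;
      return new int[]{a, b, c, d, e};
   }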

My congratulations to the authors of CurveFitter.  It was easy to use (given your guidance), and is clearly capable of handling MUCH more difficult tasks than this one.  It did just fine, with absolutely zero hints.
Thanks again for your help.

--
Kenneth Sloan
[hidden email]
Vision is the art of seeing what is invisible to others.





> On 17 Mar 2018, at 14:05 , Michael Schmid <[hidden email]> wrote:
>
> Hi Kenneth,
>
> looking at your curves, I have the impression the 'negative' peak does not look like a Gaussian - its peak is sharper and the tails are wider than for a Gaussian.
> If this is typical for your data, you might try other shapes like a Lorentzian,
>  y = a/(1+(x-c)^2/b^2)
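> As a CurveFitter UserFunction that would be roughly (a sketch, untested):
>
>  public double userFunction(double[] p, double x) {
>     double d = (x - p[2])/p[1];   // p = {a, b, c}
>     return p[0]/(1 + d*d);
>  }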
> If the Lorentzian is too sharp and its tails are too strong, you could also try other functions.
> Both the inflection points and the maxima will change depending on the fit function, so it makes sense to try several and see which fits best.
> (Obviously, this should be done with many curves, not just one).
>
> Michael
> ________________________________________________________________
>
>
> On 2018-03-17 19:00, Kenneth Sloan wrote:
>> Here's a progress report, following up on all the good advice I
>> received on CurveFitter.  Again, my thanks to all who replied.  It
>> turned out to be simple as pi.
>> The attached plot is kind of busy (it's mostly for debugging), but it
>> makes the point.
>> There is a sampled function t = f(x) drawn in BLACK.  It is mostly
>> obscured by a smoothed version
>> of the same data, drawn over it in RED.  The DoG curve is drawn in GREEN.
>> I'm lazy, so I did the minimal amount of work necessary to start off
>> with, planning on adding tweaks as needed.  The GREEN curve is the
>> result of asking CurveFitter to converge with no hints whatsoever.
>> The userFunction has 7 parameters: mu1, sigma1, mu2, sigma2,
>> Amplitude1, Amplitude2, offset.
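>> Spelled out, the userFunction body is essentially this (a sketch, modulo
>> my actual sign conventions):
>>
>>  public double userFunction(double[] p, double x) {
>>     // p = {mu1, sigma1, mu2, sigma2, A1, A2, offset}
>>     double g1 = Math.exp(-0.5*(x-p[0])*(x-p[0])/(p[1]*p[1]));
>>     double g2 = Math.exp(-0.5*(x-p[2])*(x-p[2])/(p[3]*p[3]));
>>     return p[6] + p[4]*g1 - p[5]*g2;
>>  }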
>> I'm very pleased with the result, so I think I'll just accept this
>> result and move on!
>> The BLACK circles, and the RED and GREEN arrows show the locations of
>> 5 key points on the curve.
>> They are, from left to right:
>> A - max(t)
>> B - min(dt/dx)
>> C - min(t)
>> D - max(dt/dx)
>> E - max(t)
>> Point C is found first, restricting the ranges for B and D.  B and D
>> further restrict the ranges of A and E.
>> The BLACK results come from a simple search of the original data.
>> (well, not all THAT simple)
>> The RED results come from the same search, after smoothing and
>> re-sampling the data.
>> The GREEN results come from the same search, after creating a sampled
>> version of the DoG model.
>> I told you I was lazy - I should use calculus to find the key points
>> on the DoG model, but the explicit search of a sampled signal was
>> already written, so...  Note that a major plus of the DoG model is
>> that these points are all unique.
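>> (For reference: each term A*exp(-(x-mu)^2/(2*sigma^2)) has derivative
>> -A*(x-mu)/sigma^2 * exp(-(x-mu)^2/(2*sigma^2)), so the extrema of the
>> DoG are the roots of a sum of two such terms - no closed form in
>> general, but easy to root-find numerically.)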
>> I'm a bit disappointed that the smoothing didn't do more to the curve
>> - I used a Gaussian kernel that was only 7 wide (which I lifted from a
>> Google search).  Time to bite the bullet and remember how to generate
>> larger kernels.  I am pleased to note that the smoothing did fix one
>> obviously wrong answer. Now, the question is whether to believe the
>> RED or the GREEN answers.  Moving on!
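>> Generating a bigger kernel is simple enough, for the record - a sketch,
>> assuming a radius of 3*sigma is sufficient support:
>>
>>  // normalized 1-D Gaussian kernel, radius ceil(3*sigma)
>>  public static double[] gaussianKernel(double sigma) {
>>     int r = (int)Math.ceil(3*sigma);
>>     double[] k = new double[2*r + 1];
>>     double sum = 0;
>>     for (int i = -r; i <= r; i++)
>>        sum += k[i + r] = Math.exp(-0.5*i*i/(sigma*sigma));
>>     for (int i = 0; i < k.length; i++)
>>        k[i] /= sum;
>>     return k;
>>  }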
>> --
>> Kenneth Sloan
>> [hidden email]
>> Vision is the art of seeing what is invisible to others.
>>> On 13 Mar 2018, at 18:04 , Kenneth Sloan <[hidden email]> wrote:
>>> Thanks for the replies.  Looks good.
>>> As for reducing the number of parameters - well, yes - except that all the parameters are important.
>>> The good news is that many of them can be estimated well enough to produce very good initial values.
>>> For example, the centers of the two Gaussians are almost certainly very close to each other, and their location is already "known" to pretty good precision.  Amplitudes fall in a narrow range.  Even the spreads have fairly standard values based on previous work.  I'm looking for fine-grained optimization of the values, not diving in with zero a priori information.  And, if the fitter produces centers that deviate too much from the "known" values, that will be a strong signal that the model is inadequate.
>>> On the other hand, it's not at all clear that the data are perfectly symmetric about the center of the two Gaussians - so, I may end up doing one fit to the left, and one to the right.
>>> Mostly, I'm doing this to compare with directly measuring 5 distinct features along the curve.  Previous work fit the function to the data and computed the locations of these features using analytic means (finding the 1 min, 2 max and min&max slope - or, said another way, the min, max, and zero-crossings of the first derivative).  So far, I have had no difficulty extracting these features directly from the raw data.  But, I don't want to count on that always being the case, and I do want to compare newly acquired data with the previous methods.
>>> My suspicion is that the direct method will work at least as well as the curve-fitting method - but I have
>>> to make a legitimate effort to do good curve fits in order to demonstrate that.  Science, you know...
>>> Careful reading of the above will reveal why I'm not an expert in curve fitting techniques.  I really
>>> appreciate the help.  Thanks again.
>>> --
>>> Kenneth Sloan
>>> [hidden email]
>>> Vision is the art of seeing what is invisible to others.
>>>> On 13 Mar 2018, at 14:30 , Michael Schmid <[hidden email]> wrote:
>>>> Hi Kenneth,
>>>> Concerning fitting an 8-parameter function:
>>>> If the fit is not linear (as in the case of a difference of Gaussians), having 8 fit parameters is a rather ambitious task, and there is a high probability that the fit will run into a local minimum or some point that looks like a local minimum to the fitting program.
>>>> It would be best to reduce the number of parameters, e.g. using a fixed ratio between the two sigma values in the Difference of Gaussians.
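>>>> For example (a sketch - the ratio K=1.6 below is just a common choice,
>>>> not a recommendation), a DoG with a shared center and a fixed sigma
>>>> ratio needs only 4 free parameters:
>>>>
>>>>  // p = {mu, sigma, A1, A2}; second sigma fixed at K*sigma
>>>>  static final double K = 1.6;
>>>>  public double userFunction(double[] p, double x) {
>>>>     double d = x - p[0];
>>>>     return p[2]*Math.exp(-0.5*d*d/(p[1]*p[1]))
>>>>          - p[3]*Math.exp(-0.5*d*d/(K*K*p[1]*p[1]));
>>>>  }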
>>>> You also need some reasonable guess for the initial values of the parameters.
>>>> For the ImageJ CurveFitter, if there are many parameters it is very important to specify roughly how much the parameters can vary; these are the 'initialParamVariations'.
>>>> If cf is the CurveFitter, you will have
>>>>  cf.doCustomFit(UserFunction userFunction, int numParams,
>>>>      String formula, double[] initialParams,
>>>>      double[] initialParamVariations, boolean showSettings)
>>>> For the initialParamVariations, use e.g. 1/10th of how much the respective parameter might deviate from the initial guess (only the order of magnitude is important).
>>>> If you have many parameters and your function can be written as, e.g.
>>>>  a + b*function(x; c,d,e...)
>>>> or
>>>>  a + b*x + function(x; c,d,e)
>>>> you should also specify these parameters via
>>>>  cf.setOffsetMultiplySlopeParams(int offsetParam, int multiplyParam, int slopeParam)
>>>> where 'offsetParam' is the number of the parameter that is only an offset (in the examples above, 0 for 'a'); 'multiplyParam' would be 1 for 'b' in the first example above, and 'slopeParam' would be 1 for 'b' in the second type of function above. You cannot have a 'multiplyParam' and a 'slopeParam' at the same time; set the unused one to -1.
>>>> Specifying an offsetParam and multiplyParam (or slopeParam) makes the CurveFitter calculate these parameters via linear regression, so the actual minimization does not include these parameters. In other words, you get fewer parameters, which makes the fitting much more likely to succeed.
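>>>> For instance, for the first form a + b*function(x; c,d,e) above:
>>>>
>>>>  cf.setOffsetMultiplySlopeParams(0, 1, -1);  // 0='a' (offset), 1='b' (multiply), no slope param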
>>>> In my experience, if you end up with 3-4 parameters (not counting the parameters eliminated by setOffsetMultiplySlopeParams), there is a good chance that the fit will work very well, with 5-6 parameters it gets difficult, and above 6 parameters the chance to get the correct result is rather low.
>>>> If you need to control the minimization process in detail, before starting the fit, you can use
>>>>  Minimizer minimizer = cf.getMinimizer()
>>>> to get access to the Minimizer that will be used, and you can use the Minimizer's methods to control its behavior (e.g. allow more steps than the default via minimizer.setMaxIterations, set it to try more restarts, or use different error values for more/less accurate convergence).
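>>>> E.g. (a sketch):
>>>>
>>>>  minimizer.setMaxIterations(10000);  // allow more steps than the default
>>>>  minimizer.setMaxRestarts(5);        // try more restarts before giving up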
>>>> Best see the documentation in the source code, e.g.
>>>> https://github.com/imagej/imagej1/blob/master/ij/measure/CurveFitter.java
>>>> https://github.com/imagej/imagej1/blob/master/ij/measure/Minimizer.java
>>>> -------------
>>>>> Bonus question for Java Gurus:
>>>>> How to declare the user function as a variable, call it,
>>>>> and pass it to another function?
>>>> public class MyFunction implements UserFunction { ...
>>>>   public double userFunction(double[] params, double x) {
>>>>     return params[0] + params[1]*x;
>>>>   }
>>>> }
>>>> public class PassingClass { ...
>>>>   UserFunction exampleFunction = new MyFunction(...);
>>>>   otherClass.doFitting(xData, yData, exampleFunction);
>>>> }
>>>> public class OtherClass { ...
>>>>   public void doFitting(double[] xData, double[] yData,
>>>>       UserFunction userFunction) {
>>>>     CurveFitter cf = new CurveFitter(xData, yData);
>>>>     cf.doCustomFit(userFunction, /*numParams=*/2, null,
>>>>         null, null, false);
>>>>   }
>>>> }
>>>> -------------
>>>> Best,
>>>> Michael
>>>> ________________________________________________________________
>>>> On 13/03/2018 18:56, Fred Damen wrote:
>>>>> Below is a routine to fit MRI Inversion Recovery data for T1.
>>>>> Note:
>>>>> a) CurveFitter comes with ImageJ.
>>>>> b) Calls are made to UserFunction once for each x.
>>>>> c) If your initial guess is not close it does not seem to converge.
>>>>> Bonus question for Java Gurus:
>>>>> How to declare the user function as a variable, call it, and pass it to
>>>>> another function?
>>>>> Enjoy,
>>>>> Fred
>>>>>  private double[] nonlinearFit(double[] x, double[] y) {
>>>>>     CurveFitter cf = new CurveFitter(x, y);
>>>>>     // initial guess: p[0] = T1, p[1] and p[2] from the first/last samples
>>>>>     double[] params = {1, 2*y[0], -(y[0]+y[y.length-1])};
>>>>>     cf.setMaxIterations(200);
>>>>>     // magnitude inversion-recovery model: |p[1] + p[2]*exp(-x/p[0])|
>>>>>     cf.doCustomFit(new UserFunction() {
>>>>>        @Override
>>>>>        public double userFunction(double[] p, double x) {
>>>>>           return Math.abs(p[1] + p[2]*Math.exp(-x/p[0]));
>>>>>        }
>>>>>     }, params.length, "", params, null, false);
>>>>>     //IJ.log(cf.getResultString());
>>>>>     return cf.getParams();
>>>>>  }
>>>>> On Tue, March 13, 2018 11:06 am, Kenneth Sloan wrote:
>>>>>> I have some simple data: samples of y=f(x) at regularly spaced discrete values
>>>>>> for x.  The data
>>>>>> is born as a simple array of y values, but I can turn that into (for example)
>>>>>> a polyline, if that
>>>>>> will help.  I'm currently doing that to draw the data as an Overlay.  The size
>>>>>> of the y array is between 500 and 1000. (think 1 y value for every integer
>>>>>> x-coordinate in an image - some y values may be recorded as "missing").
>>>>>> Is there an ImageJ tool that will fit (more or less) arbitrary functions to
>>>>>> this data?  Approximately 8 parameters.
>>>>>> The particular function I have in mind at the moment is a difference of
>>>>>> Gaussians.  The Gaussians
>>>>>> most likely have the same location (in x) - but this is not guaranteed, and
>>>>>> I'd prefer
>>>>>> to use this as a sanity check rather than impose it as a constraint.
>>>>>> Note that the context is a Java plugin - not a macro.
>>>>>> If not in ImageJ, perhaps someone could point me at a Java package that will
>>>>>> do this.  I am far
>>>>>> from an expert in curve fitting, so please be gentle.
>>>>>> If not, I can do it in R - but I prefer to do it "on the fly" while displaying
>>>>>> the image, and the Overlay.
>>>>>> --
>>>>>> Kenneth Sloan
>>>>>> [hidden email]
>>>>>> Vision is the art of seeing what is invisible to others.

--
ImageJ mailing list: http://imagej.nih.gov/ij/list.html