Re: ImageJ 2.0.0-rc-11 released

Posted by Mark Hiner-2 on
URL: http://imagej.273.s1.nabble.com/ImageJ-2-0-0-rc-11-released-tp5009074p5009103.html

> SCIFIO was opt in, but usage tracking is opt out? It does not make sense.
>

To be clear, SCIFIO is enabled by default; you have to uncheck a box to
disable it, so it is opt out. We publicized the SCIFIO toggle from
within the ImageJ UI because it can affect the actual operation of ImageJ
by producing different (and/or incorrect) results. It is also a beta product
that is consumed through a complicated compatibility layer to interact
with ImageJ 1.x classes.

> but logging plugin usage statistics in individual computers (i.e. what you
> do with the software?) without user agreeing seems intrusive and over the
> top.

I think there is a difference between the questions "what do you do with the
software" and "what do users do with the software". I don't believe we will
ever ask the former question.

With the way our database was designed, literally just storing plugin
execution counts (and not parameters),

we can ask:
 "How many times was Bio-Formats used with Java 7?"

we *cannot* ask:
 "How many times did Gabriel Landini run Bio-Formats?"
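
As a rough illustration of why the second question cannot be answered, here
is a minimal Python sketch of a counts-only store. It is not the actual
imagej-usage schema: the class name UsageStats, its methods, and the choice
of (plugin name, Java version) as the key are all assumptions made for the
example. The point is only that once identity and parameters are dropped at
recording time, no later query can recover them.

```python
from collections import Counter


class UsageStats:
    """Hypothetical aggregate store: execution counts only.

    Assumption for illustration: each count is keyed by plugin name and
    Java version. No user identity and no plugin parameters are kept.
    """

    def __init__(self):
        self._counts = Counter()  # key: (plugin_name, java_version)

    def record(self, plugin, java_version, user=None, params=None):
        # `user` and `params` are accepted only to show that they are
        # discarded: nothing but the aggregate counter is updated.
        self._counts[(plugin, java_version)] += 1

    def count(self, plugin, java_version=None):
        # Answers "how many times was <plugin> used (with <java_version>)?"
        return sum(
            n
            for (p, j), n in self._counts.items()
            if p == plugin and (java_version is None or j == java_version)
        )

    def fields(self):
        # The only columns that exist; there is no per-user column to query.
        return ("plugin", "java_version", "count")
```

With such a store, count("Bio-Formats", "1.7") is answerable, while a
per-user question has no field to filter on.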

> Logging updates and downloads is one thing that could be quoted when
> applying for funding

The value of this information is questionable. If this were just the core
imagej.jar, that would probably be sufficient. But we're talking about a
significant number of components that make up Fiji, made potentially
endless via update sites and plugins. Downloads and updates just don't tell
you the whole picture.

Using Bio-Formats as an example:

Bio-Formats 5.0.2 currently ships with Fiji. But I can't say that the
number of downloads of Fiji represents use of Bio-Formats by the scientific
imaging community. I can't say the number of times a Bio-Formats .jar is
updated via the Fiji updater is representative of use either. Those numbers
aren't bad, but they don't really mean anything.

But now, with usage statistics, we can look at the number of times the
Bio-Formats importer was used, or the number of times File > Open was used
with SCIFIO and the Bio-Formats plugin. This at least gives us an
approximate picture of how our software is being used.

> If this happens to be something people want to adhere to, then there is
> nothing to worry about as there will be lots of users opting in when given
> the chance.

I believe this is actually hard to predict. We put in what we thought was
an appropriate system of warnings and explanations for when and how to
disable SCIFIO, for example, and failed miserably - resulting in
frustration and confusion for users and a wonderful spike in bug reporting.

If usage statistics were presented similarly - with a pop-up on launch and
an options menu - my expectations for opt-in numbers would be very low. Not
because people don't want to contribute, but because we would have created
a barrier to the process.

A more successful alternative might be, when statistics are actually being
uploaded, to display a dialog asking to proceed or not - with yes/no/don't
ask me again options. That sounds promising, but also potentially annoying
or confusing to get that pop up, and we can still expect statistics
reporting to drop.
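
For concreteness, the yes/no/don't-ask-me-again flow could look something
like the following Python sketch. Nothing here exists in ImageJ: the names
Pref and decide_upload are hypothetical, and treating "don't ask me again"
as "remember whichever answer was just given" is my assumption about how
such a dialog would behave.

```python
from enum import Enum


class Pref(Enum):
    ASK = "ask"        # no remembered answer; prompt at upload time
    ALWAYS = "always"  # user answered yes and chose "don't ask me again"
    NEVER = "never"    # user answered no and chose "don't ask me again"


def decide_upload(pref, prompt):
    """Return (upload_now, new_pref).

    `prompt` stands in for the dialog. It is called only when `pref` is
    Pref.ASK and must return (answer, remember) as two booleans.
    """
    if pref is Pref.ALWAYS:
        return True, pref
    if pref is Pref.NEVER:
        return False, pref
    answer, remember = prompt()
    if remember:
        # Assumption: "don't ask me again" stores the answer just given.
        pref = Pref.ALWAYS if answer else Pref.NEVER
    return answer, pref
```

Every upload made while the preference is still ASK interrupts the user
once, which is the annoyance described above.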

So since we are not sending or storing user-specific data, and have provided
and publicized the opt-out mechanism, we decided to go with the option that
was non-disruptive at the workflow level and maximized data collection,
especially given, as you mentioned, that users ultimately need to agree to
communicate with an external server to download these applications and
updates.

> Good, so please use this as an opportunity to change the way the program
> gets user permission to do it.

I hope it's clear that I am not saying we are unwilling to change how
permissions are exposed, but if we can avoid that need via discussion
it would certainly be my preference. And if we do end up making any
changes, I would like them to be as minimally damaging to the quality of
the data gathering as possible.

> Given privacy (or the lack of...) is such a hot topic at the moment it
> would seem appropriate to do things right.

From what I know of privacy issues, they primarily concern the
leaking of things like e-mails, passwords, usernames or credit card
numbers. As we are only storing numeric counts tied to plugin names and the
previously-mentioned information, there is nothing to leak.

To me, there has to be actual user data being exposed to be a matter of
privacy. Can you clarify what you believe to be the concern here?

> Incidentally I also found that there seems to be no way of recording in
> the macro language the Privacy Opt Out option. Can this be added too so it
> can be included in the StartupMacros file?

Would you mind opening an issue on the imagej-usage GitHub page:
https://github.com/imagej/imagej-usage/issues ?

I hope this helped clarify our thought process and reasoning a bit.

Thanks,
Mark

--
ImageJ mailing list: http://imagej.nih.gov/ij/list.html