Posted by gankaku
URL: http://imagej.273.s1.nabble.com/ImageJ-2-0-0-rc-11-released-tp5009074p5009119.html
Hi Mark, all,
If I may also comment on this topic... I see it similarly to Gabriel. If I
had not paid attention to the release note from Curtis and to Gabriel's
first mail, I would actually have missed this whole issue. I guess a lot of
users who do not follow the mailing list are thus not informed about the
"data" collection. I see that the collection might have some general
advantage concerning future funding etc., so I would not argue for a
complete removal of the feature. For me, it would for example also be
interesting to see how often the tools I created were used!
Anyway, nowadays more and more of our "private" data are collected and sent
somewhere by all our electronic devices (especially smartphones...). But
one needs to keep in mind that tools created with the best intentions and
without a hidden agenda might, simply because they exist, be misused by
malicious parties. I am not assuming that this can or will happen in this
specific case.
Therefore, I would also suggest making the tool opt-in, with an initial
popup message after the first run of Fiji/IJ2, as for SCIFIO. This
potentially "interrupts" the first workflow, but I think this is not a big
issue when you see it only once in the lifetime of your current Fiji/IJ2
version. The updater actually does the same with the 3 options
"update/remind me later/never", which seems to me a good model for
your tool. Then everybody is informed and can make a sound decision for
her-/himself. Obviously, this will result in a drop of N in your statistics,
but I guess you will still get enough people who contribute.
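To illustrate what I mean, here is a minimal sketch of the opt-in logic
(in Python for brevity; all names are made up for illustration and are not
part of Fiji/IJ2 or its updater):

```python
from enum import Enum
from typing import Optional


class Consent(Enum):
    """Hypothetical answers to a first-run usage-statistics dialog."""
    YES = "yes"
    LATER = "remind me later"
    NEVER = "never"


def should_prompt(stored: Optional[Consent]) -> bool:
    """Ask on first run (nothing stored yet) or if the user postponed."""
    return stored is None or stored is Consent.LATER


def may_upload(stored: Optional[Consent]) -> bool:
    """Opt-in: upload only after an explicit 'yes'."""
    return stored is Consent.YES
```

The point is that "no answer yet" and "remind me later" both mean that
nothing is uploaded, so silence is never treated as consent.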
Just out of curiosity, how was the map that has existed for a long time
(http://fiji.sc/Fiji_Usage) actually created without such a "collection"
tool (download statistics?)?
Additionally, the list of collected data is only briefly mentioned in
parentheses on the release page. Is this a comprehensive list or only
a set of examples? It would be nice to show the users exactly which data
are collected.
Performance-wise, I had the impression that it really makes Fiji slower
(without being able to put this in numbers). Therefore, minimizing data
collection and transfer (e.g. transferring only at the end of a session)
should be the goal. IJ2 is designed for dealing with big data, which need
far more computing power (I do not know exactly how it manages to process
larger amounts faster, or whether this is affected by the collection), so
slowing it down with a background process like the data collection would
not be advantageous.
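As a rough sketch of the "only at the end of a session" idea (again in
Python, with hypothetical names; I do not know how IJ2 actually implements
the upload, and the `send` function is an assumed placeholder):

```python
from collections import Counter
from typing import Callable, Dict


class UsageBuffer:
    """Hypothetical in-memory buffer; not the actual IJ2 implementation."""

    def __init__(self, send: Callable[[Dict[str, int]], None]) -> None:
        self._counts: Counter = Counter()
        self._send = send  # assumed upload function, supplied by the caller

    def record(self, tool: str) -> None:
        # Counting locally causes no network traffic during the session.
        self._counts[tool] += 1

    def flush(self) -> None:
        # Called once when the session ends; sends nothing if nothing was used.
        if self._counts:
            self._send(dict(self._counts))
            self._counts.clear()
```

With something like this, the network is touched once per session instead
of at regular intervals while the user is working.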
I think this is a topic where you will never reach a consensus (not
even inside this very e-mail ;-) ), but to conclude: informing the
user and presenting options is preferable in my opinion.
Besides that, thanks to your team for all the effort and for trying to make
it perform better every day.
Jan
2014-08-11 23:53 GMT+02:00 Gabriel Landini <[hidden email]>:
> On Monday 11 Aug 2014 14:18:11 Mark Hiner wrote:
> > > SCIFIO was opt in, but usage tracking is opt out? It does not make
> > > sense.
> >
> > To be clear, SCIFIO is enabled by default.. you have to uncheck a box to
> > disable SCIFIO, so it is opt out.
>
> Right, but it was impossible to miss as I had to answer the SCIFIO dialog
> when the update came.
> What is the problem in showing a similar dialog and letting people know
> what will be going on?
>
> > I think there is a difference in the questions "what do you do with the
> > software" and "what do users do with the software". I don't believe we
> > will ever ask the former question.
>
> Mark, what you or I personally *believe* somebody will ask in the future
> does not matter. What matters is the process of getting informed consent
> for the data collection; IJ2 assumes consent and makes this less obvious
> than it could be.
>
> > we can ask:
> > "How many times was Bio-Formats used with Java 7"?
> > we *can not* ask:
> > "how many times did Gabriel Landini run Bio-Formats?"
>
> Even with my poor knowledge of network traffic I can imagine that it might
> be trivial to script something using time stamps and IP addresses of the
> uploading machine, together with the many emails that also reveal users'
> IP addresses.
> Not that I remotely think that the devel team would have the time or
> inclination to do this, but if we are talking about what is impossible, I
> suspect it is not. So whether that is potentially identifiable information
> is probably debatable. If it is, then you would effectively be logging
> their location in a database every hour (!) IJ2 runs. Doesn't that sound a
> bit creepy?
>
> My issue was (and remains) that users need to be fully informed about data
> collection before it takes place; it should not be On by default.
>
> > > If this happens to be something people want to adhere to, then there is
> > > nothing to worry about as there will be lots of users opting in when
> > > given the chance.
>
> > I believe this is actually hard to predict.
>
> Ask the users in a similar way SCIFIO was done and you will have the
> answer. Then we would not be having this conversation.
>
> > If usage statistics were presented similarly - with a pop up on launch
> > and an options menu - my expectations for opt-in numbers would be very
> > low. Not because people don't want to contribute but because we created
> > a barrier to the process.
>
> The issue that does not seem to stick after all this typing is that IJ2
> should not make that decision for the users. IJ2 is not the owner of the
> processes happening in a user's computer. You need to ask, not assume,
> that people will be happy for their computers to contact a database every
> hour and let it know they are there and doing this or that.
>
> > A more successful alternative might be, when statistics are actually
> > being uploaded, to display a dialog asking to proceed or not - with
> > yes/no/don't ask me again options. That sounds promising, but also
> > potentially annoying or confusing to get that pop up, and we can still
> > expect statistics reporting to drop.
>
> But if the reporting statistics drop, that would have to do. Make estimates
> instead of collecting all possible data.
>
> > So since we are not sending or storing user-specific data, and provided
> > and publicized the opt-out mechanism, we decided to go with the option
> > that was un-disruptive at the workflow level and maximized data
> > collection.
>
> Yes, you said that before, and I am sure I am not alone in thinking it is
> not the desirable way of doing it.
>
> > Especially given, as you mentioned, that users ultimately need to agree
> > to communicate with an external server to download these applications
> > and updates.
>
> But there is an obvious difference between the two situations. One is
> requesting an update. The other is broadcasting to a database.
>
> > I hope it's clear that I am not saying we are unwilling to change how
> > permissions are exposed.. but if we can circumvent that need via
> > discussion it would certainly be my preference. And if we do end up
> > making any changes, I would like them to be as minimally damaging to the
> > quality of the data gathering as possible.
>
> It sounds like it is preferable not to ask people about the data collection.
> That is in my view an error of judgement that can be resolved easily.
>
> > To me, there has to be actual user data being exposed to be a matter of
> > privacy. Can you clarify what you believe to be the concern here?
>
> Sure: that the process of collecting usage data is not made clear from the
> beginning and it should have informed consent before the collection starts.
>
> If there are no privacy issues, why is the function to switch it Off called
> "Privacy"?
>
> Regards
>
> Gabriel
>
> --
> ImageJ mailing list: http://imagej.nih.gov/ij/list.html
--
CEO: Dr. rer. nat. Jan Brocher
phone: +49 (0)6234 917 03 39
mobile: +49 (0)176 705 746 81
e-mail: [hidden email]
info: [hidden email]
inquiries: [hidden email]
web: www.biovoxxel.de