Crowdsourcing Medical Diagnosis

From P2P Foundation


From an online debate synthesized by Jeff Howe [1]:

Alpheus Bingham—founder of InnoCentive:

"Of course, it goes without saying that many 'portions' of the overall healthcare process (including research objectives) can and should be crowdsourced for a host of reasons that would improve outcome and quality," he writes. As most of the readers of this blog realize, this is an essential component of what companies like InnoCentive, NineSigma and YourEncore do — i.e. broadcast research problems to large groups of experts and non-experts alike in the hopes of unearthing novel solutions. However, "the specific issue of crowdsourcing diagnosis or treatment is far less clear."

On the one hand, the numbers would seem to favor such a project. Here's Alph:

- "An expert (say, right 95% of the time) is wrong 5% of the time. An amateur might be wrong 20% of the time, but the chance that two amateurs are both wrong is only 20% × 20%, or 4%. So two "informed" amateurs consistently reading a scan or collection of lab results have a pretty good record. Hmm, seems like a clear case FOR the crowd..."
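Bingham's multiplication can be spelled out in a few lines. The error rates are his illustrative figures, and the load-bearing assumption, worth flagging, is that the two amateurs make their mistakes independently of each other:

```python
expert_error = 0.05   # an expert: right 95% of the time
amateur_error = 0.20  # an informed amateur: right 80% of the time

# If their mistakes are independent, both amateurs being wrong
# requires two independent 20% events:
both_wrong = amateur_error * amateur_error
print(round(both_wrong, 2))  # 0.04 -- a 4% miss rate, edging out the expert's 5%
```

If the amateurs' errors are correlated (say, both misled by the same artifact in the scan), the product no longer holds and the advantage shrinks.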

Ah, but the numbers can deceive:

- "Of course, as the number of semi-professionals or informed amateurs goes up, the chance of them ALL being wrong goes down, but the chance of getting mixed diagnoses goes up very fast. (73% say "benign" and 27% say "malignant.") What to do then? Majority rules? Supermajority required?"

When is the "vote" compelling enough to stake your treatment on it? It would be well outside the scope of a blog comment to delve deeply into this. But let's briefly return to our two amateurs. Both being wrong happens only 4% of the time; given that the two agree at all, the chance they are right is 64/68, or roughly 94%. But how often do they agree? If they are looking at a cancerous scan, with an 80% individual accuracy rate, they agree on the cancer diagnosis only 64% of the time. They split opinions 32% of the time (that's a lot, and a very confusing state of affairs, since either one is equally likely to be right and you don't know whom to believe), and they both get it wrong only 4% of the time (as we said already). There may well be sophisticated statistical analyses that could supplement such crowdsourcing approaches -- BUT -- whether it is 'experts' or 'crowds' or 'crowds of experts', ambiguity will remain when dealing with judgement calls. Our penchant for certainty is just not going to get fully satisfied.
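The two-amateur breakdown above, and the majority-rule question it raises, can be checked with a short script. The 80% accuracy figure and the independence of errors are the post's illustrative assumptions, not clinical data:

```python
from math import comb

p = 0.8  # each reader's accuracy (the post's illustrative figure)

# Two independent readers looking at a cancerous scan:
both_right = p * p            # 0.64 -- they agree, correctly
split      = 2 * p * (1 - p)  # 0.32 -- mixed diagnoses, no way to choose
both_wrong = (1 - p) ** 2     # 0.04 -- they agree, wrongly
assert abs(both_right + split + both_wrong - 1.0) < 1e-12

# Conditioned on the pair agreeing at all, the chance they are right:
print(round(both_right / (both_right + both_wrong), 3))  # 0.941

# Scaling up: with n readers, how often does a strict majority get it
# right, and how often is the panel non-unanimous ("73% say benign...")?
def majority_correct(n, p):
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n // 2 + 1, n + 1))

def non_unanimous(n, p):
    return 1 - p**n - (1 - p)**n

for n in (3, 11, 101):
    print(n, round(majority_correct(n, p), 3), round(non_unanimous(n, p), 3))
```

The trend matches Bingham's worry: as n grows, a strict majority is almost always right, but a unanimous verdict becomes vanishingly rare, so some vote-counting rule has to be chosen.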

This is, to some degree, the situation we're already in (albeit with very small crowds) when several professionals are asked to interpret medical data (such as an MRI scan): Ambiguity. That said, in the end Alph makes a strong case for the exploration of a crowdsourcing effort that would entice the semi-professionals to engage in such a project: "Total non-experts (the masses referred to in the post) do NOT help matters as their input is just noise -- obscuring a signal. But the crowd of semi-experts could well be, in my opinion, desirable, and we should investigate appropriate systems and knowledge aggregation tools for its exploitation."

Finally, Daniel Reda and Alexandra Carmichael, the co-founders of the very promising CureTogether, both posted useful comments. Here's Daniel:

- "What would happen if you crowdsourced interpretation or even diagnosis? Well, the consensus interpretation of 100 amateurs on your MRI would probably not be at all helpful. What you'd want is a method to select the best interpretations and have them bubble up to the top. How do you select the best interpretations? One way is to keep historical data on how accurate those predictions were once more data became available. Ideally we'd gather data on doctors' performance as well. It's not about credentials - it's about accuracy. If the doctors don't want to participate, then their judgments will look progressively weaker compared with those of a supposed amateur who was proven to be correct 99% of the time on thousands of MRI interpretations."
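Reda's "it's not about credentials, it's about accuracy" idea can be sketched as a track-record-weighted vote. Everything here is hypothetical (the function name and the data model of (call, historical accuracy) pairs are illustrative); weighting each reader by the log-odds of their past accuracy is one standard way to combine independent judgments:

```python
from math import log

def weighted_diagnosis(votes):
    """votes: list of (call, track_record) pairs, where call is
    "benign" or "malignant" and track_record is that reader's
    historical accuracy in (0, 1). Hypothetical data model: the
    accuracies would come from Reda's proposed log of how each
    reader's past interpretations held up against later data."""
    score = 0.0
    for call, acc in votes:
        # Log-odds weight: a 0.99 reader counts far more than a 0.7 one,
        # regardless of either reader's credentials.
        w = log(acc / (1 - acc))
        score += w if call == "malignant" else -w
    return "malignant" if score > 0 else "benign"

# A proven 0.99 amateur outweighs two 0.7 readers who disagree:
print(weighted_diagnosis([("malignant", 0.99),
                          ("benign", 0.7),
                          ("benign", 0.7)]))  # malignant
```

This is the "bubble up" mechanism in miniature: the headcount of the crowd matters less than the verified track record behind each vote.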
