The Many Promises of AI
We are still discovering where and how AI can be useful. But this fledgling state has not stopped the many assertions of its wonderfulness, or future wonderfulness, or possible wonderfulness. Here is a sample of statements about AI.
With respect to predictions relevant to transplants, we read: “a very promising step to bring years of work on artificial intelligence… to transplant patients.”
For a kidney disease product: “a technology that could help diagnose”.
From a general discussion: “advances in clinical analytics and machine learning have the potential to drive medical discovery at a pace never seen before”, and “regardless of any current uncertainty, the potential of AI is difficult to ignore.”
From another general discussion: “artificial intelligence is poised to become a transformational force in healthcare”, “artificial intelligence may be a welcome addition to the patient engagement and monitoring arenas”, and “using artificial intelligence to enhance the ability to identify deterioration, suggest that sepsis is taking hold, or sense the development of complications can significantly improve outcomes and may reduce costs related to hospital-acquired condition penalties.”
Note the caveat words in each of these quotes: “could help”, “have the potential”, “potential difficult to ignore”, “is poised”, “may be”, “may reduce costs”. The reason for such words and phrases is that we don’t actually know whether AI, either in particular realizations or in general, will do any of these great things, and there are few studies that demonstrate that it can. In this regard, a recent AHRQ webinar on their CDS (clinical decision support) artifact project presented the five rights of CDS (five being a preferred but arbitrary number for short lists). These are the right information, presented to the right person, in the right format, via the right channel, and at the right time. To this list I would add “with the right outcome”.
Here it is important to distinguish actual outcome studies from opinion surveys such as those sometimes reported by HIMSS. The challenge for any AI system is the actual demonstration that its use has a positive effect on patient outcomes. This includes knowing the population to which it applies, which is constrained by the population in the data sets used to train it. In addition, we need to know how many parameters drive the answers, and when additional parameters might alter the results.
We also need to resolve how physicians are supposed to use AI-provided results. The simplistic answer is that they should use their own judgement, which is another way of saying that the AI result cannot be relied on. In this regard we can note that for machine-learning-based systems the physician cannot recreate the logic used because there really is no logic. This is different from algorithmic AI (e.g., practice guidelines), where the logic flow is typically simple and reviewable. All a machine learning system is doing is saying that the cases it learned from had this input data and these results.
We might note here that marketed AI systems generally have “locked” decision engines. That is, there is no opportunity for field learning unless wrong results are somehow collected and reported back to the developer, who then retrains the system and pushes an “upgrade” out to its customers. This might be a reason for AI to exist in “the cloud”, where it can be changed for all users.
If a physician does rely on the AI result and it turns out to be wrong, the physician will be said to have used the result improperly and to have violated the typical “don’t rely on me” disclaimer. The AI developer may have some explaining to do as to why the system came up with the wrong answer, but the developer will have the disclaimer to fall back on.
All of this makes it appropriate to approach AI with care, if not skepticism. In this regard, I was once in a taxi in San Francisco on my way to the airport when the driver began to explain that the local “tech” scene was so wonderful because no one tells anyone that something is a bad idea. Of course, there actually are bad ideas, and badly executed decent ideas. As an engineer I learned that serious design review, including by people other than the developers, is an essential step for moving from concept to a product that actually does what you want it to do, assuming that the concept is a good one. Such questioning is also a key element of risk management. Maybe this isn’t so important if you are creating another dating app, bad dates aside. But it is important in healthcare. We don’t want sketchy medical devices that haven’t been carefully designed and properly validated. Why would the standard for AI products be any different?