rail

Differences

This shows you the differences between two versions of the page.

rail [2023/12/05 16:35]
nigam
rail [2024/05/12 10:55] (current)
nigam
Line 1: Line 1:
 ====== Responsible AI in Healthcare ======
  
-In healthcare, "Standard AI" models estimate the risk of having some underlying condition or developing it in the future. Whether a model is useful depends on the interplay between the model's output, the intervention it triggers, and the intervention’s benefits and harms. We study this interplay for bringing AI to the clinic, safely, cost-effectively and ethically and to inform the work of the [[https://dsatshc.stanford.edu/ | Data Science Team at Stanford Healthcare]]
+Our team is focused on bringing AI into clinical use, safely, ethically and cost-effectively. Our work is organized in two broad work-streams.
  
-{{  :model-interplay.png?400&nolink&  }}
+===== Creation and adoption of foundation models in medicine =====
  
-[[https://www.tinyurl.com/hai-blogs | Blog posts at HAI]] summarize our work in an easily accessible manner. Our research stemmed from the effort [[http://stanmed.stanford.edu/2018summer/artificial-intelligence-puts-humanity-health-care.html|in improving palliative care]] using machine learning. [[https://jamanetwork.com/journals/jama/fullarticle/2748179?guestAccessKey=8cef0271-616d-4e8e-852a-0fddaa0e5101|Ensuring that machine learning models are clinically useful]] requires [[https://www.nature.com/articles/s41591-019-0651-8| estimating the hidden deployment cost of predictive models]] as well as quantifying the [[http://academic.oup.com/jamia/article/28/6/1149/6045012|impact of work capacity constraints]] on achievable benefit, estimating [[https://www.sciencedirect.com/science/article/pii/S1532046421001544|individualized utility]], and learning [[https://pubmed.ncbi.nlm.nih.gov/34350942/|optimal decision thresholds]]. Pre-empting [[https://www.nejm.org/doi/full/10.1056/NEJMp1714229|ethical challenges]] often requires keeping [[https://hai.stanford.edu/news/when-algorithmic-fairness-fixes-fail-case-keeping-humans-loop|humans in the loop]] and focusing on examining the [[https://informatics.bmj.com/content/29/1/e100460|consequences of model-guided decision making]] in the presence of clinical care guidelines.
+Given the high interest in using large language models (LLMs) in medicine, the [[https://jamanetwork.com/journals/jama/fullarticle/2808296|creation and use of LLMs in medicine]] needs to be actively shaped by provisioning relevant training data, specifying the desired benefits, and evaluating the benefits via testing in real-world deployments.
-----
+
  
-Given the high interest in using large language models (LLMs) in medicine, the [[https://jamanetwork.com/journals/jama/fullarticle/2808296 | creation and use of LLMs in medicine]] needs to be actively shaped by provisioning relevant training data, specifying the desired benefits, and evaluating the benefits via testing in real-world deployments.
+{{  :verify-benefits.png?nolink&400  }}
  
-{{  :verify-benefits.png?400&nolink&  }}
+We study whether commercial language models [[https://arxiv.org/abs/2304.13714|support real-world needs]] or can follow [[https://medalign.stanford.edu/|medical instructions (MedAlign)]] that clinicians would expect them to follow. We build clinical foundation models such as [[https://www.sciencedirect.com/science/article/pii/S1532046420302653| CLMBR]] and [[https://arxiv.org/abs/2301.03150| MOTOR]], and verify their benefits such as [[https://www.nature.com/articles/s41598-023-30820-8| robustness over time]], [[https://pubmed.ncbi.nlm.nih.gov/37639620/| populations]] and [[https://arxiv.org/abs/2311.11483| sites]]. We release de-identified datasets such as [[https://ehrshot.stanford.edu/| EHRSHOT]] for few-shot evaluation of foundation models and multi-modal datasets such as [[https://inspect.stanford.edu/| INSPECT]].
 + 
 + 
 +===== Making machine learning models clinically useful ===== 
 + 
 +Whether a classifier or prediction [[ https://jamanetwork.com/journals/jama/article-abstract/2748179 | model is useful]] in guiding care depends on the interplay between the model's output, the intervention it triggers, and the intervention’s benefits and harms. Our work stemmed from the effort [[http://stanmed.stanford.edu/2018summer/artificial-intelligence-puts-humanity-health-care.html|in improving palliative care]] using machine learning. [[https://www.tinyurl.com/hai-blogs | Blog posts at HAI]] summarize our work in an easily accessible manner.
 + 
 +{{  :model-interplay.png?400&nolink&  }}
  
-We build clinical foundation models such as [[https://www.sciencedirect.com/science/article/pii/S1532046420302653|CLMBR]], [[https://arxiv.org/abs/2301.03150|MOTOR]] and verify benefits such as [[https://www.nature.com/articles/s41598-023-30820-8|robustness over time]], [[https://pubmed.ncbi.nlm.nih.gov/37639620/ | populations]] and [[https://arxiv.org/abs/2311.11483|sites]]. In addition, we make available de-identified datasets such as [[https://ehrshot.stanford.edu/|EHRSHOT]] for few-shot evaluation of foundation models as well as for benchmarking instruction following by commercial LLMs ([[https://medalign.stanford.edu/|MedAlign]]). We also conduct research to assess whether commercial language models [[https://arxiv.org/abs/2304.13714|support real-world needs]].
+We study how to quantify the [[https://www.sciencedirect.com/science/article/pii/S1532046423000400|impact of work capacity constraints]] on achievable benefit, estimate [[https://www.sciencedirect.com/science/article/pii/S1532046421001544|individualized utility]], and learn [[https://pubmed.ncbi.nlm.nih.gov/34350942/|optimal decision thresholds]]. We question conventional wisdom on whether models [[https://tinyurl.com/donot-explain|need to be explainable]] and [[https://www.nature.com/articles/s41591-023-02540-z |generalizable]]. We examine whether the consequences of using [[https://hai.stanford.edu/news/when-algorithmic-fairness-fixes-fail-case-keeping-humans-loop|algorithm-guided care are fair]] and how to [[https://hai.stanford.edu/news/how-do-we-ensure-healthcare-ai-useful|ensure that healthcare models are useful]]. We study this interplay to guide the work of the [[https://dsatshc.stanford.edu/ | Data Science Team at Stanford Healthcare]].
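
As a rough illustration of the interplay between a model's output, the intervention it triggers, and work capacity (not the team's published method), the following Python sketch scores a decision threshold by the net utility of the interventions it triggers when only a limited number of patients can be acted on. The function names, the per-patient benefit and cost values, and the toy data are all hypothetical assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple


def net_utility(
    risks: Sequence[float],
    outcomes: Sequence[int],
    threshold: float,
    benefit_tp: float,
    cost_fp: float,
    capacity: int,
) -> float:
    """Net utility of acting on flagged patients, limited by staff capacity.

    All parameters are illustrative assumptions: `benefit_tp` is the gain from
    intervening on a patient who has the outcome, `cost_fp` is the harm or cost
    of intervening on one who does not, and `capacity` is the maximum number of
    interventions the care team can deliver.
    """
    # Flag every patient whose predicted risk meets the threshold.
    flagged: List[Tuple[float, int]] = [
        (r, y) for r, y in zip(risks, outcomes) if r >= threshold
    ]
    # With limited capacity, only the highest-risk flagged patients are treated.
    flagged.sort(key=lambda pair: pair[0], reverse=True)
    treated = flagged[:capacity]
    true_pos = sum(1 for _, y in treated if y == 1)
    false_pos = len(treated) - true_pos
    return true_pos * benefit_tp - false_pos * cost_fp


def best_threshold(
    risks: Sequence[float],
    outcomes: Sequence[int],
    candidates: Sequence[float],
    benefit_tp: float,
    cost_fp: float,
    capacity: int,
) -> Tuple[float, float]:
    """Return (utility, threshold) for the candidate with the highest utility."""
    scored = [
        (net_utility(risks, outcomes, t, benefit_tp, cost_fp, capacity), t)
        for t in candidates
    ]
    return max(scored)


if __name__ == "__main__":
    # Toy data: predicted risks and observed outcomes (1 = condition present).
    risks = [0.9, 0.8, 0.6, 0.4, 0.2]
    outcomes = [1, 0, 1, 0, 0]
    print(best_threshold(risks, outcomes, [0.3, 0.5, 0.85], 10.0, 3.0, capacity=5))
    print(net_utility(risks, outcomes, 0.5, 10.0, 3.0, capacity=1))
```

In this toy setting the same threshold yields less benefit once capacity drops from five interventions to one, which is the kind of effect that a model's accuracy alone does not reveal.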
  
----- 
  
-{{youtube>GNTIoEADfY4?small | Artificial Intelligence transforms health care}} 
  
-Russ Altman and Nigam Shah taking an in-depth look at the growing influence of “data-driven medicine.” 
  
rail.1701822942.txt.gz · Last modified: 2023/12/05 16:35 by nigam