The widespread adoption of electronic health records (EHRs) has created a new source of “big data”—namely, the record of routine clinical practice—as a by-product of care. This data source offers tremendous opportunities to revolutionize healthcare in the clinic and at the bedside and to advance our understanding of medicine. This graduate class will teach you how to use EHR and other patient data for better healthcare.
The course has five modules, with four lectures in each module. The first module reviews the medical data miner’s toolkit, including the use of ontologies in data mining and healthcare utilization databases. Each of the remaining four modules covers a new application area (e.g., drug safety surveillance, predictive analytics) and the computational methods used in it (e.g., association rules, logistic regression), ending in a “mini project” assigned as homework. In addition, eight discussion sections provide in-depth explanations of the methods referred to in the lectures. For 2015, these discussions will be recorded and made available to SCPD (and remote) students.
The course uses real, de-identified, large patient datasets for its homework projects. It is also offered in a 2-credit version (BIOMEDIN 225), which meets at the same time but requires only one homework assignment, based on a public dataset.
Prerequisites: CS 106A; familiarity with statistics and biology.
Highly recommended: STATS 216.
Recommended: one of CS 246, STATS 305, HRP 258 or CS 229.
Schedule: TUE, THU 1:30 PM - 2:50 PM
Lectures and Discussions: Skilling Auditorium (Fall 2015). Lectures and discussions are recorded.
Videos: https://mvideox.stanford.edu/Course/551 (posted about two hours after the class ends)
All homework assignments are due before the start of lecture (1:30 PM) on the day the next homework is released.
Older version of the course, from when we had year-end projects
Machine Learning: An Algorithmic Perspective, Stephen Marsland
Introduction to the Practice of Statistics, David S. Moore and George P. McCabe
The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Trevor Hastie, Robert Tibshirani, and Jerome Friedman
Mining of Massive Datasets, Anand Rajaraman and Jeff Ullman
The Petabyte Age Because More Isn't Just More — More Is Different
The Unreasonable Effectiveness of Data
A Few Useful Things to Know About Machine Learning (the link works only with on-campus access; the same content is available at http://homes.cs.washington.edu/~pedrod/papers/cacm12.pdf)