
BIOMEDIN 215 DATA DRIVEN MEDICINE

With the spread of electronic health records, increasingly large repositories of clinical and other patient-derived data are being built. These databases are large and difficult for any one specialist to analyze. To find the hidden associations within such data, we review methods for large-scale data mining on electronic medical records, methods in natural language processing and text mining of medical records, and methods for using ontologies to tag unstructured clinical notes.

SCHEDULE: TUE, THU 2:15 PM - 3:30 PM
LOCATION: TBD for Fall 2012
CREDITS: 3
Discussion Section: Medical School Office Building, X-228, FRI, 2:15 - 3:05

The course has four modules. The first module reviews the medical data miner's toolkit, including the use of ontologies for data mining. Each of the remaining three modules reviews one problem area and the computational methods used in it, through a set of 5 lectures ending in a "mini project" assigned as homework. Each of these modules covers a new application area (e.g., predicting readmission, drug safety surveillance, clinical text mining) and a new method (e.g., association rules, logistic regression). The course uses real, de-identified, large patient datasets (on the order of millions of patients), which are made available for the final research project associated with the course. The course is also offered in a 1-credit version (BIOMEDIN 225), which meets at the same time but does not require a final project.

Course Flyer

Course Materials

Go to https://coursework.stanford.edu/ and join BIOMEDIN 215
R-tutorial

Schedule and Syllabus

(Subject to change)

The Petabyte Age Because More Isn't Just More — More Is Different
The Unreasonable Effectiveness of Data

Lecture	Learning Objective	Module	Homework and Project	2011 Date	HW Dataset	2012 Date
1	Introduction, Overview and Relevance	Medical Dataminer's toolkit	FS, HC	9/27/2011		9/25/2012
2	Data mining in medicine - I	Medical Dataminer's toolkit		9/29/2011		9/27/2012
3	Data mining in medicine - II	Medical Dataminer's toolkit		10/4/2011		10/2/2012
4	Health Care Utilization Databases + Review of relevant Ontologies	Medical Dataminer's toolkit		10/6/2011	AERS data	10/4/2012
-
5	Introduction to Drug Safety Surveillance	Drug Safety Surveillance	HW-1 out	10/11/2011		10/9/2012
6	State of the art and Exemplar paper	Drug Safety Surveillance		10/13/2011		10/11/2012
7	Other methods	Drug Safety Surveillance		10/18/2011		10/16/2012
8	Other possible methods	Drug Safety Surveillance		10/20/2011		10/18/2012
9	Project proposals	Drug Safety Surveillance	HW-1 due	10/25/2011		10/23/2012
-
10	Introduction to Predicting Readmissions	Predictive data-mining		10/27/2011		10/25/2012
11	State of the art (Readmissions)	Predictive data-mining	HW-2 out	11/1/2011	MIMIC II	10/30/2012
12	Intro. to Co-morbidities and Exemplar paper (Discharge decision / Survival)	Predictive data-mining		11/3/2011		11/1/2012
13	State of the art (Discharge decision / Survival)	Predictive data-mining		11/8/2011		11/6/2012
-
14	Clinical Text Mining: Goals and Key Problems	Clinical Text Mining		11/10/2011		11/8/2012
15	History and Review of the state of the art	Clinical Text Mining	HW-2 due, HW-3 out	11/15/2011	i2b2 data	11/13/2012
16	i2b2 NLP challenges	Clinical Text Mining		11/17/2011		11/15/2012
17	Non-traditional approaches; How to improve on existing methods	Clinical Text Mining	HW-3 due	11/29/2011		11/27/2012
18	Wrap-up: What did we learn? What questions remain?	-		12/1/2011		11/29/2012
-
19	PROJECT PRESENTATIONS		Final Project	12/6/2011		12/4/2012
20	PROJECT PRESENTATIONS		Final Project	12/8/2011		12/6/2012

Useful References

Machine learning: an algorithmic perspective, Stephen Marsland
http://www.amazon.com/Machine-Learning-Algorithmic-Perspective-Recognition/dp/1420067184

Introduction to the practice of statistics, David S. Moore, George P. McCabe
http://searchworks.stanford.edu/view/5470778
http://www.amazon.com/Introduction-Practice-Statistics-George-McCabe/dp/071676282X

The Elements of Statistical Learning: Data Mining, Inference, and Prediction; Trevor Hastie, Robert Tibshirani and Jerome Friedman
http://www-stat.stanford.edu/~tibs/ElemStatLearn/

Mining of Massive Datasets, Anand Rajaraman and Jeff Ullman
http://infolab.stanford.edu/~ullman/mmds.html

Shah Lab
