
BIOMEDIN 215 DATA DRIVEN MEDICINE

With the spread of electronic health records, increasingly large repositories of clinical and other patient-derived data are being built. These databases are too large and complex for any one specialist to analyze. To find the hidden associations within such data, we review methods for large-scale data mining on electronic medical records, methods in natural language processing and text mining of medical records, and methods for using ontologies to tag unstructured clinical notes.

SCHEDULE: TUE, THU 2:15 PM - 3:30 PM
LOCATION: TBD for Fall 2012
CREDITS: 3
Discussion Section: Medical School Office Building, X-228, FRI, 2:15 - 3:05

The course has four modules. The first module reviews the medical data miner’s toolkit, including the use of ontologies for data mining. Each of the remaining three modules reviews one problem area and the computational methods used in it, via a set of 5 lectures ending in a “mini project” as homework. Each module covers a new application area (e.g., predicting readmission, drug safety surveillance, clinical text mining) and a new method (e.g., association rules, logistic regression). The course uses real, de-identified, large patient datasets (on the order of millions of patients) that are made available for a final research project associated with the course. This course is also offered in a 1-credit version (BIOMEDIN 225), which meets at the same time but does not require a final project.
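As a rough illustration of one of the methods named above, the sketch below computes support, confidence, and lift for a single association rule over a handful of made-up drug and adverse-event reports. The report contents, the item names (`drug_A`, `nausea`, and so on), and the helper functions `support` and `rule_metrics` are all hypothetical and are not taken from the course materials or from any real dataset; this is only a minimal sketch of how an association rule might be scored.

```python
# Hypothetical toy reports: each report is the set of drugs and adverse
# events mentioned together. These are invented for illustration only.
reports = [
    {"drug_A", "nausea"},
    {"drug_A", "nausea", "headache"},
    {"drug_A", "rash"},
    {"drug_B", "nausea"},
    {"drug_B", "headache"},
    {"drug_B", "rash"},
    {"drug_A", "nausea"},
    {"drug_B", "headache"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def rule_metrics(antecedent, consequent, transactions):
    """Support, confidence, and lift for the rule antecedent -> consequent."""
    s_a = support(antecedent, transactions)
    s_c = support(consequent, transactions)
    s_ac = support(set(antecedent) | set(consequent), transactions)
    confidence = s_ac / s_a if s_a else 0.0
    lift = confidence / s_c if s_c else 0.0
    return {"support": s_ac, "confidence": confidence, "lift": lift}


# Score the rule "drug_A -> nausea" on the toy reports.
metrics = rule_metrics({"drug_A"}, {"nausea"}, reports)
print(metrics)
```

A lift above 1 means the drug and the event co-occur more often than they would if they were independent. On real spontaneous-report data such a score is only a starting point for review and does not by itself indicate causation.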

Course Flyer

Course Materials

Schedule and Syllabus

(Subject to change)

The Petabyte Age Because More Isn't Just More — More Is Different
The Unreasonable Effectiveness of Data

| Lecture | Learning Objective | Module | Homework and Project | Date | HW Dataset | 2012 Date |
|---|---|---|---|---|---|---|
| 1 | Introduction, Overview and Relevance | Medical Dataminer's toolkit | FS, HC | 9/27/2011 | | 9/25/2012 |
| 2 | Data mining in medicine - I | Medical Dataminer's toolkit | | 9/29/2011 | | 9/27/2012 |
| 3 | Data mining in medicine - II | Medical Dataminer's toolkit | | 10/4/2011 | | 10/2/2012 |
| 4 | Health Care Utilization Databases + Review of relevant Ontologies | Medical Dataminer's toolkit | | 10/6/2011 | AERS data | 10/4/2012 |
| 5 | Introduction to Drug Safety Surveillance | Drug Safety Surveillance | HW-1 out | 10/11/2011 | | 10/9/2012 |
| 6 | State of the art and Exemplar paper | Drug Safety Surveillance | | 10/13/2011 | | 10/11/2012 |
| 7 | Other methods | Drug Safety Surveillance | | 10/18/2011 | | 10/16/2012 |
| 8 | Other possible methods | Drug Safety Surveillance | | 10/20/2011 | | 10/18/2012 |
| 9 | Project proposals | Drug Safety Surveillance | HW-1 due | 10/25/2011 | | 10/23/2012 |
| 10 | Introduction to Predicting Readmissions | Predictive data-mining | | 10/27/2011 | | 10/25/2012 |
| 11 | State of the art (Readmissions) | Predictive data-mining | HW-2 out | 11/1/2011 | MIMIC II | 10/30/2012 |
| 12 | Intro. to Co-morbidities and Exemplar paper (Discharge decision / Survival) | Predictive data-mining | | 11/3/2011 | | 11/1/2012 |
| 13 | State of the art (Discharge decision / Survival) | Predictive data-mining | | 11/8/2011 | | 11/6/2012 |
| 14 | Clinical Text Mining: Goals and Key Problems | Clinical Text Mining | | 11/10/2011 | | 11/8/2012 |
| 15 | History and Review of the state of the art | Clinical Text Mining | HW-2 due, HW-3 out | 11/15/2011 | I2B2 data | 11/13/2012 |
| 16 | i2b2 NLP challenges | Clinical Text Mining | | 11/17/2011 | | 11/15/2012 |
| 17 | Non-traditional approaches; How to improve on existing methods | Clinical Text Mining | HW-3 due | 11/29/2011 | | 11/27/2012 |
| 18 | Wrap-up: What did we learn? What questions remain? | - | | 12/1/2011 | | 11/29/2012 |
| 19 | PROJECT PRESENTATIONS | | Final Project | 12/6/2011 | | 12/4/2012 |
| 20 | PROJECT PRESENTATIONS | | Final Project | 12/8/2011 | | 12/6/2012 |

Useful References

Machine learning: an algorithmic perspective, Stephen Marsland
http://www.amazon.com/Machine-Learning-Algorithmic-Perspective-Recognition/dp/1420067184

Introduction to the practice of statistics, David S. Moore, George P. McCabe
http://searchworks.stanford.edu/view/5470778
http://www.amazon.com/Introduction-Practice-Statistics-George-McCabe/dp/071676282X

The Elements of Statistical Learning: Data Mining, Inference, and Prediction; Trevor Hastie, Robert Tibshirani and Jerome Friedman
http://www-stat.stanford.edu/~tibs/ElemStatLearn/

Mining of Massive Datasets, Anand Rajaraman and Jeff Ullman
http://infolab.stanford.edu/~ullman/mmds.html

biomedin215-2011.1344621395.txt.gz · Last modified: 2012/08/10 10:56 by nigam