
BIOMEDIN 215 DATA DRIVEN MEDICINE

With the spread of electronic health records, increasingly large repositories of clinical and other patient-derived data are being built. These databases are large and difficult for any one specialist to analyze. To find the hidden associations within such data, we review methods for large-scale data mining on electronic medical records, methods in natural language processing and text mining of medical records, and methods for using ontologies to tag unstructured clinical notes.

SCHEDULE: TUE, THU 2:15 PM - 3:30 PM
LOCATION: Huang Engineering Center 18 for Fall 2014
CREDITS: 3
Discussion Section: Will now be held in class on Thursday! The course lecture that would have been given on Thursday will be provided as a video lecture.
Office hours:

  • Wed 3:05 - 4:05 PM
  • Thu 3:30 - 4:30 PM (after Thursday lecture)

TAs: David Moskowitz (dmosk AT stanford.edu) and Erika Strandberg (estrandb AT stanford.edu)

The course has four modules. The first module reviews the medical data miner’s toolkit, including the use of ontologies for data mining. Each of the remaining three modules covers a new application area (e.g. drug safety surveillance, clinical text mining) and a new method (e.g. association rules, logistic regression) through a set of 3-5 lectures, ending in a “mini project” assigned as homework. In addition, 9 discussion sections provide in-depth explanations of the methods referred to in the lectures. For 2014, these discussion sections will be recorded and made available to SCPD (and remote) students.

The course will use real, de-identified, large patient datasets (on the order of millions of patients) that are made available for the homework projects associated with the course.

This course is also offered in a 2-credit version (BIOMEDIN 225), which meets at the same time but requires only one homework assignment.

Course Flyer

Schedule and Syllabus


All homework assignments will be due before the start of lecture (2:15 pm) on the day the next homework is released.

Course Materials

Useful References

Machine learning: an algorithmic perspective, Stephen Marsland
http://www.amazon.com/Machine-Learning-Algorithmic-Perspective-Recognition/dp/1420067184

Introduction to the practice of statistics, David S. Moore, George P. McCabe
http://searchworks.stanford.edu/view/5470778
http://www.amazon.com/Introduction-Practice-Statistics-George-McCabe/dp/071676282X

The Elements of Statistical Learning: Data Mining, Inference, and Prediction; Trevor Hastie, Robert Tibshirani and Jerome Friedman
http://www-stat.stanford.edu/~tibs/ElemStatLearn/

Mining of Massive Datasets, Anand Rajaraman and Jeff Ullman
http://infolab.stanford.edu/~ullman/mmds.html

The Petabyte Age Because More Isn't Just More — More Is Different
The Unreasonable Effectiveness of Data
A few useful things to know about machine learning (link works only with on-campus access; the same content is available at http://homes.cs.washington.edu/~pedrod/papers/cacm12.pdf)

biomedin215.txt · Last modified: 2014/09/19 11:56 by nigam