This shows you the differences between two versions of the page.
hiring [2021/02/02 14:06] jfries |
hiring [2021/02/27 15:36] (current) jfries |
||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====== | + | ====== |
- | + | The Shah Lab is currently not hiring any RA positions. | |
- | ===== Role: Machine Learning Engineer / Data Scientist ===== | + | |
- | + | ||
- | **Description** \\ | + | |
- | We are looking | + | |
- | + | ||
- | **Responsibilities** | + | |
- | + | ||
- | * Understanding clinical requirements and translating them into technical problem statements. | + | |
- | * Communicating results and observations to technical audiences as well as clinicians in the form of visualizations, | + | |
- | * Designing and implementing machine learning models to solve real world problems. | + | |
- | * Working with live data streams as well as large data repositories to enable training and inference of machine learning models. | + |
- | * Developing software that interacts with various hospital IT systems. | + | |
- | + | ||
- | **Requirements** | + | |
- | + | ||
- | * 5+ years of experience in software design and development. | + | |
- | * 2+ years of hands-on experience using Python-based machine learning libraries such as scikit-learn, | + |
- | * Experience working in a Linux environment and being comfortable with UNIX command line tools. | + | |
- | * Familiarity with productivity tools such as Git and Docker. | + |
- | * Familiarity with SQL, REST, and web programming. | + |
- | * In-depth conceptual understanding as well as hands-on experience with several supervised and unsupervised machine learning algorithms, such as Random Forests, Logistic Regression, Gradient Boosting, Neural Networks, PCA, K-means, etc. | + | |
- | + | ||
- | **Strongly preferred: | + | |
- | + | ||
- | * Prior experience with production deployment of software systems and/or machine learning systems. | + | |
- | * Educational background involving quantitative techniques (CS, EE, Math, Statistics, etc.) | + | |
- | + | ||
- | ---- | + | |
- | + | ||
- | ===== Role: Machine Learning Engineer / Data Scientist | + | |
- | + | ||
- | **Description** | + | |
- | + | ||
- | Our team has built a state-of-the-art EHR representation learning technique named CLMBR. We are looking to recruit a research assistant (RA) to assist in developing publicly releasable code for broad use of CLMBR. | + | |
- | + | ||
- | Deploying risk-stratification models in the clinic requires addressing questions about the robustness of large, pre-trained models, such as characterizing their reliance on memorization and spurious correlations, | + | |
- | + | ||
- | The successful candidate for this RA position would be supervised by research scientists who are experts in representation learning, transfer learning and weak-supervision across multiple modalities of data. The RA will be responsible for implementing code for an open source API to enable rapid prototyping and evaluation of risk-stratification models built using CLMBR from Stanford' | + | |
- | + | ||
- | **Research Focus Areas** | + | |
- | + | ||
- | * Developing robustness evaluations of EHR-based representation models | + | |
- | * Contrastive learning with multi-modal EHR data (text + tabular data) | + | |
- | + | ||
- | **Required Skills** | + | |
- | + | ||
- | * 5+ years of experience in software design and development. | + | |
- | * 2+ years of hands-on experience using Python-based machine learning libraries such as scikit-learn, | + |
- | * Strong communication skills and prior research experience required | + | |
- | * Experience working in a Linux environment and being comfortable with UNIX command line tools. | + | |
- | * Familiarity with productivity tools such as Git and Docker. | + |
- | + | ||
- | **Preferred Skills** | + | |
- | + | ||
- | * Familiarity with Google Cloud Platform (GCP) services such as BigQuery | + | |
- | * Prior experience working with Stanford' | + |
- | + | ||
- | **Relevant Papers** | + | |
- | + | ||
- | * CLMBR - Language models are an effective representation learning technique for electronic health record data [[https:// | + | |