Machine Learning AI
What You'll Do

We're advancing the use of machine learning, generative AI, and natural language processing to extract clinically relevant information from unstructured medical notes for use in oncology research. The Discovery team is helping to build these next-generation research data products, developing and applying ML and LLMs to capture a complete picture of the patient journey. While most of the team are ML Engineers, Discovery has team members spanning 5 different fields, from ML to product management to research oncology.

As part of our team, you will develop and validate LLM systems (including prompt engineering and data pipelines) to solve applied clinical problems and help build towards our vision of the future of machine learning at our company. Engaging with a cross-functional group of stakeholders both within Discovery and across the company, you will contribute to LLM variable development projects from scoping through to productionization and delivery.

In addition, you'll:

Interface with internal scientific and clinical stakeholders to understand what data they need to conduct high-quality research.
Build LLM-based models to turn raw clinical data into high-quality research variables, drawing on your knowledge of LLMs, prompt engineering, traditional ML, and NLP techniques to determine the right methods for a given problem.
Work with quantitative scientists and oncologists to validate that your models can be used to generate sound scientific insights.
Work cross-functionally with software engineers to productionize, scale, and monitor your models.

Who You Are

You're a product-focused data scientist with experience leveraging ML and LLMs to solve real-world problems.
You're excited to learn about oncology from our clinical stakeholders and to work with them to craft LLM prompts that extract complex and nuanced clinical concepts from the medical record.
You're a kind, passionate, and collaborative problem-solver who seeks and gives candid feedback and values the chance to make an important impact.
You are fluent in English and have strong written and verbal communication skills, needed both for cross-functional collaboration with clinical stakeholders and for interpreting unstructured medical notes.
You have 3+ years of relevant working experience in a technical capacity, with a focus on ML. Prior experience with LLMs is strongly preferred.
You have a strong background in applying ML to solve real-world problems and a solid grasp of the statistical fundamentals underlying ML.
You have collaborated with other technical team members in a production development environment using formal version control, Python, and SQL.
You have led cross-functional initiatives and excel at influencing decision-making without authority.

Extra Credit

You have experience with deep learning and traditional NLP methods.
You have experience working with data in a healthcare setting.
You have experience with the risks of bias in machine learning, with health equity research or analysis, or working with underrepresented groups in a clinical research setting.