Sam Gelman

Hi, I'm Sam! I research deep learning methods for protein design, drawing on a background in computer science, machine learning, and biology. I am passionate about using deep learning to better understand protein structure and function, and I have developed new algorithms and models in this area. Let's connect!

Featured preprint

Biophysics-based protein language models for protein engineering

Sam Gelman, Bryce Johnson, Chase Freschlin, Sameer D’Costa, Anthony Gitter, Philip A. Romero. bioRxiv (2024).

Protein language models trained on evolutionary data have emerged as powerful tools for predictive problems involving protein sequence, structure, and function. However, these models overlook decades of research into biophysical factors governing protein function. We propose Mutational Effect Transfer Learning (METL), a protein language model framework that unites advanced machine learning and biophysical modeling. Using the METL framework, we pretrain transformer-based neural networks on biophysical simulation data to capture fundamental relationships between protein sequence, structure, and energetics. We finetune METL on experimental sequence-function data to harness these biophysical signals and apply them when predicting protein properties like thermostability, catalytic activity, and fluorescence. METL excels in challenging protein engineering tasks like generalizing from small training sets and position extrapolation, although existing methods that train on evolutionary signals remain powerful for many types of experimental assays. We demonstrate METL’s ability to design functional green fluorescent protein variants when trained on only 64 examples, showcasing the potential of biophysics-based protein language models for protein engineering.

Featured publication

Neural networks to learn protein sequence–function relationships from deep mutational scanning data

Sam Gelman, Sarah A. Fahlberg, Pete Heinzelman, Philip A. Romero, Anthony Gitter. Proceedings of the National Academy of Sciences (2021).

Understanding the relationship between protein sequence and function is necessary to design new and useful proteins with applications in bioenergy, medicine, and agriculture. The mapping from protein sequence to function is highly complex, making it challenging to predict how sequence changes will affect a protein’s behavior and properties. We show that neural networks can learn the sequence–function mapping from large protein datasets. Neural networks are appealing for this task because they can learn complicated relationships from data, make few assumptions about the nature of the sequence–function relationship, and can learn general rules that apply across the length of the protein sequence. We demonstrate that learned models can be applied to design new proteins with properties that exceed natural sequences.

About Me

Education

2023
University of Wisconsin-Madison

Ph.D. Computer Science

I earned a Ph.D. in Computer Science from the University of Wisconsin-Madison. I was advised by Anthony Gitter and Philip Romero. My research focused on deep learning methods for protein engineering. I was fortunate to receive two fellowships: a pre-doctoral fellowship from the PhRMA Foundation and a short-term traineeship from UW-Madison's Genomic Sciences Training Program.

2016
George Mason University

M.S. Computer Science

I obtained an M.S. in Computer Science from George Mason University in 2016. I was advised by Zoran Duric and Naomi Lynn Gerber. My research focused on methods for tracking human movement with depth cameras, and my master's thesis is titled "A method for estimating motions of contours with an application to gait recognition." I received the Outstanding Graduate Teaching Assistant award for my work as a teaching assistant for CS 321: Software Engineering.

2014
George Mason University

B.S. Computer Science

I obtained a B.S. in Computer Science from George Mason University in 2014. I graduated from the Honors College and received several awards, including the Schwartzstein Best Freshmen Research Paper Scholarship, the Student Excellence Award, and Outstanding Undergraduate Teaching Assistant.

Experience

2023-Present
Morgridge Institute

Machine Learning Scientist

I am continuing my protein engineering research with the Gitter Lab as a machine learning scientist.

2017-2023
University of Wisconsin-Madison

Graduate Research Assistant

I was a graduate research assistant in the Gitter Lab at the University of Wisconsin-Madison.

  • Researched machine learning methods for protein engineering
  • Implemented custom algorithms, data processing pipelines, and machine learning frameworks
  • Utilized high-throughput computing clusters to accelerate GPU and CPU-based workflows
  • Communicated research to diverse audiences in talks and manuscripts
  • Collaborated with multi-disciplinary teams including computer scientists and chemists
  • Stayed current with new research in machine learning and protein engineering

2015-2016
U.S. Naval Research Laboratory

Student Research Scientist

I was a student research scientist at the U.S. Naval Research Laboratory while obtaining my master's degree from George Mason University.

  • Researched a novel method for tracking motions of contours
  • Applied the method to gait recognition with depth cameras

2014
National Institutes of Health

Research Scientist Intern

I interned at the NIH, National Institute of Diabetes and Digestive and Kidney Diseases, after completing my undergraduate degree.

  • Developed a computer vision system for tracking lab mice
  • Designed custom graphical tools for efficiently annotating video

Contact

Let's get in touch!