Portfolio item number 1
Short description of portfolio item number 1
Short description of portfolio item number 2 
International Conference on Robotics and Automation 2024
A key source of brittleness for robotic systems is the presence of model uncertainty and external disturbances. This work proposes training a state-conditioned generative model to represent the distribution of error residuals between the nominal dynamics and the actual system. We demonstrate our approach in simulation and on hardware, and show that our method can learn a disturbance model accurate enough to enable risk-sensitive control of a quadrotor flying aggressively with an unmodelled slung load.
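To make the approach concrete, here is a minimal sketch of a state-conditioned residual model, assuming a conditional Gaussian parameterization in PyTorch; the class name, network sizes, and loss below are illustrative assumptions, not the paper's exact architecture:

```python
import torch
import torch.nn as nn

class ResidualModel(nn.Module):
    """Conditional Gaussian over dynamics residuals r = x_next - f_nominal(x, u)."""

    def __init__(self, state_dim: int, action_dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.mean = nn.Linear(hidden, state_dim)
        self.log_std = nn.Linear(hidden, state_dim)

    def forward(self, state, action):
        h = self.net(torch.cat([state, action], dim=-1))
        return torch.distributions.Normal(self.mean(h), self.log_std(h).exp())

def nll_loss(model, state, action, residual):
    # Fit by maximum likelihood on observed residuals.
    return -model(state, action).log_prob(residual).sum(-1).mean()
```

A risk-sensitive controller can then sample from (or bound the tails of) this learned disturbance distribution when planning.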
ICML 2025 Workshop on Multi-modal Foundational Models for Life Sciences
Learning meaningful representations of cellular states is a key problem in computational biology. Yet the scaling behavior of single-cell representation learning models remains poorly understood. While recent work has proposed that model performance scales predictably with measurement noise, this hypothesis had previously been validated only with relatively small models and datasets. We demonstrate that the previously observed noise-scaling behavior consistently re-emerges in substantially larger models and datasets.
submitted to The 2nd Workshop on Foundation Models for Science (and ICML 2026)
We present a systematic benchmark comparing gene and expression encoding strategies for single-cell foundation models by training models from scratch under controlled conditions, scaling to 10 million cells across 100 diverse datasets. Contrary to common assumptions, we find that pretrained embeddings from large protein language models like ESM-2 consistently underperform task-specific learned embeddings. Our work provides clear empirical guidance for model design decisions and establishes a benchmark for evaluating encoding strategies in single-cell foundation models.
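As a hedged illustration of the two encoding strategies being compared, here is a minimal PyTorch sketch; `pretrained_matrix` stands in for hypothetical precomputed per-gene ESM-2 vectors, and all dimensions are placeholders:

```python
import torch
import torch.nn as nn

n_genes, d_model = 20_000, 256

# Strategy A: task-specific gene embeddings learned from scratch.
learned = nn.Embedding(n_genes, d_model)

# Strategy B: frozen embeddings from a pretrained protein language model
# (one precomputed ESM-2 vector per gene), projected to the model width.
pretrained_matrix = torch.randn(n_genes, 1280)  # placeholder for real vectors
frozen = nn.Embedding.from_pretrained(pretrained_matrix, freeze=True)
project = nn.Linear(1280, d_model)

gene_ids = torch.randint(0, n_genes, (32, 512))  # a batch of gene tokens
tokens_a = learned(gene_ids)          # (32, 512, 256)
tokens_b = project(frozen(gene_ids))  # (32, 512, 256)
```

Under the controlled training setup described above, strategy A is the one found to perform better.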
submitted to Nature Biotechnology
Large genomic and imaging datasets can be used to fit models that learn representations of cellular systems, extracting informative structure from data. In other domains, model performance improves predictably with dataset size, providing a basis for allocating data and computation. In biological data, however, performance is also limited by measurement noise arising from technical factors such as molecular undersampling or imaging variability. By learning representations of single-cell genomic and imaging data, we show that noise defines a distinct axis along which performance improves predictably across tasks. This scaling follows a simple logarithmic law that is consistent across model types, tasks, and datasets, and can be derived quantitatively from a model of noise propagation. We identify robustness to noise and saturating performance as properties that vary across models and tasks.
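As an illustration of what a logarithmic noise-scaling law looks like, here is a minimal NumPy sketch; the functional form P(σ) ≈ a + b·log σ and the synthetic data are assumptions for illustration, not the paper's fitted constants:

```python
import numpy as np

# Synthetic noise levels and performance following an assumed log law.
sigma = np.array([0.05, 0.1, 0.2, 0.4, 0.8])
perf = 0.9 - 0.1 * np.log(sigma) + 0.01 * np.random.randn(sigma.size)

# Recover the law's coefficients by least squares in log-noise.
slope, intercept = np.polyfit(np.log(sigma), perf, deg=1)
print(f"P(sigma) ~ {intercept:.2f} + {slope:.2f} * log(sigma)")
```

In this picture, a model that is more robust to noise shows a shallower slope, and saturating performance appears as the curve flattening out.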
Published:
This is a description of your talk. It is a markdown file, so it can be formatted with markdown like any other post. Yay markdown!
Published:
This is a description of your conference proceedings talk; note the different value in the `type` field. You can put anything in this field.
Undergraduate course, University 1, Department, 2014
This is a description of a teaching experience. You can use markdown like any other post.
Workshop, University 1, Department, 2015
This is a description of a teaching experience. You can use markdown like any other post.
Volunteer, Polish AI Olympiad, 2024
Helped develop one of the questions (on Implicit Neural Representations) for the Polish AI Olympiad, a national competition for high school students in AI.
Volunteer Teacher, Open Avenues, 2024
Designed and delivered weekly classes for community college students across the US as a volunteer teacher with Open Avenues.