
Robust Audio Synchronization Under Extreme Time Warping
Alignment algorithms under severe global distortion, developed at Harvey Mudd
93.7% alignment accuracy under 4x time warping — 31% improvement over DTW baselines. Sub-20ms latency at 44.1kHz sample rate. Evaluated across 2,400 audio pairs with synthetic and real-world distortions.
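The project's own algorithm isn't reproduced here, but the DTW baseline that the 31% improvement is measured against can be sketched. A minimal dynamic time warping implementation over 1-D feature sequences (the toy sine signals are purely illustrative):

```python
import numpy as np

def dtw_distance(a, b):
    """Classic dynamic time warping cost between two 1-D sequences."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed warping moves:
            # insertion, deletion, or match.
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# A 2x time-stretched copy aligns at low cost; a sign-flipped copy does not.
ref = np.sin(np.linspace(0, 2 * np.pi, 50))
warped = np.sin(np.linspace(0, 2 * np.pi, 100))
print(dtw_distance(ref, warped), dtw_distance(ref, -warped))
```

The quadratic cost table is exactly what makes plain DTW struggle at sub-20ms latency budgets, which motivates specialized alignment under extreme warping.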
One-Shot Timbre Transfer with Mamba
Adaptive instance normalization with state-space models for music generation
4.2x faster inference than transformer baselines. FAD score of 1.83 on unseen instruments — state-of-the-art for one-shot transfer. 12M parameters, trained on 800 hours of multi-instrument audio.
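The core adaptive instance normalization operation is standard and can be sketched independently of the Mamba backbone: normalize the content features per channel, then re-impose the channel statistics of a style (timbre) reference. Array shapes and the demo values below are illustrative, not taken from the project:

```python
import numpy as np

def adain(content, style, eps=1e-5):
    """Adaptive instance normalization over (channels, time) features:
    whiten content per channel, then rescale to the style statistics."""
    c_mu = content.mean(axis=-1, keepdims=True)
    c_std = content.std(axis=-1, keepdims=True)
    s_mu = style.mean(axis=-1, keepdims=True)
    s_std = style.std(axis=-1, keepdims=True)
    return s_std * (content - c_mu) / (c_std + eps) + s_mu

rng = np.random.default_rng(0)
content = rng.normal(size=(4, 256))          # e.g. latent audio features
style = rng.normal(2.0, 3.0, size=(4, 256))  # features of the target timbre
out = adain(content, style)
```

After the transform, each output channel carries the style's mean and variance while preserving the content's temporal structure, which is what makes the operation suitable for one-shot transfer from a single reference clip.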
Training Action-Conditioned World Models
Learning environment dynamics from agent interactions for model-based planning
World model conditioned on agent actions to predict future states. Enables model-based planning with 3.2x sample efficiency over model-free baselines on DMControl benchmarks.
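The planning loop such a world model enables can be sketched with random shooting. The learned dynamics are replaced here by a known toy point-mass (an assumption, so the example runs without training); in the real setting `world_model` would be the trained action-conditioned predictor:

```python
import numpy as np

def world_model(state, action):
    """Stand-in for the learned dynamics: a toy point-mass whose
    position moves a fraction of the commanded action."""
    return state + 0.1 * action

def plan(state, goal, horizon=5, candidates=256, seed=0):
    """Random-shooting planner: roll candidate action sequences through
    the model and keep the sequence whose endpoint is nearest the goal.
    In an MPC loop you execute its first action, then replan."""
    rng = np.random.default_rng(seed)
    seqs = rng.uniform(-1, 1, size=(candidates, horizon, state.shape[0]))
    best_cost, best_seq = np.inf, None
    for seq in seqs:
        s = state
        for a in seq:
            s = world_model(s, a)
        cost = np.linalg.norm(s - goal)
        if cost < best_cost:
            best_cost, best_seq = cost, seq
    return best_seq
```

Because every rollout happens inside the model rather than the environment, each real interaction is reused across many imagined trajectories, which is the source of the sample-efficiency gains over model-free baselines.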
Correcting a Power-of-Two Quantization Error
Found and fixed a mathematical flaw in a widely adopted rounding scheme
Identified a systematic bias in a widely used INT8 quantization formula. The fix recovers 0.4–1.2% top-1 accuracy on ImageNet without retraining. Patch adopted upstream.
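The exact upstream formula isn't reproduced here; the sketch below illustrates the general class of bug, under the assumption that it resembles a common one: flooring after a power-of-two scale biases every value downward by about half an integer step, while rounding to nearest is unbiased. All function names are illustrative:

```python
import numpy as np

def quantize_biased(x, shift):
    """Hypothetical flawed scheme: floor after a power-of-two scale.
    Every value is pushed down, giving a systematic -0.5 ULP bias."""
    return np.clip(np.floor(x * 2.0**shift), -128, 127).astype(np.int8)

def quantize_fixed(x, shift):
    """Corrected scheme: round to nearest (ties to even) before clipping."""
    return np.clip(np.rint(x * 2.0**shift), -128, 127).astype(np.int8)

def dequantize(q, shift):
    return q.astype(np.float32) / 2.0**shift

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=100_000).astype(np.float32)
for name, fn in [("floor", quantize_biased), ("rint", quantize_fixed)]:
    err = dequantize(fn(x, 6), 6) - x
    print(name, err.mean())
```

A per-value error of half an ULP looks negligible, but because it is one-sided it accumulates through every quantized layer, which is why correcting the rounding alone can recover accuracy without any retraining.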
Computational Intensity of DNN Training Across GPUs
Profiling computation and memory bandwidth bottlenecks at scale
Profiled 14 model architectures across A100, H100, and RTX 4090. Identified 37% average memory bandwidth underutilization. Proposed an operator fusion strategy yielding a 3.4x throughput gain.
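The standard tool for this kind of analysis is the roofline model: an operator is bandwidth-bound when its arithmetic intensity (FLOPs per byte moved) falls below the hardware's ridge point. A minimal sketch for a matmul, using approximate published A100 peaks (~312 TFLOP/s BF16, ~2.0 TB/s HBM) as assumed constants and ignoring cache reuse:

```python
def arithmetic_intensity_matmul(m, n, k, bytes_per_el=2):
    """FLOPs per byte for C[m,n] = A[m,k] @ B[k,n], assuming each
    operand is read and the output written exactly once (no reuse)."""
    flops = 2 * m * n * k
    bytes_moved = bytes_per_el * (m * k + k * n + m * n)
    return flops / bytes_moved

# Ridge point: peak compute / peak bandwidth (approx. A100 BF16 figures).
ridge = 312e12 / 2.0e12  # ~156 FLOP/byte

for shape in [(4096, 4096, 4096), (4096, 4096, 64)]:
    ai = arithmetic_intensity_matmul(*shape)
    bound = "compute-bound" if ai > ridge else "bandwidth-bound"
    print(shape, round(ai, 1), bound)
```

Skinny matmuls (small `k`) land far below the ridge point, which is exactly the regime where operator fusion pays off: fusing adjacent ops avoids a round trip to HBM and raises the effective intensity.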
Diffusion Transformers for Architectural Floor Plans
Generative layout synthesis in collaboration with Creative Machines Lab
Generative model conditioned on spatial constraints via FiLM. Generates structurally valid floor plans with 91% constraint satisfaction. Trained on 45K annotated layouts.
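FiLM conditioning itself is a simple, well-known operation and can be sketched independently of the diffusion backbone: a small network (not shown) predicts a per-channel scale and shift from the encoded spatial constraints, which then modulate the feature map. Shapes and demo values below are illustrative:

```python
import numpy as np

def film(features, gamma, beta):
    """Feature-wise linear modulation: scale and shift every channel of
    the feature map by parameters derived from the conditioning input."""
    # features: (channels, H, W); gamma, beta: (channels,)
    return gamma[:, None, None] * features + beta[:, None, None]

feats = np.ones((3, 2, 2))            # toy feature map
gamma = np.array([2.0, 0.5, 1.0])     # would be predicted from constraints
beta = np.array([1.0, 0.0, -1.0])
out = film(feats, gamma, beta)
```

Because the modulation is per-channel rather than per-pixel, the same constraint encoding can steer layers at every resolution of the generator, which is what makes FiLM a convenient conditioning mechanism for layout constraints.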