apham

Robust Audio Synchronization Under Extreme Time Warping

Alignment algorithms for severe global distortion, developed at Harvey Mudd

93.7% alignment accuracy under 4x time warping — 31% improvement over DTW baselines. Sub-20ms latency at 44.1kHz sample rate. Evaluated across 2,400 audio pairs with synthetic and real-world distortions.
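For context on the DTW baseline being compared against, here is a minimal textbook dynamic time warping aligner, not the project's robust variant. The feature sequences and the 2x-warped test signal are illustrative assumptions.

```python
import numpy as np

def dtw_path(a, b):
    """Classic DTW between two 1-D feature sequences.

    Returns the optimal warping path as a list of (i, j) index pairs.
    Textbook baseline; the project's method is a robust variant of this.
    """
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j - 1],
                                 cost[i - 1, j],
                                 cost[i, j - 1])
    # Backtrack from the end of both sequences to the start.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = np.argmin([cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return path[::-1]

# A 2x time-warped copy of a ramp: every sample repeated twice.
ref = np.array([0.0, 1.0, 2.0, 3.0])
warped = np.repeat(ref, 2)
path = dtw_path(ref, warped)
```

The recovered path maps each reference frame onto both of its stretched copies; robustness work targets the regimes (here, 4x warps) where this vanilla recursion starts to drift.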

One-Shot Timbre Transfer with Mamba

Adaptive instance normalization with state-space models for music generation

4.2x faster inference than transformer baselines. FAD score of 1.83 on unseen instruments — state-of-the-art for one-shot transfer. 12M parameters, trained on 800 hours of multi-instrument audio.
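The core operation named in the subtitle, adaptive instance normalization, can be sketched as follows. Shapes and the standalone-NumPy setting are assumptions; in the actual model this sits inside a Mamba-style state-space backbone.

```python
import numpy as np

def adain(content, style, eps=1e-5):
    """Adaptive instance normalization: rescale content features to match
    the per-channel mean/std of the style (timbre) reference.

    content, style: arrays of shape (channels, time). Illustrative shapes.
    """
    c_mu = content.mean(axis=1, keepdims=True)
    c_sigma = content.std(axis=1, keepdims=True) + eps
    s_mu = style.mean(axis=1, keepdims=True)
    s_sigma = style.std(axis=1, keepdims=True) + eps
    return s_sigma * (content - c_mu) / c_sigma + s_mu

rng = np.random.default_rng(0)
content = rng.normal(0.0, 1.0, size=(4, 256))   # source performance features
style = rng.normal(3.0, 2.0, size=(4, 256))     # one-shot timbre reference
out = adain(content, style)
```

After the transform the content features carry the style reference's channel statistics, which is what lets a single example of an unseen instrument steer generation.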

Training Action-Conditioned World Models

Learning environment dynamics from agent interactions for model-based planning

World model conditioned on agent actions to predict future states. Enables model-based planning with 3.2x the sample efficiency of model-free baselines on DMControl benchmarks.
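The planning loop behind this setup can be illustrated with a toy action-conditioned model. The linear dynamics below are a stand-in assumption; the learned model is a neural network, but the imagined-rollout structure is the same.

```python
import numpy as np

class LinearWorldModel:
    """Toy action-conditioned dynamics: s' = A @ s + B @ a.

    Stand-in for a learned model; the planning loop it supports
    (imagine trajectories, score them, pick actions) is unchanged.
    """
    def __init__(self, A, B):
        self.A, self.B = A, B

    def step(self, state, action):
        return self.A @ state + self.B @ action

    def rollout(self, state, actions):
        """Imagine a trajectory under a candidate action sequence,
        without touching the real environment."""
        states = [state]
        for a in actions:
            states.append(self.step(states[-1], a))
        return np.stack(states)

A = np.array([[1.0, 0.1],    # position integrates velocity
              [0.0, 1.0]])
B = np.array([[0.0],
              [0.1]])         # action pushes velocity
model = LinearWorldModel(A, B)
s0 = np.zeros(2)
traj = model.rollout(s0, [np.array([1.0])] * 5)
```

Sample-efficiency gains come from exactly this substitution: candidate action sequences are evaluated against imagined trajectories rather than costly environment interactions.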

Correcting a Power-of-Two Quantization Error

Found and fixed a mathematical flaw in a widely adopted rounding scheme

Identified a systematic bias in a widely used INT8 quantization formula. The fix recovers 0.4–1.2% top-1 accuracy on ImageNet without retraining. Patch adopted upstream.
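To show the class of bug involved (not the exact upstream formula, which is an assumption here): a rounding scheme like `floor(x/scale + 0.5)` shifts negative half-way values toward positive infinity, producing a systematic rather than random error on a symmetric input distribution.

```python
import numpy as np

def quantize_biased(x, scale):
    """Common but biased INT8 rounding: floor(x/scale + 0.5).

    Rounds all halves toward +inf, so symmetric inputs quantize to a
    distribution with a net positive shift. Illustrative of the bug
    class only; not the actual formula that was patched.
    """
    q = np.floor(x / scale + 0.5)
    return np.clip(q, -128, 127).astype(np.int8)

def quantize_fixed(x, scale):
    """Round-half-to-even (np.round) removes the directional bias."""
    q = np.round(x / scale)
    return np.clip(q, -128, 127).astype(np.int8)

scale = 0.25
# Symmetric inputs sitting exactly on rounding boundaries
# (power-of-two values, so x/scale is exact in floating point).
x = np.array([-0.375, -0.125, 0.125, 0.375])
biased = quantize_biased(x, scale)
fixed = quantize_fixed(x, scale)
```

The biased quantizer maps this zero-mean input to codes summing to +2, while the half-to-even version sums to 0; accumulated over a whole network, that kind of shift is what eats fractions of a point of top-1 accuracy.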

Computational Intensity of DNN Training Across GPUs

Profiling computation and memory bandwidth bottlenecks at scale

Profiled 14 model architectures across A100, H100, and RTX 4090. Identified 37% average memory bandwidth underutilization. Proposed operator fusion strategy yielding 3.4x throughput.
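The "computational intensity" in the title is arithmetic intensity in the roofline sense, which the sketch below computes for a large GEMM. The GEMM size and the approximate A100 peak numbers in the comment are illustrative assumptions, not the study's measurements.

```python
def arithmetic_intensity(flops, bytes_moved):
    """FLOPs per byte of DRAM traffic: the x-axis of a roofline plot."""
    return flops / bytes_moved

def gemm_stats(m, n, k, dtype_bytes=2):
    """FLOPs and minimum memory traffic for an (m x k) @ (k x n)
    matmul, assuming each operand and the output cross DRAM once."""
    flops = 2 * m * n * k
    bytes_moved = dtype_bytes * (m * k + k * n + m * n)
    return flops, bytes_moved

# A 4096^3 fp16 GEMM. On an A100 (roughly 312 TFLOP/s fp16 tensor,
# ~2 TB/s HBM), the ridge point sits near 156 FLOP/byte, so this kernel
# is compute-bound; many smaller ops fall left of the ridge and stall
# on bandwidth, which is what operator fusion targets.
flops, bytes_moved = gemm_stats(4096, 4096, 4096)
ai = arithmetic_intensity(flops, bytes_moved)
```

Fusing bandwidth-bound operators cuts the intermediate DRAM round trips out of `bytes_moved`, raising intensity toward the compute roof, the mechanism behind throughput gains of the kind reported above.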

Diffusion Transformers for Architectural Floor Plans

Generative layout synthesis in collaboration with Creative Machines Lab

Generative model conditioned on spatial constraints via FiLM. Generates structurally valid floor plans with 91% constraint satisfaction. Trained on 45K annotated layouts.
