Generative AI on Arm: Efficient AI Inference Course
Hands-on course with Arm University covering optimization of generative AI workloads across Arm architectures, with labs spanning mobile, cloud, and edge deployment.
Drop-in ternary linear layers for PyTorch with QAT and seamless BitOps deployment. Supports BitNet, TWN, and ParetoQ for 8x memory savings.
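As a rough illustration of what ternary quantization does to a weight vector, here is a minimal sketch of the absmean rule used by BitNet b1.58: divide by the mean absolute weight, round, and clip to {-1, 0, +1}. The function names `ternarize` and `dequantize` are hypothetical and are not the library's API. During QAT this rounding would normally be paired with a straight-through estimator, which is not shown.

```python
def ternarize(weights, eps=1e-8):
    """Quantize floats to codes in {-1, 0, +1} plus one shared scale (absmean rule)."""
    # Scale is the mean absolute value; eps guards against an all-zero tensor.
    scale = max(sum(abs(w) for w in weights) / len(weights), eps)
    # Round to the nearest integer, then clip into the ternary range.
    codes = [max(-1, min(1, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Reconstruct approximate float weights from ternary codes."""
    return [c * scale for c in codes]


# Illustrative values only, not taken from any real model.
weights = [0.9, -0.05, 0.4, -1.2, 0.02, -0.6]
codes, scale = ternarize(weights)
approx = dequantize(codes, scale)
print(codes, round(scale, 4))
```

Small weights collapse to 0 and large ones saturate at ±1, so each weight needs only one of three states plus a shared per-tensor scale.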
Gradio web app for real-time chat with 1.58-bit BitNet models. 24x speedup and 80% memory reduction on ARM M4 vs PyTorch FP32.
Optimized ternary matmul across ARM NEON, x86 AVX2, and CUDA backends. 16x memory reduction via 2-bit weight packing.
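The 16x figure follows from storing each ternary weight in 2 bits instead of a 32-bit float. A minimal pure-Python sketch of such packing is below; the bit layout (four little-endian 2-bit fields per byte, with -1, 0, +1 stored as 0, 1, 2) and the names `pack_ternary` and `unpack_ternary` are assumptions for illustration, not the layout the optimized kernels actually use.

```python
def pack_ternary(codes):
    """Pack ternary codes (-1, 0, +1) into bytes, four 2-bit fields per byte."""
    packed = bytearray((len(codes) + 3) // 4)
    for i, c in enumerate(codes):
        if c not in (-1, 0, 1):
            raise ValueError(f"not a ternary code: {c!r}")
        # Shift the code into 0..2 and place it in its 2-bit slot.
        packed[i // 4] |= (c + 1) << (2 * (i % 4))
    return bytes(packed)


def unpack_ternary(packed, n):
    """Recover the first n ternary codes; n is needed because the last byte may be padded."""
    return [((packed[i // 4] >> (2 * (i % 4))) & 0b11) - 1 for i in range(n)]


codes = [1, 0, -1, -1, 0, 1, 1, 0, -1]
packed = pack_ternary(codes)
restored = unpack_ternary(packed, len(codes))

# Size comparison for 1024 weights: FP32 uses 4 bytes each, packed uses 2 bits each.
n = 1024
fp32_bytes = 4 * n
packed_bytes = len(pack_ternary([0] * n))
print(fp32_bytes // packed_bytes)
```

Two bits can hold four states and only three are used, so this layout trades a little density for simple shift-and-mask decoding.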
Test your place-recognition skills against state-of-the-art VLMs in this interactive GeoGuessr-style game. Built on Hugging Face Spaces.
Physics-based data centre simulator for training AI agents to optimize real-world operations. Seven evaluation dimensions across thermal, power, carbon, and workload management.
Adaptive ternary-quantized ViT with runtime accuracy/compute control. 5x model compression and up to 40% operation reduction.
Ternary quantization with progressive distillation achieving 69% memory reduction and 35% lower latency for visual place recognition.
Systematic evaluation of 25+ VLMs for geolocation, revealing privacy risks. Developed mitigation techniques that reduce geolocation accuracy by 40%. Published at the AAAI Fall Symposium Series (FSS-25), 2025.
Pure-Python stereo SLAM pipeline (KITTI-compatible) with feature tracking, stereo matching, PnP/ICP motion estimation, and bundle adjustment.
Published in arXiv preprint, 2025
This paper introduces TeTRA, a ternary transformer approach that progressively quantizes Vision Transformers to achieve significant reductions in memory consumption and inference latency, while preserving or even enhancing visual place recognition performance on resource-constrained platforms.
Recommended citation: Grainge, O., Milford, M., Bodala, I., Ramchurn, S. D., & Ehsan, S. (2025). "TeTRA-VPR: A Ternary Transformer Approach for Compact Visual Place Recognition." arXiv preprint, arXiv:2503.02511. doi:10.48550/arXiv.2503.02511
Download Paper | Download Slides
Published in arXiv preprint, 2025
TAT-VPR fuses ternary weight quantization with a learned activation-sparsity gate, giving visual SLAM systems a 5× smaller model and up to 40% fewer operations while retaining state-of-the-art Recall@1.
Recommended citation: Grainge, O., Milford, M., Bodala, I., Ramchurn, S. D., & Ehsan, S. (2025). "TAT-VPR: Ternary Adaptive Transformer for Dynamic and Efficient Visual Place Recognition." arXiv preprint, arXiv:2505.16447.
Download Paper | Download Slides
Published in AAAI Fall Symposium Series (FSS-25), 2025
A comprehensive evaluation of 25 state-of-the-art VLMs on image geo-localization across four benchmark datasets, revealing that current models achieve up to 61% Recall@1km on social media-like content and raising significant privacy concerns.
Recommended citation: Grainge, O.*, Waheed, S.*, Stilgoe, J., Milford, M., & Ehsan, S. (2025). "Assessing the Geolocation Capabilities, Limitations and Societal Risks of Generative Vision-Language Models." AAAI Fall Symposium Series (FSS-25), 161-168.
Download Paper | Download Slides
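For readers unfamiliar with the Recall@1km metric quoted in the geolocation paper above, the sketch below shows one plausible way to compute it: the fraction of predictions whose great-circle distance to the true location is at most 1 km. Using the haversine formula and a mean Earth radius of 6371 km is an assumption here, not a detail confirmed by the paper, and the coordinates are illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed constant)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def recall_at_km(predictions, ground_truth, threshold_km=1.0):
    """Fraction of predictions that fall within threshold_km of the true location."""
    hits = sum(
        haversine_km(*pred, *true) <= threshold_km
        for pred, true in zip(predictions, ground_truth)
    )
    return hits / len(ground_truth)


# Illustrative coordinates: the first guess is tens of metres off, the second about 3 km off.
ground_truth = [(51.5007, -0.1246), (48.8584, 2.2945)]
predictions = [(51.5010, -0.1250), (48.8606, 2.3376)]
recall = recall_at_km(predictions, ground_truth)
print(recall)
```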