Sitemap

A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.

Pages

Posts

portfolio

VSLAM: Stereo Visual SLAM Pipeline

Pure-Python stereo SLAM pipeline (KITTI-compatible) with feature tracking, stereo matching, PnP/ICP motion estimation, and bundle adjustment.

publications

TeTRA-VPR: A Ternary Transformer Approach for Compact Visual Place Recognition

Published in arXiv preprint, 2025

This paper introduces TeTRA, a ternary transformer approach that progressively quantizes Vision Transformers to achieve significant reductions in memory consumption and inference latency, while preserving or even enhancing visual place recognition performance on resource-constrained platforms.

Recommended citation: Grainge, O., Milford, M., Bodala, I., Ramchurn, S. D., & Ehsan, S. (2025). "TeTRA-VPR: A Ternary Transformer Approach for Compact Visual Place Recognition." arXiv preprint, arXiv:2503.02511. doi:10.48550/arXiv.2503.02511
Download Paper | Download Slides
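
For readers unfamiliar with ternary weights, the idea of mapping each weight to one of {-1, 0, +1} with a shared scale can be sketched as below. This is a minimal illustration only: the absmean scaling rule and the names ternarize and dequantize are assumptions made for the example and are not taken from the TeTRA paper, and the progressive quantization schedule the paper describes is not shown.

```python
def ternarize(weights, eps=1e-8):
    """Map a list of floats to codes in {-1, 0, +1} plus one shared scale.

    Assumption: the scale is the mean absolute weight (an "absmean" rule),
    chosen here for illustration; it is not claimed to be TeTRA's rule.
    """
    scale = max(sum(abs(w) for w in weights) / len(weights), eps)
    # Round each scaled weight to the nearest integer, then clip to [-1, 1].
    codes = [max(-1, min(1, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float weights from ternary codes."""
    return [c * scale for c in codes]


# Hypothetical weights, for illustration only.
weights = [0.9, -0.05, 0.4, -1.2, 0.0, 0.3]
codes, scale = ternarize(weights)
approx = dequantize(codes, scale)
```

Each code needs fewer than two bits of storage, compared with 16 or 32 bits for a floating-point weight, which is where the memory saving of ternary models comes from.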

TAT-VPR: Ternary Adaptive Transformer for Dynamic and Efficient Visual Place Recognition

Published in arXiv preprint, 2025

TAT-VPR fuses ternary weight quantization with a learned activation-sparsity gate, giving visual SLAM systems a 5× smaller model and up to 40% fewer operations while retaining state-of-the-art Recall@1.

Recommended citation: Grainge, O., Milford, M., Bodala, I., Ramchurn, S. D., & Ehsan, S. (2025). "TAT-VPR: Ternary Adaptive Transformer for Dynamic and Efficient Visual Place Recognition." arXiv preprint, arXiv:2505.16447.
Download Paper | Download Slides

Assessing the Geolocation Capabilities, Limitations and Societal Risks of Generative Vision-Language Models

Published in AAAI Fall Symposium Series (FSS-25), 2025

A comprehensive evaluation of 25 state-of-the-art VLMs on image geo-localization across four benchmark datasets, revealing that current models achieve up to 61% Recall@1km on social media-like content and raising significant privacy concerns.

Recommended citation: Grainge, O.*, Waheed, S.*, Stilgoe, J., Milford, M., & Ehsan, S. (2025). "Assessing the Geolocation Capabilities, Limitations and Societal Risks of Generative Vision-Language Models." AAAI Fall Symposium Series (FSS-25), 161-168.
Download Paper | Download Slides
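
As a rough guide to the Recall@1km figure quoted above, the metric can be sketched as the fraction of predictions that fall within 1 km of the true location. The sketch assumes great-circle (haversine) distance on a spherical Earth; the function names and the coordinates are hypothetical and are not taken from the paper's evaluation code or data.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def recall_at_km(predictions, ground_truth, threshold_km=1.0):
    """Fraction of predictions within threshold_km of the matching ground truth."""
    hits = sum(
        haversine_km(*pred, *true) <= threshold_km
        for pred, true in zip(predictions, ground_truth)
    )
    return hits / len(ground_truth)


# Hypothetical (lat, lon) pairs, for illustration only.
ground_truth = [(51.5007, -0.1246), (48.8584, 2.2945), (40.6892, -74.0445), (35.6586, 139.7454)]
predictions = [(51.5010, -0.1250), (48.8606, 2.3376), (40.6900, -74.0450), (34.0, 135.0)]
recall = recall_at_km(predictions, ground_truth)
```

In this made-up example two of the four predictions land within 1 km, so the recall is 0.5.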

talks

teaching