
Journal of Instrument Engineering


Grounding Keypoint Descriptors into 3D-Gaussian Splatting for Visual Localization in Dynamic Indoor/Outdoor Environments

https://doi.org/10.17586/0021-3454-2025-68-9-781-791

Abstract

Robust visual localization in real-world conditions remains a challenging task, particularly in the presence of dynamic objects and transient distractors. While neural scene representations such as 3D Gaussian Splatting (3DGS) and NeRF offer compact encodings of scene geometry and appearance, their reliance on photometric consistency makes them sensitive to violations of the static-world assumption. In this work, we present a robust visual localization framework that combines 3DGS with a semantic-aware masking strategy to improve accuracy in dynamic scenes. Our approach extends GSplatLoc, a two-stage pipeline: the first stage integrates dense, lightweight keypoint descriptors from the XFeat network into the 3DGS representation, enabling efficient 2D-3D matching for coarse pose estimation. To mitigate the impact of dynamic distractors, we incorporate semantic masks generated by a classifier built on a pre-trained diffusion model, excluding inconsistent regions during 3D modeling. In the second stage, the initial pose is refined using a rendering-based photometric alignment loss. Experiments on both indoor and outdoor dynamic benchmarks demonstrate that our method outperforms the baseline in challenging dynamic environments.
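The key idea of the masking stage can be illustrated with a minimal sketch: a photometric alignment loss that is evaluated only over pixels the semantic mask labels as static, so dynamic distractors do not pull the pose refinement. This is not the authors' implementation; the L1 loss, array shapes, and function name are illustrative assumptions.

```python
import numpy as np

def masked_photometric_loss(rendered, observed, static_mask):
    """L1 photometric loss over pixels the semantic mask marks as static.

    rendered, observed: (H, W, 3) float images in [0, 1]
    static_mask: (H, W) bool, True where the pixel is static (not a distractor)
    """
    diff = np.abs(rendered - observed).sum(axis=-1)  # per-pixel L1 over channels
    return diff[static_mask].mean()

# Toy example: a dynamic object corrupts one corner of the observed image.
H = W = 8
observed = np.zeros((H, W, 3))
rendered = np.zeros((H, W, 3))       # the static scene renders as all zeros
observed[:4, :4] = 1.0               # distractor region disagrees with the render
mask = np.ones((H, W), dtype=bool)
mask[:4, :4] = False                 # semantic mask excludes the distractor

loss_masked = masked_photometric_loss(rendered, observed, mask)
loss_unmasked = masked_photometric_loss(rendered, observed, np.ones((H, W), bool))
# Masking zeroes out the distractor's contribution (0.0 vs 0.75 here), so
# gradient-based pose refinement is driven only by consistent, static regions.
```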

About the Authors

M. Mohrat
ITMO University
Russian Federation

Malik Mohrat — Post-Graduate Student; Faculty of Control Systems and Robotics

St. Petersburg



G. K. Sidorov
ITMO University
Russian Federation

Gennady K. Sidorov — Graduate Student; Faculty of Control Systems and Robotics

St. Petersburg



D. D. Gridusov
ITMO University
Russian Federation

Denis D. Gridusov — Bachelor Student; Faculty of Control Systems and Robotics

St. Petersburg

 



S. A. Kolyubin
ITMO University
Russian Federation

Sergey A. Kolyubin — Dr. Sci., Professor; ITMO University, Faculty of Control Systems and Robotics

St. Petersburg

 



References

1. Dong Z., Zhang G., Jia J., and Bao H. IEEE 12th International Conference on Computer Vision, Sep. 2009, pp. 1538–1545, DOI: 10.1109/ICCV.2009.5459273.

2. Heng L. et al. International Conference on Robotics and Automation (ICRA), May 2019, pp. 4695–4702, DOI: 10.1109/ICRA.2019.8793949.

3. Mildenhall B., Srinivasan P.P., Tancik M., Barron J.T., Ramamoorthi R., and Ng R. Commun. ACM, 2022, no. 1(65), pp. 99–106, DOI: 10.1145/3503250.

4. Kerbl B., Kopanas G., Leimkühler T., and Drettakis G. ACM Trans Graph, 2023, no. 4(42), pp. 139.

5. Sabour S., Vora S., Duckworth D., Krasin I., Fleet D.J., and Tagliasacchi A. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2023, pp. 20626–20636, http://openaccess.thecvf.com/content/CVPR2023/html/Sabour_RobustNeRF_Ignoring_Distractors_With_Robust_Losses_CVPR_2023_paper.html.

6. Tang L., Jia M., Wang Q., Phoo C.P., and Hariharan B. Adv. Neural Inf. Process. Syst., 2023, vol. 36, pp. 1363–1389.

7. Martin-Brualla R., Radwan N., Sajjadi M.S., Barron J.T., Dosovitskiy A., and Duckworth D. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2021, pp. 7210–7219, https://openaccess.thecvf.com/content/CVPR2021/html/Martin-Brualla_NeRF_in_the_Wild_Neural_Radiance_Fields_for_Unconstrained_Photo_CVPR_2021_paper.html.

8. Ren W., Zhu Z., Sun B., Chen J., Pollefeys M., and Peng S. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 8931–8940, https://openaccess.thecvf.com/content/CVPR2024/html/Ren_NeRF_On-the-go_Exploiting_Uncertainty_for_Distractor-free_NeRFs_in_the_Wild_CVPR_2024_paper.html.

9. Oquab M. et al. arXiv:2304.07193, Feb. 02, 2024, DOI: 10.48550/arXiv.2304.07193.

10. Dahmani H., Bennehar M., Piasco N., Roldão L., and Tsishkou D. Computer Vision — ECCV 2024, Lecture Notes in Computer Science, Cham, Springer Nature Switzerland, 2025, vol. 15134, pp. 325–340, DOI: 10.1007/978-3-031-73116-7_19.

11. Zhang D., Wang C., Wang W., Li P., Qin M., and Wang H. Computer Vision — ECCV 2024, Lecture Notes in Computer Science, Cham, Springer Nature Switzerland, 2025, vol. 15134, pp. 341–359, DOI: 10.1007/978-3-031-73116-7_20.

12. Wang Y., Wang J., and Qi Y. arXiv:2406.02407, Jun. 04, 2024, DOI: 10.48550/arXiv.2406.02407.

13. Zhou Q., Maximov M., Litany O., and Leal-Taixé L. Computer Vision — ECCV 2024, Lecture Notes in Computer Science, Cham, Springer Nature Switzerland, 2025, vol. 15082, pp. 108–127, DOI: 10.1007/978-3-031-72691-0_7.

14. Sabour S. et al. ACM Trans. Graph., 2025, no. 2(44), pp. 1–11, DOI: 10.1145/3727143.

15. Chen S., Li X., Wang Z., and Prisacariu V.A. Computer Vision — ECCV 2022, Lecture Notes in Computer Science, Cham, Springer Nature Switzerland, 2022, vol. 13670, pp. 1–17, DOI: 10.1007/978-3-031-20080-9_1.

16. Chen S. et al. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 20987–20996, http://openaccess.thecvf.com/content/CVPR2024/html/Chen_Neural_Refinement_for_Absolute_Pose_Regression_with_Feature_Synthesis_CVPR_2024_paper.html.

17. Yen-Chen L., Florence P., Barron J.T., Rodriguez A., Isola P., and Lin T.-Y. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), IEEE, 2021, pp. 1323–1330, https://ieeexplore.ieee.org/abstract/document/9636708/.

18. Kobayashi S., Matsumoto E., and Sitzmann V. Adv. Neural Inf. Process. Syst., 2022, vol. 35, pp. 23311–23330.

19. Tschernezki V., Laina I., Larlus D., and Vedaldi A. International Conference on 3D Vision (3DV), IEEE, 2022, pp. 443–453, https://ieeexplore.ieee.org/abstract/document/10044452/.

20. Zhao B., Yang L., Mao M., Bao H., and Cui Z. Proceedings of the AAAI Conference on Artificial Intelligence, 2024, pp. 7450–7459, https://ojs.aaai.org/index.php/AAAI/article/view/28576.

21. Sun Y. et al. arXiv:2312.09031, Mar. 20, 2024, DOI: 10.48550/arXiv.2312.09031.

22. Botashev K., Pyatov V., Ferrer G., and Lefkimmiatis S. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), IEEE, 2024, pp. 5664–5671, https://ieeexplore.ieee.org/abstract/document/10801919/.

23. DeTone D., Malisiewicz T., and Rabinovich A. Proceedings of the IEEE conference on computer vision and pattern recognition workshops, 2018, pp. 224–236, https://openaccess.thecvf.com/content_cvpr_2018_workshops/w9/html/DeTone_SuperPoint_Self-Supervised_Interest_CVPR_2018_paper.html.

24. Dusmanu M. et al. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2019, pp. 8092–8101, http://openaccess.thecvf.com/content_CVPR_2019/html/Dusmanu_D2-Net_A_Trainable_CNN_for_Joint_Description_and_Detection_of_CVPR_2019_paper.html.

25. Revaud J., De Souza C., Humenberger M., and Weinzaepfel P. Adv. Neural Inf. Process. Syst., 2019, vol. 32, https://proceedings.neurips.cc/paper/2019/hash/3198dfd0aef271d22f7bcddd6f12f5cb-Abstract.html.

26. Lindenberger P., Sarlin P.-E., and Pollefeys M. Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023, pp. 17627–17638, http://openaccess.thecvf.com/content/ICCV2023/html/Lindenberger_LightGlue_Local_Feature_Matching_at_Light_Speed_ICCV_2023_paper.html.

27. Sun J., Shen Z., Wang Y., Bao H., and Zhou X. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2021, pp. 8922–8931, http://openaccess.thecvf.com/content/CVPR2021/html/Sun_LoFTR_Detector-Free_Local_Feature_Matching_With_Transformers_CVPR_2021_paper.html.

28. Potje G., Cadar F., Araujo A., Martins R., and Nascimento E.R. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 2682–2691, http://openaccess.thecvf.com/content/CVPR2024/html/otje_XFeat_Accelerated_Features_for_Lightweight_Image_Matching_CVPR_2024_paper.html.

29. Lindenberger P., Sarlin P.-E., Larsson V., and Pollefeys M. Proceedings of the IEEE/CVF international conference on computer vision, 2021, pp. 5987–5997, http://openaccess.thecvf.com/content/ICCV2021/html/Lindenberger_Pixel-Perfect_Structure-From-Motion_With_Featuremetric_Refinement_ICCV_2021_paper.html.

30. Sidorov G., Mohrat M., Gridusov D., Rakhimov R., and Kolyubin S. arXiv:2409.16502, Mar. 20, 2025, DOI: 10.48550/arXiv.2409.16502.

31. Zhou S. et al. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 21676–21685, http://openaccess.thecvf.com/content/CVPR2024/html/Zhou_Feature_3DGS_Supercharging_3D_Gaussian_Splatting_to_Enable_Distilled_Feature_CVPR_2024_paper.html.

32. Liu H.-T.D., Williams F., Jacobson A., Fidler S., and Litany O. Special Interest Group on Computer Graphics and Interactive Techniques Conference Proceedings, Vancouver BC Canada: ACM, Aug. 2022, pp. 1–13, DOI: 10.1145/3528233.3530713.

33. Shavit Y., Ferens R., and Keller Y. Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 2733–2742, http://openaccess.thecvf.com/content/ICCV2021/html/Shavit_Learning_Multi-Scene_Absolute_Pose_Regression_With_Transformers_ICCV_2021_paper.html.



For citations:


Mohrat M., Sidorov G.K., Gridusov D.D., Kolyubin S.A. Grounding Keypoint Descriptors into 3D-Gaussian Splatting for Visual Localization in Dynamic Indoor/Outdoor Environments. Journal of Instrument Engineering. 2025;68(9):781-791. (In Russ.) https://doi.org/10.17586/0021-3454-2025-68-9-781-791



ISSN 0021-3454 (Print)
ISSN 2500-0381 (Online)