FlowMap: High-Quality Camera Poses, Intrinsics, and Depth via Gradient Descent
Contract ID: 5ndhYTmKqfcd74f0VdwsCrqRHw3g36hUxyB8jisKeZc
File Type: PDF
Entry Count: 103
Embed. Model: jina_embeddings_v2_base_en
Index Type: hnsw

This paper introduces FlowMap, an end-to-end differentiable method that solves for precise camera poses, camera intrinsics, and per-frame dense depth of a video sequence. Our method performs per-video gradient-descent minimization of a simple least-squares objective that compares the optical flow induced by depth, intrinsics, and poses against correspondences obtained via off-the-shelf optical flow and point tracking. Alongside the use of point tracks to encourage long-term geometric consistency, we introduce differentiable re-parameterizations of depth, intrinsics, and pose that are amenable to first-order optimization. We empirically show that camera parameters and dense depth recovered by our method enable photo-realistic novel view synthesis on 360° trajectories using Gaussian Splatting. Our method not only far outperforms prior gradient-descent based bundle adjustment methods, but surprisingly performs on par with COLMAP, the state-of-the-art SfM method, on the downstream task of 360° novel view synthesis, even though our method is purely gradient-descent based, fully differentiable, and presents a complete departure from conventional SfM.
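The objective described in the abstract (comparing the flow induced by depth, intrinsics, and pose against tracked correspondences) can be sketched in a few lines of NumPy. This is a minimal illustration under our own naming, not the paper's implementation: FlowMap optimizes depth, intrinsics, and poses per video by gradient descent, whereas this sketch only evaluates the loss for fixed inputs.

```python
import numpy as np

def induced_flow(depth, K, R, t):
    """Flow from frame i to frame j induced by per-pixel depth (H, W),
    intrinsics K (3, 3), and relative pose (R, t) of frame j w.r.t. frame i."""
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W, dtype=np.float64),
                       np.arange(H, dtype=np.float64))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1)   # homogeneous pixel grid
    rays = pix @ np.linalg.inv(K).T                    # back-project to camera rays
    pts = rays * depth[..., None]                      # 3D points in frame i
    pts_j = pts @ R.T + t                              # transform into frame j
    proj = pts_j @ K.T                                 # project with intrinsics
    uv_j = proj[..., :2] / proj[..., 2:3]              # perspective divide
    return uv_j - pix[..., :2]                         # flow = pixel displacement

def flow_loss(depth, K, R, t, target_flow):
    """Least-squares objective: induced flow vs. observed correspondences."""
    return np.mean((induced_flow(depth, K, R, t) - target_flow) ** 2)
```

For an identity pose the induced flow is zero everywhere, and a pure sideways translation shifts every pixel by the same amount scaled by focal length over depth, which matches the standard pinhole-camera flow equations.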

5. Campos, C., Elvira, R., Rodríguez, J.J.G., Montiel, J.M.M., Tardós, J.D.: ORB-SLAM3: An accurate open-source library for visual, visual-inertial, and multimap SLAM. IEEE Transactions on Robotics 37(6), 1874-1890 (2021)
6. Chan, E.R., Nagano, K., Chan, M.A., Bergman, A.W., Park, J.J., Levy, A., Aittala, M., De Mello, S., Karras, T., Wetzstein, G.: Generative novel view synthesis with 3D-aware diffusion models. In: Proceedings of the International Conference on 3D Vision (3DV) (2023)
7. Charatan, D., Li, S., Tagliasacchi, A., Sitzmann, V.: pixelSplat: 3D Gaussian splats from image pairs for scalable generalizable 3D reconstruction. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2023)
8. Chen, Y., Lee, G.H.: DBARF: Deep bundle-adjusting generalizable neural radiance fields. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 24-34 (June 2023)
id: 54b8c0e44ac50d9fc34ea0c31022c562 - page: 15
9. Cheng, Z., Esteves, C., Jampani, V., Kar, A., Maji, S., Makadia, A.: LU-NeRF: Scene and pose estimation by synchronizing local unposed NeRFs. arXiv preprint arXiv:2306.05410 (2023)
10. Chng, S.F., Ramasinghe, S., Sherrah, J., Lucey, S.: GARF: Gaussian activated radiance fields for high fidelity reconstruction and pose estimation. arXiv e-prints, arXiv:2204 (2022)
11. Choy, C., Dong, W., Koltun, V.: Deep global registration. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2020)
12. Choy, C.B., Gwak, J., Savarese, S., Chandraker, M.: Universal correspondence network. Advances in Neural Information Processing Systems 29 (2016)
13. Clark, R., Bloesch, M., Czarnowski, J., Leutenegger, S., Davison, A.J.: Learning to solve nonlinear least squares for monocular stereo. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 284-299 (2018)
id: 5c94a50aad868862c84f9d194e57312e - page: 15
14. Czarnowski, J., Laidlow, T., Clark, R., Davison, A.J.: DeepFactors: Real-time probabilistic dense monocular SLAM. Computing Research Repository (CoRR) (2020)
15. Deng, K., Liu, A., Zhu, J.Y., Ramanan, D.: Depth-supervised NeRF: Fewer views and faster training for free. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (June 2022)
16. DeTone, D., Malisiewicz, T., Rabinovich, A.: SuperPoint: Self-supervised interest point detection and description. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 224-236 (2018)
id: 7529361e8f80852774a2d525e0e936cc - page: 15
C. Smith and D. Charatan et al.
17. Doersch, C., Yang, Y., Vecerik, M., Gokay, D., Gupta, A., Aytar, Y., Carreira, J., Zisserman, A.: TAPIR: Tracking any point with per-frame initialization and temporal refinement. arXiv preprint arXiv:2306.08637 (2023)
18. Du, Y., Smith, C., Tewari, A., Sitzmann, V.: Learning to render novel views from wide-baseline stereo pairs. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2023)
19. Engel, J., Koltun, V., Cremers, D.: Direct sparse odometry. IEEE Transactions on Pattern Analysis and Machine Intelligence 40(3), 611-625 (2017)
20. Engel, J., Schöps, T., Cremers, D.: LSD-SLAM: Large-scale direct monocular SLAM. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 834-849. Springer (2014)
21. Fu, H., Yu, X., Li, L., Zhang, L.: CBARF: Cascaded bundle-adjusting neural radiance fields from imperfect camera poses (2023)
id: 4d8efff522de5a125cfa124e4442b08d - page: 16
How to Retrieve?
# Search

curl -X POST "https://search.dria.co/hnsw/search" \
-H "x-api-key: <YOUR_API_KEY>" \
-H "Content-Type: application/json" \
-d '{"rerank": true, "top_n": 10, "contract_id": "5ndhYTmKqfcd74f0VdwsCrqRHw3g36hUxyB8jisKeZc", "query": "What is alexanDRIA library?"}'
        
# Query

curl -X POST "https://search.dria.co/hnsw/query" \
-H "x-api-key: <YOUR_API_KEY>" \
-H "Content-Type: application/json" \
-d '{"vector": [0.123, 0.5236], "top_n": 10, "contract_id": "5ndhYTmKqfcd74f0VdwsCrqRHw3g36hUxyB8jisKeZc", "level": 2}'
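Both curl calls above can be reproduced from Python's standard library. A minimal sketch, assuming only the endpoints and JSON fields shown in the examples (the helper and variable names are ours, and `<YOUR_API_KEY>` must be replaced with a real key before a request will succeed):

```python
import json
import urllib.request

BASE_URL = "https://search.dria.co/hnsw"
CONTRACT_ID = "5ndhYTmKqfcd74f0VdwsCrqRHw3g36hUxyB8jisKeZc"

def build_request(endpoint, payload, api_key):
    """Construct the POST request for /hnsw/search or /hnsw/query (not yet sent)."""
    return urllib.request.Request(
        f"{BASE_URL}/{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )

# Text search with reranking (mirrors the first curl call):
search_req = build_request(
    "search",
    {"rerank": True, "top_n": 10, "contract_id": CONTRACT_ID,
     "query": "What is alexanDRIA library?"},
    api_key="<YOUR_API_KEY>",
)

# Raw vector query (mirrors the second curl call; a real vector must match the
# dimensionality of jina_embeddings_v2_base_en, i.e. 768, unlike this 2-d toy vector):
query_req = build_request(
    "query",
    {"vector": [0.123, 0.5236], "top_n": 10, "contract_id": CONTRACT_ID, "level": 2},
    api_key="<YOUR_API_KEY>",
)

# Sending either request (requires a valid key):
# with urllib.request.urlopen(search_req) as resp:
#     print(json.loads(resp.read()))
```

The search endpoint takes a natural-language query and can rerank results, while the query endpoint takes a pre-computed embedding vector and an HNSW traversal level.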