Why this guide?
Even state-of-the-art methods (Gaussian Splatting, NeRF) are only as good as your poses and geometry. That means: disciplined capture and sensible COLMAP[1] settings. Below is a practical, professional workflow that consistently yields sharper point clouds, stable cameras, and faster training.
1) Capture fundamentals (what actually matters)
Texture & features
SfM needs repeatable features. Add texture to bland surfaces.
- Add anchors: posters, checkerboards, ArUco/AprilTags, textured fabric, books.
- Avoid large glossy areas; break them up with matte artifacts or cross-polarization (if available).
- Ensure every major surface appears in ≥ 3 views with good parallax.
Lighting & exposure
Aim for soft, uniform light.
- Prefer diffuse (cloudy daylight; bounced/softbox). Avoid harsh spotlights and specular glare.
- Lock exposure & white balance; disable auto-ISO/HDR.
- Shoot RAW (or minimally compressed) whenever possible.
Focus & optics
- Manual focus, fixed for the entire sequence.
- Disable aggressive EIS/IBIS modes that warp geometry.
- If using wide/fisheye lenses, note the camera model choice in COLMAP (see below).
Motion planning & overlap
- Move the camera center—don’t just rotate in place.
- Overlap per view: 50–70%; keep baseline/parallax modest but non-zero.
- Typical paths: ring + elevated ring, figure-8, orbital arcs with a few top-down passes.
Exposure triangle (typical targets)
- ISO: 64–400 (lower is better).
- Shutter: ≤ 1/100 s handheld; faster for moving subjects.
- Aperture: f/5.6–f/8 for sharpness (if available).
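To sanity-check a shutter speed against your shooting pace, the motion blur in the image is roughly focal_px × speed × shutter / distance for a translating camera. A quick sketch of that estimate (the speed, distance, and focal values below are illustrative assumptions, not measurements):

```python
def motion_blur_px(focal_px: float, speed_mps: float,
                   shutter_s: float, distance_m: float) -> float:
    """Approximate image-space blur for a translating camera.

    Small-motion approximation: a scene point shifts by
    focal_px * (speed * shutter) / distance pixels during the exposure.
    """
    return focal_px * speed_mps * shutter_s / distance_m

# Slow walking pace (~0.2 m/s), 1/100 s shutter, subject 2 m away,
# ~3000 px focal length (roughly a 4K phone camera):
print(motion_blur_px(3000, 0.2, 1 / 100, 2.0))  # 3.0 px -> borderline
```

Anything much above a pixel or two of blur degrades feature matching, so either slow down, raise the shutter speed, or move further from the subject.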
Quick pre-shoot checklist
- 4K+ resolution, RAW if possible.
- Lock: WB, exposure, ISO, focus.
- Add texture anchors where needed.
- Plan a path ensuring parallax + coverage.
2) Preparing media (stills & video)
Stills
- Keep burst rates sensible; avoid dozens of near-duplicates.
Video
- Extract frames at 1–5 FPS depending on motion speed.
- Remove motion-blurred or heavily redundant frames.
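A common way to drop motion-blurred frames automatically is the variance-of-Laplacian sharpness score: blurred frames score far below their neighbors. A minimal NumPy sketch (the relative threshold is an assumption to tune per capture, and frames are assumed to be grayscale arrays loaded by whatever image library you use):

```python
import numpy as np

def sharpness(gray: np.ndarray) -> float:
    """Variance of a 4-neighbor Laplacian; higher means sharper."""
    g = gray.astype(np.float64)
    lap = (-4 * g[1:-1, 1:-1]
           + g[:-2, 1:-1] + g[2:, 1:-1]
           + g[1:-1, :-2] + g[1:-1, 2:])
    return float(lap.var())

def keep_sharp(frames, rel_threshold: float = 0.5):
    """Keep frames scoring at least rel_threshold * median sharpness."""
    scores = [sharpness(f) for f in frames]
    cutoff = rel_threshold * float(np.median(scores))
    return [f for f, s in zip(frames, scores) if s >= cutoff]
```

Filtering relative to the median, rather than an absolute cutoff, keeps the rule stable across scenes with different texture levels.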
3) COLMAP: settings that move the needle
Below are the practical switches that most affect robustness and accuracy. Start with the Balanced preset, then adapt for low texture, videos, or wide lenses.
Image import (camera model & shared intrinsics)
If your capture used fixed zoom/focus, share the same camera across images:
- `--ImageReader.single_camera 1` (enforces identical intrinsics)
- Camera model:
  - SIMPLE_RADIAL (default) for normal lenses.
  - OPENCV for wide lenses with tangential distortion.
  - OPENCV_FISHEYE for fisheye action cams.

Example (project .ini or CLI flags used by multiple commands):

```
ImageReader.camera_model = OPENCV
ImageReader.single_camera = 1
# Optional: if EXIF is missing and you know the focal in pixels:
# ImageReader.camera_params = fx,fy,cx,cy,k1,k2,p1,p2,k3
```
A) Feature extraction (`feature_extractor`)
Key controls (SIFT):
- `--SiftExtraction.max_image_size`: raise to retain more resolution and extract more features (GPU/RAM permitting).
- `--SiftExtraction.max_num_features`: cap on per-image features.
- `--SiftExtraction.peak_threshold`: lower → more keypoints (helps low texture).
- `--SiftExtraction.use_gpu 1`: use the GPU extractor.
Balanced (good light, normal texture)
```
colmap feature_extractor \
  --database_path database.db \
  --image_path images \
  --ImageReader.single_camera 1 \
  --SiftExtraction.use_gpu 1 \
  --SiftExtraction.max_image_size 3200 \
  --SiftExtraction.max_num_features 12000 \
  --SiftExtraction.peak_threshold 0.004
```
Low-texture / matte walls
```
colmap feature_extractor \
  --database_path database.db \
  --image_path images \
  --ImageReader.single_camera 1 \
  --SiftExtraction.use_gpu 1 \
  --SiftExtraction.max_image_size 4096 \
  --SiftExtraction.max_num_features 16000 \
  --SiftExtraction.peak_threshold 0.0035 \
  --SiftExtraction.edge_threshold 10
```
Tip: if keypoints cluster only at edges, reduce `peak_threshold` a bit; if you get many spurious matches, increase it slightly.
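To check whether extraction actually produced enough keypoints per image, you can query COLMAP's SQLite `database.db` directly: in the current schema, the `keypoints` table stores one row per image, with the keypoint count in its `rows` column. A minimal sketch (paths and the 2000-keypoint floor are illustrative assumptions):

```python
import sqlite3

def keypoint_counts(database_path: str) -> dict:
    """Map image name -> number of extracted keypoints."""
    con = sqlite3.connect(database_path)
    try:
        rows = con.execute(
            "SELECT images.name, keypoints.rows "
            "FROM images JOIN keypoints USING (image_id)"
        ).fetchall()
    finally:
        con.close()
    return dict(rows)

def too_few(database_path: str, minimum: int = 2000) -> list:
    """Images that may need a lower peak_threshold (or more texture)."""
    return [name for name, count in keypoint_counts(database_path).items()
            if count < minimum]
```

Images that show up here repeatedly are usually the bland, glossy, or blurred ones from the capture checklist above.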
B) Matching (pair selection + SIFT matching)
Pick a matcher:
- Exhaustive (`exhaustive_matcher`) — best for < 500–800 images.
- Sequential (`sequential_matcher`) — best for video/ordered frames (with loop detection).
- Vocab tree (`vocab_tree_matcher`) — best for thousands of images.
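The crossover between matchers is driven by pair count: exhaustive matching compares all n(n−1)/2 image pairs, while sequential matching compares each frame only with its next `overlap` neighbors, roughly n × overlap pairs. A quick sketch of that arithmetic (the example image counts are just for illustration):

```python
def exhaustive_pairs(n: int) -> int:
    """All unordered image pairs: n * (n - 1) / 2."""
    return n * (n - 1) // 2

def sequential_pairs(n: int, overlap: int) -> int:
    """Each frame matched against its next `overlap` frames."""
    return sum(min(overlap, n - 1 - i) for i in range(n))

print(exhaustive_pairs(800))      # 319600 pairs -> still feasible
print(sequential_pairs(5000, 5))  # 24985 pairs -> video scale
```

Quadratic growth is why exhaustive matching stops being practical somewhere near the 500–800 image mark, and why ordered video frames should use the sequential matcher instead.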
Enable GPU matching and guided verification:
```
# Exhaustive
colmap exhaustive_matcher \
  --database_path database.db \
  --SiftMatching.use_gpu 1 \
  --SiftMatching.guided_matching 1
```
Sequential (video) with loop closure
```
colmap sequential_matcher \
  --database_path database.db \
  --SiftMatching.use_gpu 1 \
  --SiftMatching.guided_matching 1 \
  --SequentialMatching.overlap 5 \
  --SequentialMatching.loop_detection 1 \
  --SequentialMatching.loop_detection_num_images 50 \
  --SequentialMatching.loop_detection_period 10
```
Increase `overlap` if you sparsified video frames aggressively.
C) Incremental SfM (`mapper`)
Controls that affect stability/scale drift:
- Initialization & inliers:
  - `--Mapper.init_min_num_inliers` (e.g., 200–300 for dense captures)
  - `--Mapper.abs_pose_min_num_inliers` (e.g., 30–60)
- Triangulation & filtering:
  - `--Mapper.tri_min_angle 1.0` (larger → more robust, fewer points)
  - `--Mapper.filter_max_reproj_error 4.0` (tighter → cleaner points)
- Bundle adjustment (intrinsics refinement):
  - If you locked focus/zoom:
    - `--Mapper.ba_refine_focal_length 0`
    - `--Mapper.ba_refine_principal_point 0`
    - `--Mapper.ba_refine_extra_params 0`
  - If you did not lock, or EXIF is unreliable: set the above to 1.
Balanced mapper example
```
mkdir -p sparse
colmap mapper \
  --database_path database.db \
  --image_path images \
  --output_path sparse \
  --Mapper.init_min_num_inliers 200 \
  --Mapper.abs_pose_min_num_inliers 40 \
  --Mapper.tri_min_angle 1.0 \
  --Mapper.filter_max_reproj_error 4.0 \
  --Mapper.ba_refine_focal_length 0 \
  --Mapper.ba_refine_principal_point 0 \
  --Mapper.ba_refine_extra_params 0
```
If registration stalls: run `exhaustive_matcher` again, lower `peak_threshold` a touch, and retry `mapper`.
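To tell quickly whether mapping stalled, compare the number of registered images against your input set. In an exported `images.txt`, each registered image occupies two data lines (the pose line, then its 2D points), after `#` comment lines at the top. A minimal parser sketch (assumes every registered image has a non-empty points line):

```python
def count_registered(images_txt: str) -> int:
    """Count registered images in a COLMAP images.txt export."""
    data_lines = [ln for ln in images_txt.splitlines()
                  if ln.strip() and not ln.startswith("#")]
    # Two lines per image:
    #   IMAGE_ID QW QX QY QZ TX TY TZ CAMERA_ID NAME
    #   X1 Y1 POINT3D_ID1 X2 Y2 POINT3D_ID2 ...
    return len(data_lines) // 2
```

If this count is well below the number of input images, revisit matching before blaming the mapper.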
D) Dense stage (undistort → MVS → fuse)
```
mkdir -p dense

# Undistort (export for MVS)
colmap image_undistorter \
  --image_path images \
  --input_path sparse/0 \
  --output_path dense \
  --output_type COLMAP \
  --max_image_size 2000

# PatchMatch stereo (robust depth with geometric consistency)
colmap patch_match_stereo \
  --workspace_path dense \
  --workspace_format COLMAP \
  --PatchMatchStereo.geom_consistency true

# Depth fusion → point cloud
colmap stereo_fusion \
  --workspace_path dense \
  --workspace_format COLMAP \
  --input_type geometric \
  --output_path dense/fused.ply
```
For very detailed scenes and ample VRAM, keep `max_image_size` higher in both extraction and undistortion.
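To sanity-check the fusion output without opening a viewer, you can read the vertex count straight from the PLY header, since both ASCII and binary PLY files start with a text header containing an `element vertex N` line. A small sketch (the file path is a placeholder):

```python
def ply_vertex_count(path: str) -> int:
    """Parse 'element vertex N' from a PLY header (ASCII or binary body)."""
    with open(path, "rb") as f:
        for raw in f:
            line = raw.decode("ascii", errors="replace").strip()
            if line.startswith("element vertex"):
                return int(line.split()[-1])
            if line == "end_header":
                break
    raise ValueError("no vertex element found in PLY header")

# e.g. ply_vertex_count("dense/fused.ply")
```

A suspiciously low count usually means depth maps failed geometric consistency, often from weak texture or too little parallax in the capture.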
E) Post: orientation & scale for NeRF/GS
For reproducible training and camera paths:
- Orient to gravity / Manhattan:

```
colmap model_orientation_aligner \
  --input_path sparse/0 \
  --output_path sparse_aligned \
  --method MANHATTAN
```

- Align to known axes / GPS / control points (optional):

```
colmap model_aligner \
  --input_path sparse/0 \
  --output_path sparse_aligned \
  --ref_images_path ref.txt
```
Export `cameras.txt` / `images.txt` / `points3D.txt` or `fused.ply` for downstream pipelines.
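Most NeRF/GS pipelines want camera-to-world matrices, while `images.txt` stores the world-to-camera pose as a quaternion (QW QX QY QZ) plus translation. A conversion sketch following COLMAP's documented convention (it does not apply any axis flips your target pipeline may additionally require):

```python
import numpy as np

def qvec_to_rotmat(qw: float, qx: float, qy: float, qz: float) -> np.ndarray:
    """Unit quaternion (COLMAP order: QW QX QY QZ) -> 3x3 rotation matrix."""
    return np.array([
        [1 - 2*qy*qy - 2*qz*qz, 2*qx*qy - 2*qz*qw,     2*qx*qz + 2*qy*qw],
        [2*qx*qy + 2*qz*qw,     1 - 2*qx*qx - 2*qz*qz, 2*qy*qz - 2*qx*qw],
        [2*qx*qz - 2*qy*qw,     2*qy*qz + 2*qx*qw,     1 - 2*qx*qx - 2*qy*qy],
    ])

def camera_to_world(qw, qx, qy, qz, tx, ty, tz) -> np.ndarray:
    """Invert COLMAP's world-to-camera pose [R|t] into a 4x4 c2w matrix."""
    R = qvec_to_rotmat(qw, qx, qy, qz)
    t = np.array([tx, ty, tz])
    c2w = np.eye(4)
    c2w[:3, :3] = R.T
    c2w[:3, 3] = -R.T @ t  # camera center in world coordinates
    return c2w
```

The translation column of the result is the camera center, which is also what you plot when checking that your capture path came out as planned.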
4) One-page field checklist (print me)
- Before: RAW on, WB locked, exposure locked, ISO ≤ 400, manual focus, EIS/HDR off, add anchors.
- Path: ring + elevated ring + a few top-downs; 50–70% overlap; steady pace.
- Stills: no heavy bursts. Video: extract 1–5 FPS frames.
- COLMAP: extractor 3200–4096 px; 12–16k feats; peak 0.0035–0.004; GPU on; exhaustive or sequential+loops; mapper with tight reproj (3–4 px), refine intrinsics off if you truly locked optics.
- Dense: undistort 2000–3000 px; `geom_consistency true`; fuse; align orientation.
Bottom line: lock your camera, add texture, plan motion for parallax, and use the tuned COLMAP switches above. Your Gaussian Splatting/NeRF training will converge faster, with crisper detail and far fewer headaches.
Note: To use this guide in your business, contact us. Copyright Kali Ink.
References
[1] Schönberger, J. L., & Frahm, J.-M. (2016). Structure-from-Motion revisited. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE