vismatch.utils
- vismatch.utils.disable_xformers()
Disable xformers in all loaded modules, so that models fall back to standard PyTorch attention.
This is needed on CPU because xformers only supports CUDA. Without this, models using DINOv2 (e.g. RoMa, DeDoDe-Kornia) crash on CPU when xformers happens to be installed.
- vismatch.utils.get_image_pairs_paths(inputs)
Process the input to produce a list of image-pair paths.
- Parameters:
inputs (list[Path] | Path) – input path, which can be one of: (1) two image paths, (2) a dir with two images, (3) a dir of dirs with image pairs, or (4) a txt file with two image paths per line
- Returns:
list of pairs of image paths
- Return type:
list[tuple[Path, Path]]
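The four input forms can be illustrated with a minimal standard-library sketch. `collect_pairs` and `IMAGE_EXTS` are hypothetical names, not the vismatch implementation; the sketch assumes images are recognised by a small set of file extensions and that a pair directory holds exactly two images.

```python
from pathlib import Path
from typing import Union

IMAGE_EXTS = {".jpg", ".jpeg", ".png"}  # assumed extension set, for illustration only


def _images_in(directory: Path) -> list[Path]:
    """Return image files directly inside `directory`, sorted by name."""
    return sorted(p for p in directory.iterdir() if p.suffix.lower() in IMAGE_EXTS)


def collect_pairs(inputs: Union[list, Path, str]) -> list[tuple[Path, Path]]:
    """Hypothetical stand-in covering the four input forms listed above."""
    # (1) two image paths
    if isinstance(inputs, (list, tuple)):
        if len(inputs) != 2:
            raise ValueError("expected exactly two image paths")
        return [(Path(inputs[0]), Path(inputs[1]))]

    path = Path(inputs)

    # (4) txt file with two image paths per line
    if path.is_file() and path.suffix == ".txt":
        pairs = []
        for line in path.read_text().splitlines():
            parts = line.split()
            if len(parts) == 2:
                pairs.append((Path(parts[0]), Path(parts[1])))
        return pairs

    if path.is_dir():
        images = _images_in(path)
        # (2) dir with two images
        if len(images) == 2:
            return [(images[0], images[1])]
        # (3) dir with dirs with image pairs
        pairs = []
        for sub in sorted(d for d in path.iterdir() if d.is_dir()):
            sub_images = _images_in(sub)
            if len(sub_images) == 2:
                pairs.append((sub_images[0], sub_images[1]))
        return pairs

    raise ValueError(f"unsupported input: {inputs}")
```

Every form yields the same return shape, a list of `(Path, Path)` tuples, so downstream code can iterate over pairs without caring how they were specified.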
- vismatch.utils.to_numpy(x)
Convert an item or a container of items to NumPy.
- Parameters:
x (torch.Tensor | np.ndarray | dict | list) – input
- Returns:
numpy array of input
- Return type:
np.ndarray
- vismatch.utils.to_tensor(x, device=None)
Convert to tensor and place on device
- Parameters:
x (np.ndarray | torch.Tensor) – item to convert to tensor
device (str, optional) – device to place tensor on. Defaults to None.
- Returns:
tensor with the data from x, placed on the given device
- Return type:
torch.Tensor
- vismatch.utils.to_device(data, device='cuda')
Recursively move tensors in nested data structures to device.
- Parameters:
data (Tensor | dict | list)
device (str)
- vismatch.utils.to_normalized_coords(pts, height, width)
Normalize keypoint coordinates from pixel space to [0,1]. Assumes pts are in (x, y) order in an array/tensor of shape (N, 2).
- Parameters:
pts (np.ndarray | torch.Tensor) – array of kpts, must be shape (N, 2)
height (int) – height of img
width (int) – width of img
- Returns:
kpts in normalized [0,1] coords
- Return type:
np.array
- vismatch.utils.to_px_coords(pts, height, width)
Unnormalize keypoint coordinates from [0,1] to pixel space. Assumes pts are in (x, y) order.
- Parameters:
pts (np.ndarray | torch.Tensor) – array of kpts, must be shape (N, 2)
height (int) – height of img
width (int) – width of img
- Returns:
kpts in px coords
- Return type:
np.array
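The relationship between to_normalized_coords and to_px_coords can be sketched in NumPy as a round trip. `normalize_xy` and `unnormalize_xy` are hypothetical stand-ins, and the sketch assumes x is scaled by width and y by height; the library may use a different convention (for example width - 1).

```python
import numpy as np


def normalize_xy(pts, height: int, width: int) -> np.ndarray:
    """Scale (N, 2) pixel coords in (x, y) order to [0, 1] (assumed: x / width, y / height)."""
    pts = np.asarray(pts, dtype=np.float32)
    return pts / np.array([width, height], dtype=np.float32)


def unnormalize_xy(pts, height: int, width: int) -> np.ndarray:
    """Inverse of normalize_xy: scale [0, 1] coords back to pixel space."""
    pts = np.asarray(pts, dtype=np.float32)
    return pts * np.array([width, height], dtype=np.float32)


pts_px = np.array([[320.0, 120.0], [0.0, 480.0]])  # (x, y) in a 640x480 image
pts_norm = normalize_xy(pts_px, height=480, width=640)
pts_back = unnormalize_xy(pts_norm, height=480, width=640)
```

Because the points are in (x, y) order, the scale vector is (width, height), not (height, width); swapping the two is a common source of silent errors.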
- vismatch.utils.pad_images_to_same_shape(img0, img1)
Pad two image tensors to the same spatial dimensions (right/bottom zero-padding).
- Parameters:
img0 (Tensor)
img1 (Tensor)
- Return type:
tuple[Tensor, Tensor]
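Right/bottom zero-padding can be sketched with NumPy arrays standing in for tensors. `pad_to_same_shape` is a hypothetical helper that assumes the spatial dimensions are the last two axes; it is not the library's code.

```python
import numpy as np


def pad_to_same_shape(img0: np.ndarray, img1: np.ndarray):
    """Zero-pad two (..., H, W) arrays on the right/bottom to a common (H, W)."""
    h = max(img0.shape[-2], img1.shape[-2])
    w = max(img0.shape[-1], img1.shape[-1])

    def pad(img: np.ndarray) -> np.ndarray:
        pad_h = h - img.shape[-2]
        pad_w = w - img.shape[-1]
        # Leading axes are untouched; H and W are padded only at their end.
        widths = [(0, 0)] * (img.ndim - 2) + [(0, pad_h), (0, pad_w)]
        return np.pad(img, widths, mode="constant", constant_values=0)

    return pad(img0), pad(img1)


a = np.ones((3, 4, 6))
b = np.ones((3, 5, 2))
a_pad, b_pad = pad_to_same_shape(a, b)  # both become (3, 5, 6)
```

Padding only on the right and bottom keeps the top-left origin fixed, so keypoint coordinates from the original images remain valid in the padded ones.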
- vismatch.utils.resize_to_divisible(img, divisible_by=14)
Resize to be divisible by a factor. Useful for ViT based models.
- Parameters:
img (torch.Tensor) – img as tensor, in (*, H, W) order
divisible_by (int, optional) – factor to make sure img is divisible by. Defaults to 14.
- Returns:
img tensor with divisible shape
- Return type:
torch.Tensor
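The target-shape arithmetic can be sketched on its own. `divisible_shape` is a hypothetical helper, and rounding down to the nearest multiple is an assumption; the source does not say whether the function rounds down, up, or to the nearest multiple. The default of 14 matches the patch size used by DINOv2-style ViTs.

```python
def divisible_shape(height: int, width: int, divisible_by: int = 14) -> tuple[int, int]:
    """Return (H, W) rounded down to multiples of `divisible_by` (assumed direction).

    Each side is kept to at least one multiple so the result is never zero.
    """
    new_h = max(divisible_by, (height // divisible_by) * divisible_by)
    new_w = max(divisible_by, (width // divisible_by) * divisible_by)
    return new_h, new_w


target = divisible_shape(480, 640)  # 480 -> 34 * 14, 640 -> 45 * 14
```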
- vismatch.utils.lower_config(yacs_cfg)
Convert yacs config to lower-case dict recursively.
- Parameters:
yacs_cfg (CfgNode)
- Return type:
dict
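A minimal sketch of the recursive lower-casing, using plain dicts in place of a yacs CfgNode (which is a dict subclass). `lower_keys` and the example keys are hypothetical; the sketch assumes only the keys are lower-cased and the values are left as they are.

```python
def lower_keys(cfg: dict) -> dict:
    """Return a new dict with all keys lower-cased, recursing into nested dicts."""
    return {
        key.lower(): (lower_keys(value) if isinstance(value, dict) else value)
        for key, value in cfg.items()
    }


cfg = {"MODEL": {"BACKBONE": "ResNet", "COARSE": {"D_MODEL": 256}}, "THR": 0.2}
lowered = lower_keys(cfg)
```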
- vismatch.utils.load_module(module_name, module_path)
Load module from module_path into the interpreter with the namespace given by module_name.
Note that module_path is usually the path to an __init__.py file.
- Parameters:
module_name (str) – module name (will be used to import from later, as in from module_name import my_function)
module_path (Path | str) – path to module (usually an __init__.py file)
- Return type:
None
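Loading a module from a file path under a chosen name is commonly done with importlib, as in the sketch below. `load_module_from_path` and the throwaway `demo_mod` file are hypothetical; whether vismatch uses exactly this pattern is an assumption.

```python
import importlib.util
import sys
import tempfile
from pathlib import Path


def load_module_from_path(module_name: str, module_path) -> None:
    """Import the file at `module_path` and register it as `module_name`."""
    spec = importlib.util.spec_from_file_location(module_name, str(module_path))
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load {module_name} from {module_path}")
    module = importlib.util.module_from_spec(spec)
    # Register before executing so imports of this name during execution resolve.
    sys.modules[module_name] = module
    spec.loader.exec_module(module)


# Demo: write a throwaway module, load it, then import from it by name.
with tempfile.TemporaryDirectory() as tmp:
    demo_file = Path(tmp) / "demo_mod.py"
    demo_file.write_text("def my_function():\n    return 42\n")
    load_module_from_path("demo_mod", demo_file)

from demo_mod import my_function
```

Once the module is registered in `sys.modules`, the later `from demo_mod import my_function` succeeds even though the file's directory was never on `sys.path`.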
- vismatch.utils.add_to_path(path, **_kwargs)
Add path to the front of sys.path, allowing imports from it.
Always inserts at position 0 so the most recently added directory wins. Auto-detects every package and module in path and, if any of them are already cached in sys.modules from a different vismatch third-party directory, flushes the stale entries so the next import resolves correctly. User code, stdlib, and pip packages are never touched.
- Parameters:
path (str | Path)
- Return type:
None
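The "most recently added directory wins" behaviour can be sketched as follows. `prepend_to_sys_path` and the directory names are hypothetical, and the sketch deliberately omits the flushing of stale sys.modules entries described above.

```python
import sys
from pathlib import Path


def prepend_to_sys_path(path) -> None:
    """Put `path` at position 0 of sys.path, moving it there if already present."""
    entry = str(Path(path))
    if entry in sys.path:
        sys.path.remove(entry)
    sys.path.insert(0, entry)


prepend_to_sys_path("third_party/first")
prepend_to_sys_path("third_party/second")  # now searched before "third_party/first"
```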
- vismatch.utils.get_default_device()
Get the best available device for torch: cuda, then mps (Mac), else cpu.
- Returns:
best available device as str
- Return type:
str
- vismatch.utils.flow_to_matches(flow, covisibility, num_samples=1000, min_confidence=0.0, method='probabilistic', rng=None)
Convert a dense optical flow + covisibility map to sparse keypoint matches.
- Parameters:
flow (np.ndarray) – shape (2, H, W) or (H, W, 2). Interpreted as (dx, dy) per pixel.
covisibility (np.ndarray) – shape (H, W) with confidence in [0, 1] (or any non-negative scores).
num_samples (int, optional) – max number of matches to return. Defaults to 1000.
min_confidence (float, optional) – ignore pixels with covisibility <= min_confidence. Defaults to 0.0.
method (str, optional) – sampling method, one of “probabilistic”, “topk”, or “grid”. Defaults to “probabilistic”.
rng (np.random.RandomState | np.random.Generator, optional) – for reproducibility. Defaults to None.
- Returns:
- (matches0, matches1, confidences) where:
matches0 (np.ndarray): (N, 2) source keypoints as (x, y) (float32)
matches1 (np.ndarray): (N, 2) target keypoints as (x, y) = source + flow (float32)
confidences (np.ndarray): (N,) covisibility/confidence values (float32)
- Return type:
tuple
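The conversion from dense flow to sparse matches can be illustrated with a NumPy sketch of a top-k strategy. `topk_flow_matches` is a hypothetical helper, not the library's code: it covers only a "topk"-style selection, and how it breaks ties and tells the two flow layouts apart are assumptions.

```python
import numpy as np


def topk_flow_matches(flow, covisibility, num_samples=1000, min_confidence=0.0):
    """Pick the most confident pixels and pair each with its flow target."""
    flow = np.asarray(flow, dtype=np.float32)
    # Accept (2, H, W) as well as (H, W, 2); a (2, 2, 2) input is read as (H, W, 2).
    if flow.shape[0] == 2 and flow.shape[-1] != 2:
        flow = np.transpose(flow, (1, 2, 0))
    covis = np.asarray(covisibility, dtype=np.float32)

    # Keep pixels strictly above the threshold (those <= min_confidence are ignored).
    ys, xs = np.nonzero(covis > min_confidence)
    conf = covis[ys, xs]

    # Sort by descending confidence and keep at most num_samples pixels.
    order = np.argsort(-conf, kind="stable")[:num_samples]
    ys, xs, conf = ys[order], xs[order], conf[order]

    matches0 = np.stack([xs, ys], axis=1).astype(np.float32)  # (N, 2) as (x, y)
    matches1 = matches0 + flow[ys, xs]                         # source + (dx, dy)
    return matches0, matches1, conf


flow = np.zeros((2, 3, 4), dtype=np.float32)  # (2, H=3, W=4)
flow[0] = 1.0    # dx
flow[1] = -0.5   # dy
covis = np.zeros((3, 4), dtype=np.float32)
covis[1, 2] = 0.9
covis[0, 3] = 0.4
m0, m1, c = topk_flow_matches(flow, covis, num_samples=2)
```

Array indexing is (row, column) = (y, x), while the returned keypoints are (x, y), which is why the sketch stacks `xs` before `ys`.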