Project 4: Image Mosaics

In this project, we describe how you can shoot different images and stitch them together to create a panorama:

Auto Stitched

This project was done in two parts. The first part covers how to stitch images together given a set of corresponding keypoints between two images. The second part covers how to find these corresponding keypoints automatically, enabling automatic mosaics.

Part 1: Image Warping and Mosaicing

Shoot the Pictures

The first step, of course, is to shoot some photos. The most common way is to fix the center of projection (COP) and rotate your camera while capturing photos. We choose one image as the center image and warp all other images onto its perspective. Here are some of the photos I took:

Images of the dinosaurs in the Valley Life Sciences Library.

Center Image

Images of the “Osborne” T. rex in the Valley Life Sciences Library:

Center Image

Images of the UC Berkeley Library:

Selecting Keypoints

After capturing your images, begin by selecting corresponding keypoints between two images so that you know how to align one image to the other (we’ll automate this process in Part 2).

Recovering Homographies

To begin warping one image onto another image’s perspective, we first have to align each image to the center image. We do this by recovering a homography matrix $H$ that maps one image’s perspective onto another’s. Given corresponding points $\bm{p}_1 = (x, y, 1)$ in the first image and $\bm{p}_2 = (x', y', 1)$ in the second, we seek the transformation $H$ such that

$$H \bm{p}_1 = w \bm{p}_2$$

where $H$ is a $3 \times 3$ homography matrix, and $w$ is a scalar accounting for homogeneous coordinates. Expanding $H$ as:

$$H = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & 1 \end{bmatrix}$$

we derive the following equations:

$$\begin{aligned} ax + by + c &= w x' \\ dx + ey + f &= w y' \\ gx + hy + 1 &= w \end{aligned}$$

By eliminating $w$, we express the equations as:

$$\begin{aligned} ax + by + c - g x x' - h y x' &= x' \\ dx + ey + f - g x y' - h y y' &= y' \end{aligned}$$

These equations can be rewritten as a linear system:

$$\begin{bmatrix} x & y & 1 & 0 & 0 & 0 & -x x' & -y x' \\ 0 & 0 & 0 & x & y & 1 & -x y' & -y y' \end{bmatrix} \begin{bmatrix} a \\ b \\ c \\ d \\ e \\ f \\ g \\ h \end{bmatrix} = \begin{bmatrix} x' \\ y' \end{bmatrix}$$

Given multiple point correspondences, we can stack these equations into a larger linear system $A \bm{h} = \bm{b}$, where $A$ is a $2n \times 8$ matrix of point coordinates, and $\bm{h}$ is the vector of homography parameters $[a, b, c, d, e, f, g, h]^T$. We solve for $\bm{h}$ using least squares, and reshape it into the $3 \times 3$ matrix $H$, with the bottom-right value set to 1.
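As a concrete illustration, here is a minimal NumPy sketch of that least-squares setup; the function name compute_homography and its interface are assumptions for this writeup, not the project’s actual code.

```python
import numpy as np

def compute_homography(pts1, pts2):
    """Estimate the 3x3 homography mapping pts1 -> pts2 via least squares.

    pts1, pts2: (n, 2) arrays of corresponding (x, y) points, n >= 4.
    """
    A, b = [], []
    for (x, y), (xp, yp) in zip(pts1, pts2):
        # Two rows per correspondence, matching the linear system above.
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
        b.extend([xp, yp])
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    # Solve the overdetermined system A h = b in the least-squares sense.
    h, *_ = np.linalg.lstsq(A, b, rcond=None)
    # Reshape into 3x3 with the bottom-right entry fixed to 1.
    return np.append(h, 1.0).reshape(3, 3)
```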

Warping Images

The goal of image warping is to transform the input image based on the computed homography, aligning it with a common reference frame or another image. In the warp_image function, the image is warped using the given homography matrix $H$ to a specified output_shape.

The process begins by determining the bounding box for the entire mosaic and specifying an output_shape, which defines the target size for the warped image. A grid of target pixel locations is then created to cover this shape. Using inverse warping, each point in the target image is mapped back to its corresponding location in the source image under $H$. Finally, a validity mask is created to ensure that only pixels within the bounds of the original image contribute to the warped result. Bilinear interpolation is applied to smoothly handle non-integer source coordinates, and the image is then ready to be blended into the mosaic.
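Below is a hedged sketch of what such a warp_image function could look like, using scipy.ndimage.map_coordinates for the bilinear interpolation; the actual implementation may differ in details such as how the mosaic canvas offset is handled.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def warp_image(img, H, output_shape):
    """Inverse-warp img by homography H into an image of output_shape.

    img: (h, w) or (h, w, c) float array; H maps source -> target coordinates.
    output_shape: (out_h, out_w) of the target canvas.
    Returns the warped image and a validity mask of the same spatial shape.
    """
    out_h, out_w = output_shape
    # Grid of target pixel locations in homogeneous coordinates.
    xs, ys = np.meshgrid(np.arange(out_w), np.arange(out_h))
    target = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])  # (3, N)

    # Inverse warping: map each target pixel back to the source image.
    src = np.linalg.inv(H) @ target
    src_x, src_y = src[0] / src[2], src[1] / src[2]

    # Validity mask: keep only pixels that land inside the source image.
    h, w = img.shape[:2]
    valid = (src_x >= 0) & (src_x <= w - 1) & (src_y >= 0) & (src_y <= h - 1)
    mask = valid.reshape(out_h, out_w)

    # Bilinear interpolation (order=1) of the source image, per channel.
    coords = np.stack([src_y, src_x])  # map_coordinates expects (row, col)
    if img.ndim == 2:
        warped = map_coordinates(img, coords, order=1, cval=0).reshape(out_h, out_w)
    else:
        warped = np.stack(
            [map_coordinates(img[..., c], coords, order=1, cval=0).reshape(out_h, out_w)
             for c in range(img.shape[2])], axis=-1)
    warped[~mask] = 0
    return warped, mask
```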

Blending Images into a Mosaic

Once each image is warped onto the common reference frame, the next step is to blend them into a single, cohesive panorama. Simply overlaying images would produce visible seams and sharp transitions in overlapping areas, so we apply a blending technique to achieve smooth transitions.

Using OpenCV’s cv2.distanceTransform, we compute a weight map for each image’s mask. This distance transform assigns higher weights to pixels near the image center and gradually lowers them toward the edges. By applying these weight maps to each image, overlapping areas are smoothly blended, minimizing visible seams.

The process involves creating a large canvas (mosaic) to accommodate all warped images. Each warped image is added to this canvas with its respective weight map.
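A rough sketch of this weighted blend is shown below; it assumes the warped images and their masks already share the mosaic canvas, and the helper name blend_into_mosaic is hypothetical.

```python
import numpy as np
import cv2

def blend_into_mosaic(warped_images, masks):
    """Weighted-average blend of pre-warped images sharing one canvas.

    warped_images: list of (H, W, 3) float arrays already on the mosaic canvas.
    masks: list of (H, W) boolean validity masks from warping.
    """
    acc = np.zeros_like(warped_images[0], dtype=float)
    weight_sum = np.zeros(warped_images[0].shape[:2], dtype=float)

    for img, mask in zip(warped_images, masks):
        # Distance transform: weight grows with distance from the mask edge,
        # so pixels near an image's border contribute less in overlap regions.
        weights = cv2.distanceTransform(mask.astype(np.uint8), cv2.DIST_L2, 5)
        if weights.max() > 0:
            weights /= weights.max()  # normalize to [0, 1]
        acc += img * weights[..., None]
        weight_sum += weights

    # Avoid division by zero where no image covers the canvas.
    weight_sum = np.where(weight_sum == 0, 1, weight_sum)
    return acc / weight_sum[..., None]
```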

Image Rectification

Image rectification was used as a test case to verify the functionality of the warp_image function. The objective was to transform a known planar object in an image—such as a book or poster—into a perfect rectangle using a homography. By defining corresponding points between the corners of the object and a rectangular target frame, we computed a homography matrix HH to perform this transformation.
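As a usage sketch, building on the hypothetical compute_homography and warp_image functions above (the corner coordinates, filename, and output size here are made up for illustration), rectification might look like this:

```python
import numpy as np
import matplotlib.pyplot as plt

# Load the photo containing a planar object (hypothetical filename).
img = plt.imread("book_photo.jpg") / 255.0

# Corners of the object clicked in the photo, clockwise from the top-left
# (hypothetical pixel coordinates).
src_pts = np.array([[412, 310], [905, 285], [930, 760], [395, 740]], dtype=float)

# Corners of the axis-aligned rectangle we want them to map to.
dst_pts = np.array([[0, 0], [600, 0], [600, 450], [0, 450]], dtype=float)

# Reuse the earlier sketches: estimate H and inverse-warp the image.
H = compute_homography(src_pts, dst_pts)
rectified, _ = warp_image(img, H, (450, 600))
plt.imshow(rectified); plt.axis("off"); plt.show()
```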

This was the result:

Part 2: Auto-stitching

In Part 1, we manually selected feature correspondences. Here, we implement an automated approach for stitching images based on Brown et al.'s “Multi-Image Matching using Multi-Scale Oriented Patches” paper, with some simplifications. Here is the pipeline we used to auto-stitch images together:

Detect Corner Features:

In order to find potential feature points in each image, we use the Harris detector, which captures areas of significant intensity variation (corners).
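A minimal sketch of this detection step, assuming scikit-image's corner_harris and peak_local_max (the actual project code may use different parameters), could look like:

```python
import numpy as np
from skimage.color import rgb2gray
from skimage.feature import corner_harris, peak_local_max

def get_harris_points(img_rgb, min_distance=5, threshold_rel=0.01):
    """Return (row, col) Harris interest points and the Harris response map."""
    gray = rgb2gray(img_rgb)
    # Harris response: large where intensity varies strongly in all directions.
    response = corner_harris(gray, method="eps", sigma=1)
    # Keep local maxima of the response as candidate corner points.
    coords = peak_local_max(response, min_distance=min_distance,
                            threshold_rel=threshold_rel)
    return coords, response
```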

Below is an example of the Harris interest points for the Dinosaur Image:

Select Robust Keypoints with ANMS:

As seen from the example above, there could be many potential feature points given by the Harris detector. To reduce this number of points, we use Adaptive Non-Maximal Suppression (ANMS) to select the most distinctive points. This selection ensures that keypoints are well-distributed across the image.
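Here is a sketch of ANMS under the usual formulation: each point's suppression radius is its distance to the nearest point that is sufficiently stronger (the robustness factor of 0.9 and the number of kept points are assumed values), and we keep the points with the largest radii.

```python
import numpy as np
from scipy.spatial.distance import cdist

def anms(coords, response, num_keep=500, c_robust=0.9):
    """Adaptive Non-Maximal Suppression.

    coords: (n, 2) array of (row, col) Harris points.
    response: 2D Harris strength map.
    Keeps the num_keep points with the largest suppression radius, i.e.
    points that are locally strongest over the largest neighborhood.
    """
    strengths = response[coords[:, 0], coords[:, 1]]
    dists = cdist(coords, coords)  # pairwise distances between points

    radii = np.full(len(coords), np.inf)
    for i in range(len(coords)):
        # Points that are "sufficiently stronger" than point i.
        stronger = strengths[i] < c_robust * strengths
        if np.any(stronger):
            radii[i] = dists[i, stronger].min()

    # Keep the points with the largest suppression radii.
    keep = np.argsort(-radii)[:num_keep]
    return coords[keep]
```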

Extract Feature Descriptors:

For each keypoint selected by ANMS, we generate a feature descriptor that captures the local image structure:
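A sketch of a simplified, axis-aligned MOPS-style descriptor is shown below; the 40×40 window, 8×8 patch size, and helper name are assumptions drawn from the Brown et al. paper rather than this project's exact settings.

```python
import numpy as np
from skimage.color import rgb2gray
from skimage.filters import gaussian

def extract_descriptors(img_rgb, coords, window=40, patch=8):
    """Axis-aligned MOPS-style descriptors (no rotation, as a simplification).

    For each keypoint, take a window x window region of the blurred image,
    subsample it to patch x patch, and normalize to zero mean / unit variance.
    """
    gray = gaussian(rgb2gray(img_rgb), sigma=1)
    half, step = window // 2, window // patch
    descriptors, kept = [], []
    for r, c in coords:
        # Skip keypoints whose window falls outside the image.
        if r - half < 0 or c - half < 0 or r + half > gray.shape[0] or c + half > gray.shape[1]:
            continue
        win = gray[r - half:r + half, c - half:c + half]
        small = win[::step, ::step]                            # subsample to patch x patch
        small = (small - small.mean()) / (small.std() + 1e-8)  # bias/gain normalize
        descriptors.append(small.ravel())
        kept.append((r, c))
    return np.array(descriptors), np.array(kept)
```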

Match Feature Descriptors:

We compare descriptors between images by measuring Euclidean distance, identifying pairs of matching points between images:
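A minimal matching sketch follows; it adds Lowe's ratio test on top of the Euclidean distances to reject ambiguous matches, and the 0.6 threshold is an assumed value.

```python
import numpy as np
from scipy.spatial.distance import cdist

def match_descriptors(desc1, desc2, ratio=0.6):
    """Match descriptors by Euclidean distance with Lowe's ratio test.

    Returns index pairs (i, j) where desc1[i]'s nearest neighbour desc2[j]
    is much closer than its second-nearest neighbour.
    """
    dists = cdist(desc1, desc2)  # all pairwise Euclidean distances
    matches = []
    for i in range(len(desc1)):
        order = np.argsort(dists[i])
        best, second = order[0], order[1]
        # Accept only if the best match clearly beats the runner-up.
        if dists[i, best] < ratio * dists[i, second]:
            matches.append((i, best))
    return np.array(matches)
```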

Estimate Homography with RANSAC:

Using matched points, we compute a robust homography matrix with RANSAC:
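A sketch of the RANSAC loop is given below, reusing the compute_homography sketch from Part 1; the iteration count and inlier threshold are assumed values.

```python
import numpy as np

def ransac_homography(pts1, pts2, n_iters=2000, eps=2.0):
    """Estimate a homography robustly with RANSAC.

    pts1, pts2: (n, 2) arrays of matched (x, y) points.
    Repeatedly fit H to 4 random correspondences, count inliers whose
    reprojection error is below eps pixels, and refit on the best inlier set.
    """
    rng = np.random.default_rng(0)
    best_inliers = np.zeros(len(pts1), dtype=bool)
    p1_h = np.hstack([pts1, np.ones((len(pts1), 1))])  # homogeneous source points

    for _ in range(n_iters):
        idx = rng.choice(len(pts1), 4, replace=False)
        H = compute_homography(pts1[idx], pts2[idx])   # from Part 1's sketch
        proj = p1_h @ H.T
        proj = proj[:, :2] / proj[:, 2:3]
        errors = np.linalg.norm(proj - pts2, axis=1)
        inliers = errors < eps
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers

    # Final least-squares fit on all inliers of the best model.
    return compute_homography(pts1[best_inliers], pts2[best_inliers]), best_inliers
```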

Blend and Stitch Images:

Using the homography, we apply the same warping and blending process as in Part 1 to stitch the images together.

Results

The auto-stitching pipeline allows us to automatically create image mosaics. Below are examples of auto-stitched panoramas:

Auto Stitched

Manual Keypoint selection

Coolest Thing I Learned From This Project

The coolest thing I learned from this project was how to compute homographies and warp images onto another image's perspective. This means we can trick the viewer into thinking the camera was somewhere other than where it actually was, as seen in the rectified images.