CS180 Project 2 — Filters, Hybrid Images, and Multi-Resolution Blending

By Kourosh Salahi · CS180: Computer Vision & Computational Photography

Part 1.1 — Convolution

This section implements convolution from scratch with both four nested loops and a more efficient two-loop version. Results are compared to scipy.signal.convolve2d in terms of runtime and edge handling.

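import numpy as np

def conv4loops(image, kernel):
    """2D convolution with four explicit loops (zero padding, same-size output)."""
    height, width = image.shape
    kh, kw = kernel.shape
    pad_h, pad_w = kh // 2, kw // 2
    # Zero-pad so the output keeps the input shape (assumes odd kernel sizes).
    padded = np.pad(image, ((pad_h, pad_h), (pad_w, pad_w)), mode="constant")
    # Flip the kernel in both axes; without the flip this is cross-correlation.
    flipped = kernel[::-1, ::-1]
    # Float output so integer images do not truncate the accumulated sums.
    output = np.zeros((height, width), dtype=np.float64)
    for i in range(height):
        for j in range(width):
            val = 0.0
            for m in range(kh):
                for n in range(kw):
                    val += flipped[m, n] * padded[i + m, j + n]
            output[i, j] = val
    return output

def conv2loops(image, kernel):
    """Same convolution with two loops; each kernel window is summed with NumPy."""
    height, width = image.shape
    kh, kw = kernel.shape
    pad_h, pad_w = kh // 2, kw // 2
    padded = np.pad(image, ((pad_h, pad_h), (pad_w, pad_w)), mode="constant")
    flipped = kernel[::-1, ::-1]
    output = np.zeros((height, width), dtype=np.float64)
    for i in range(height):
        for j in range(width):
            # Elementwise product of the kh-by-kw window with the flipped kernel.
            region = padded[i:i + kh, j:j + kw]
            output[i, j] = np.sum(region * flipped)
    return output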

Original image (grayscale self-portrait)

Runtime: the four-loop convolution was by far the slowest at 1 minute 4 seconds, compared with 10 seconds for the two-loop version. Both were much slower than the built-in scipy.signal.convolve2d, which took 0.3 seconds.

Boundaries: my implementations always zero-pad the image. scipy.signal.convolve2d makes the boundary handling configurable through its boundary argument ("fill" with a chosen fillvalue, "wrap", or "symm").
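One way to check a hand-written convolution against scipy is sketched below. It assumes an odd-sized kernel and a float image, and conv_same is a hypothetical stand-in written for this example rather than the code used for the timings. Zero padding should correspond to boundary="fill" with fillvalue=0, while a different boundary mode should change only the pixels near the border.

```python
import numpy as np
from scipy.signal import convolve2d

def conv_same(image, kernel):
    """Zero-padded, same-size 2D convolution (hypothetical helper, odd kernels)."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)), mode="constant")
    flipped = kernel[::-1, ::-1]  # convolution flips the kernel in both axes
    out = np.zeros(image.shape, dtype=np.float64)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * flipped)
    return out

rng = np.random.default_rng(0)
image = rng.random((32, 40))
kernel = rng.random((3, 5))  # asymmetric, so a missing flip would show up

ours = conv_same(image, kernel)
zero_pad = convolve2d(image, kernel, mode="same", boundary="fill", fillvalue=0)
mirrored = convolve2d(image, kernel, mode="same", boundary="symm")

# Zero padding should agree everywhere with the hand-written version.
max_diff = np.max(np.abs(ours - zero_pad))
# Away from the border the boundary mode has no effect.
interior_diff = np.max(np.abs(zero_pad - mirrored)[1:-1, 2:-2])
```

Using a random, asymmetric kernel is a deliberate choice here: with a symmetric kernel such as a box filter, convolution and cross-correlation give the same answer, so the comparison would not catch a missing flip.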

Part 1.2 — Derivatives & Edges

Finite difference filters are applied to the Cameraman image to extract its partial derivatives. I found that the best threshold was 0.32. When I went too high above this, I lost the thickness of the edges. When I went below it, I picked up more noise. The value 0.32 captured most of the cameraman's edges while including only a little of the background.
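A minimal sketch of this step follows, assuming the threshold is applied to the gradient magnitude of a grayscale image scaled to [0, 1]. The kernels D_x and D_y and the helper edge_map are assumptions made for illustration, since the original filters are not shown, and a useful threshold depends on both the kernel and the image scaling. A synthetic step image stands in for the Cameraman image.

```python
import numpy as np
from scipy.signal import convolve2d

# Finite difference kernels (assumed form; the original kernels are not shown).
D_x = np.array([[1.0, 0.0, -1.0]])
D_y = D_x.T

def edge_map(image, threshold):
    """Return the gradient magnitude and its binarized version."""
    gx = convolve2d(image, D_x, mode="same", boundary="symm")
    gy = convolve2d(image, D_y, mode="same", boundary="symm")
    magnitude = np.sqrt(gx ** 2 + gy ** 2)
    return magnitude, magnitude > threshold

# Synthetic stand-in image: dark left half, bright right half.
image = np.zeros((8, 8))
image[:, 4:] = 1.0
magnitude, edges = edge_map(image, threshold=0.32)
```

On this step image the only pixels above the threshold are the two columns on either side of the jump, which is the thick-edge behavior that a lower threshold preserves and a higher one erodes.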

Part 1.3 — Gaussian & DoG

Gaussian smoothing reduces noise before edge detection. Derivative of Gaussian (DoG) filters combine smoothing and differentiation into one operation, producing cleaner results compared to plain finite difference.

Observation: DoG/blurred edges are thicker, smoother, and less noisy than plain finite difference edges. The background edges are also picked up more easily. Overall, the DoG/blurred approach gives a cleaner edge image than the finite difference approach.

Verification: The two edge images (blurring first and then applying finite differences, versus convolving once with the DoG filters) are the same. I computed their difference in the code and found it to be negligible.
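The equivalence can be sketched as below. The kernel D_x, the Gaussian size and sigma, and the helper gaussian_kernel are illustrative assumptions, not the values used for the results above. Because convolution is associative, blurring and then differentiating should match a single convolution with the precomputed DoG filter, except in a thin band at the border where the intermediate image is cropped.

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(size, sigma):
    """2D Gaussian built as the outer product of a normalized 1D Gaussian."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    g /= g.sum()
    return np.outer(g, g)

D_x = np.array([[1.0, 0.0, -1.0]])       # assumed finite difference kernel
G = gaussian_kernel(size=7, sigma=1.0)   # size and sigma are illustrative

rng = np.random.default_rng(0)
image = rng.random((48, 48))

# Route 1: blur, then differentiate (two convolutions over the image).
blurred = convolve2d(image, G, mode="same", boundary="fill")
two_step = convolve2d(blurred, D_x, mode="same", boundary="fill")

# Route 2: build the DoG filter once, then convolve the image a single time.
DoG_x = convolve2d(G, D_x, mode="full")
one_step = convolve2d(image, DoG_x, mode="same", boundary="fill")

# Compare away from the border, where cropping the blurred image has no effect.
max_diff = np.max(np.abs(two_step - one_step)[5:-5, 5:-5])
```

The filter-combining convolution uses mode="full" so that no part of the small DoG kernel is cropped away before it is applied to the image.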

Part 2.1 — Unsharp Mask

Unsharp masking works in three steps. First, a blurred version of the original image is created to capture its low-frequency content. Second, this blurred version is subtracted from the original to isolate the high-frequency details, such as edges and fine textures. Finally, these high frequencies are added back to the original image, often scaled by a chosen factor, to produce a result that appears sharper and more defined to the human eye.
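These steps can be sketched as follows for a grayscale float image in [0, 1]. The function names, the Gaussian size and sigma, and the scaling factor alpha are assumptions chosen for illustration; a color image would presumably be processed one channel at a time.

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(size, sigma):
    """2D Gaussian built as the outer product of a normalized 1D Gaussian."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    g /= g.sum()
    return np.outer(g, g)

def unsharp_mask(image, alpha=1.0, size=9, sigma=2.0):
    """Sharpen by computing image + alpha * (image - blurred), clipped to [0, 1]."""
    kernel = gaussian_kernel(size, sigma)
    blurred = convolve2d(image, kernel, mode="same", boundary="symm")  # low frequencies
    high_freq = image - blurred                                        # edges, textures
    return np.clip(image + alpha * high_freq, 0.0, 1.0)

# Step edge: the contrast across the edge should increase after sharpening.
image = np.full((16, 16), 0.3)
image[:, 8:] = 0.7
sharpened = unsharp_mask(image, alpha=1.5)

# A flat image has no high frequencies, so it should come back unchanged.
flat = np.full((16, 16), 0.5)
flat_sharpened = unsharp_mask(flat, alpha=1.5)
```

The clip at the end keeps the result in a displayable range; large values of alpha push more pixels into the clipped region, which is one way over-sharpening shows up.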

Observation: Sharpening improves details but can amplify noise if over-applied. It also makes the images look more vibrant.

Key point to notice: more sharpening on the Taj Mahal image makes it not only sharper, but also more vibrant/saturated.

Part 2.2 — Hybrid Images

Hybrid images mix low frequencies from one source and high frequencies from another. The perception changes depending on viewing distance. To see the effect best: click on the hybrid image of your choice, then look at it from very close, and then look at it from very far. You should see two different images.

Observation: At close distance, high-frequency details dominate. At far distance, low-frequency structures dominate.
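A minimal sketch of the construction, assuming two aligned grayscale float images of the same shape. The helper names and both cutoff sigmas are assumptions for illustration and would need tuning per image pair.

```python
import numpy as np
from scipy.signal import convolve2d

def blur(image, sigma):
    """Gaussian blur with a kernel that spans about three sigmas on each side."""
    radius = int(np.ceil(3 * sigma))
    ax = np.arange(-radius, radius + 1)
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    g /= g.sum()
    return convolve2d(image, np.outer(g, g), mode="same", boundary="symm")

def hybrid_image(far_image, near_image, sigma_low=4.0, sigma_high=2.0):
    """Low frequencies of far_image plus high frequencies of near_image."""
    low = blur(far_image, sigma_low)                  # dominates from far away
    high = near_image - blur(near_image, sigma_high)  # dominates up close
    return low + high

# Sanity check with constant images: a constant has no high-frequency content,
# so only the far image should survive in the hybrid.
far = np.full((32, 32), 0.6)
near = np.full((32, 32), 0.2)
hybrid = hybrid_image(far, near)
```

The result can fall slightly outside [0, 1] on real images because the high-frequency layer is signed, so it usually needs clipping or rescaling before display.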

Part 2.3 & 2.4 — Multi-resolution Blending

Gaussian and Laplacian stacks allow smooth blending, avoiding harsh seams. Here I recreate the classic “Oraple” and experiment with custom blends.
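The stack-based blend can be sketched as below for grayscale float images. This is a simplified version under stated assumptions: the helper names, the number of levels, and the choice to apply the same sigma repeatedly at every level are all illustrative, not taken from the project code.

```python
import numpy as np
from scipy.signal import convolve2d

def blur(image, sigma):
    """Gaussian blur with a kernel that spans about three sigmas on each side."""
    radius = int(np.ceil(3 * sigma))
    ax = np.arange(-radius, radius + 1)
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    g /= g.sum()
    return convolve2d(image, np.outer(g, g), mode="same", boundary="symm")

def gaussian_stack(image, levels, sigma):
    """Progressively blurred copies of the image, all at full resolution."""
    stack = [image]
    for _ in range(levels - 1):
        stack.append(blur(stack[-1], sigma))
    return stack

def laplacian_stack(image, levels, sigma):
    """Band-pass layers plus the final low-pass layer; the layers sum to the image."""
    g = gaussian_stack(image, levels, sigma)
    return [g[k] - g[k + 1] for k in range(levels - 1)] + [g[-1]]

def blend(a, b, mask, levels=5, sigma=2.0):
    """Blend each frequency band with a progressively softer copy of the mask."""
    la = laplacian_stack(a, levels, sigma)
    lb = laplacian_stack(b, levels, sigma)
    gm = gaussian_stack(mask, levels, sigma)
    return sum(m * x + (1.0 - m) * y for m, x, y in zip(gm, la, lb))

rng = np.random.default_rng(0)
a = rng.random((32, 32))
b = rng.random((32, 32))
mask = np.zeros((32, 32))
mask[:, :16] = 1.0  # left half from a, right half from b

blended = blend(a, b, mask)
reconstructed = sum(laplacian_stack(a, 5, 2.0))
```

Blending the low-frequency bands with a heavily blurred mask and the high-frequency bands with a sharper one is what hides the seam: coarse structure transitions over a wide region while fine detail switches over quickly.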

Laplacian stack visualization for “Dreekend”

Observation: The golface image uses a circular mask. Notice that this is more than making the face transparent and pasting it over the golf ball: the ball blends into the face, and you can see the golf ball divots fade away as the face becomes more prominent. The face is rather light, which was my attempt to match the shades. I expect that implementing color matching would make the blend much prettier.

Observation: The Weeknd and Drake make music that is enjoyable together. Their faces are not. The way their mouths, hairlines, noses, and forehead wrinkles line up makes the image rather eerie. Notice that the sharp line above Drake is not a seam, but a remnant of the background he stands in front of.

Conclusion

This project explored convolution, edge detection, Gaussian and DoG filtering, unsharp masking, hybrid images, and multi-resolution blending. Along the way, I learned how different frequency components contribute to visual perception and how computational photography techniques can be combined to produce compelling and sometimes surprising results.

🍎 + 🍊 Thanks for checking out my project! I appreciate you taking the time to explore my work and I hope you found the results as fun and fascinating as I did while building them! -Kourosh Salahi