CS180 · Project 2

Filters, Hybrid Images, and Multi-Resolution Blending

By Kourosh Salahi · CS180: Computer Vision & Computational Photography

Part 1.1 — Convolution

This section implements convolution from scratch with both four nested loops and a more efficient two-loop version. Results are compared to scipy.signal.convolve2d in terms of runtime and edge handling.

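import numpy as np

def conv4loops(image, kernel):
    """Same-size 2D convolution with zero padding, using four explicit loops."""
    height, width = image.shape
    kh, kw = kernel.shape
    pad_h, pad_w = kh // 2, kw // 2
    padded = np.pad(image, ((pad_h, pad_h), (pad_w, pad_w)), mode="constant")
    # Flip the kernel so this is true convolution (as in scipy.signal.convolve2d),
    # not cross-correlation.
    flipped = kernel[::-1, ::-1]
    # Float output so fractional results are not truncated for integer images.
    output = np.zeros((height, width), dtype=np.float64)
    for i in range(height):
        for j in range(width):
            val = 0.0
            for m in range(kh):
                for n in range(kw):
                    val += flipped[m, n] * padded[i + m, j + n]
            output[i, j] = val
    return output

def conv2loops(image, kernel):
    """Same computation, with the two inner loops replaced by a vectorized sum."""
    height, width = image.shape
    kh, kw = kernel.shape
    pad_h, pad_w = kh // 2, kw // 2
    padded = np.pad(image, ((pad_h, pad_h), (pad_w, pad_w)), mode="constant")
    flipped = kernel[::-1, ::-1]
    output = np.zeros((height, width), dtype=np.float64)
    for i in range(height):
        for j in range(width):
            # Window of the padded image that lines up with output pixel (i, j).
            region = padded[i:i + kh, j:j + kw]
            output[i, j] = np.sum(region * flipped)
    return output
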
Original image (grayscale self-portrait)
9×9 box filter result
Finite difference Dx
Finite difference Dy

Runtime: the four-loop convolution was by far the slowest, taking 1 minute and 4 seconds, compared with 10 seconds for the two-loop version. Both were much slower than the built-in scipy.signal.convolve2d, which took 0.3 seconds.

Boundaries: my implementation always zero-pads the image, whereas scipy.signal.convolve2d lets the caller choose the boundary handling through its boundary and fillvalue parameters.
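To make the effect of the boundary choice concrete, here is a minimal sketch (not the code used for the results above). The helper name conv_same and its pad_mode argument are hypothetical; the argument is simply forwarded to np.pad. NumPy's "constant" and "symmetric" pad modes play roughly the same role as boundary="fill" and boundary="symm" in scipy.signal.convolve2d.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv_same(image, kernel, pad_mode="constant"):
    """Same-size 2D convolution; pad_mode is passed straight to np.pad."""
    kh, kw = kernel.shape
    pad_h, pad_w = kh // 2, kw // 2
    padded = np.pad(image.astype(float), ((pad_h, pad_h), (pad_w, pad_w)), mode=pad_mode)
    # All kh x kw windows of the padded image, trimmed to the input size.
    windows = sliding_window_view(padded, (kh, kw))[: image.shape[0], : image.shape[1]]
    flipped = kernel[::-1, ::-1]  # flip for true convolution
    return np.einsum("ijmn,mn->ij", windows, flipped)

image = np.ones((5, 5))
box = np.ones((3, 3)) / 9.0

zero_padded = conv_same(image, box, pad_mode="constant")     # borders darken
mirror_padded = conv_same(image, box, pad_mode="symmetric")  # borders preserved

print(zero_padded[0, 0], mirror_padded[0, 0])
```

With zero padding, a corner pixel of an all-ones image only sees four of the nine box-filter taps, which is why box-filtered images get a dark frame; mirrored padding avoids that.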

Part 1.2 — Derivatives & Edges

Finite difference filters are applied to the Cameraman image to extract its partial derivatives. I found that the best threshold was 0.32. When I went too high above this, I lost the thickness of the edges; when I went below it, I picked up more noise. A value of 0.32 captured most of the cameraman's edges while including only a little of the background.
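The pipeline described above (partial derivatives, gradient magnitude, threshold) can be sketched as follows. This is an illustration under assumptions, not the exact project code: the function name edge_map is hypothetical, the image is assumed to be a float array scaled to [0, 1], and the first row and column of each derivative are left at zero rather than zero-padded, to avoid a spurious edge along the border.

```python
import numpy as np

def edge_map(image, threshold):
    """Return (gradient magnitude, boolean edge map) from finite differences."""
    image = image.astype(float)
    dx = np.zeros_like(image)
    dy = np.zeros_like(image)
    # Horizontal and vertical finite differences, i.e. the filters [1, -1] and [1, -1]^T.
    dx[:, 1:] = image[:, 1:] - image[:, :-1]
    dy[1:, :] = image[1:, :] - image[:-1, :]
    magnitude = np.sqrt(dx ** 2 + dy ** 2)
    return magnitude, magnitude > threshold

# Toy image: dark left half, bright right half, so the only edge is one column.
toy = np.zeros((4, 6))
toy[:, 3:] = 1.0
magnitude, edges = edge_map(toy, threshold=0.32)
print(edges.astype(int))
```

Raising the threshold keeps only the strongest responses (thinner, sparser edges), while lowering it admits weaker gradients, which in a real photo are mostly noise.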

Original Cameraman
Partial derivative Dx
Partial derivative Dy
Gradient magnitude
Binarized edge map

Part 1.3 — Gaussian & DoG

Gaussian smoothing reduces noise before edge detection. Derivative of Gaussian (DoG) filters combine smoothing and differentiation into a single convolution, producing cleaner results than plain finite differences.

DoG filter (x-direction)
DoG filter (y-direction)
Edges via blur first
Edges via DoG

Observation: DoG/blurred edges are thicker, smoother, and less noisy compared to plain finite difference. The background edges are more easily picked up. I am overall able to get a cleaner, nicer edge image through the DoG/blurred image as opposed to the finite difference image.

Verification: the two edge images (blurring first and then differentiating, versus a single DoG convolution) are the same. I computed their difference in code and found it to be negligible.
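The reason the two routes agree is that convolution is associative: (image * G) * Dx equals image * (G * Dx). A minimal sketch of that check is below, assuming "full" convolution so that boundary effects do not get in the way; the helper names, the sigma value, and the random test image are all illustrative choices, not taken from the project code.

```python
import numpy as np

def conv2d_full(a, k):
    """Full 2D convolution: each kernel tap adds a shifted, scaled copy of a."""
    out = np.zeros((a.shape[0] + k.shape[0] - 1, a.shape[1] + k.shape[1] - 1))
    for m in range(k.shape[0]):
        for n in range(k.shape[1]):
            out[m:m + a.shape[0], n:n + a.shape[1]] += k[m, n] * a
    return out

def gaussian_kernel_2d(sigma):
    """Normalized 2D Gaussian built as the outer product of a 1D Gaussian."""
    radius = int(np.ceil(3 * sigma))
    x = np.arange(-radius, radius + 1)
    g = np.exp(-(x ** 2) / (2 * sigma ** 2))
    g = g / g.sum()
    return np.outer(g, g)

rng = np.random.default_rng(0)
image = rng.random((32, 32))
G = gaussian_kernel_2d(1.5)
Dx = np.array([[1.0, -1.0]])

# Route 1: blur the image, then take the finite difference.
blur_then_diff = conv2d_full(conv2d_full(image, G), Dx)
# Route 2: build the DoG filter once, then convolve a single time.
dog_x = conv2d_full(G, Dx)
dog_direct = conv2d_full(image, dog_x)

max_diff = np.abs(blur_then_diff - dog_direct).max()
print(max_diff)
```

With "same"-size outputs and zero padding, the two routes can differ slightly near the image border, so a small nonzero difference there is expected.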

Part 2.1 — Unsharp Mask

Unsharp masking works in three steps. First, blur the original image to capture its low-frequency content. Second, subtract this blurred version from the original to isolate the high-frequency details, such as edges and fine textures. Third, add these high frequencies back to the original image, scaled by a chosen factor, to produce a result that appears sharper and more defined to the human eye.
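The three steps above can be sketched in a few lines. This is a minimal illustration, assuming a grayscale float image in [0, 1]; the function names, the default sigma and alpha, and the edge-replicating padding in blur are assumptions made for the example rather than the settings used for the figures.

```python
import numpy as np

def gaussian_kernel_1d(sigma):
    radius = int(np.ceil(3 * sigma))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return k / k.sum()

def blur(image, sigma):
    """Separable Gaussian blur with edge-replicated borders."""
    k = gaussian_kernel_1d(sigma)
    r = len(k) // 2
    padded = np.pad(image.astype(float), r, mode="edge")
    rows = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, rows)

def unsharp_mask(image, sigma=2.0, alpha=1.0):
    low = blur(image, sigma)       # step 1: low frequencies
    high = image - low             # step 2: high frequencies
    sharpened = image + alpha * high  # step 3: add scaled detail back
    return np.clip(sharpened, 0.0, 1.0)

# A soft step edge: sharpening overshoots on both sides of the edge.
step = np.full((20, 20), 0.25)
step[:, 10:] = 0.75
sharp = unsharp_mask(step, sigma=2.0, alpha=1.0)
print(step.min(), step.max(), sharp.min(), sharp.max())
```

The overshoot on either side of an edge is what reads as extra sharpness, and it is also why a large alpha amplifies noise and pushes values toward the clipping limits.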

Original Taj Mahal
Blurred Taj
High frequencies
Sharpened Taj
Original Taj Mahal
Blurred Taj with more sharpening
High frequencies with more sharpening
Sharpened Taj with more sharpening
Original Dog
Blurred Dog
High frequencies
Sharpened Dog
Original Cat
Blurred Cat
High frequencies
Resharpened Cat

Observation: Sharpening improves details but can amplify noise if over-applied. It also makes the images look more vibrant.

Key point to notice: more sharpening on the Taj Mahal not only makes the image sharper, but also more vibrant and saturated.

Part 2.2 — Hybrid Images

Hybrid images mix low frequencies from one source and high frequencies from another. The perception changes depending on viewing distance. To see the effect best: click on the hybrid image of your choice, then look at it from very close, and then look at it from very far. You should see two different images.
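As a rough sketch of the construction described above: low-pass one image, high-pass the other (original minus its blur), and add the two. The function names and both cutoff sigmas are hypothetical, and the inputs are assumed to be aligned grayscale float images of the same shape in [0, 1].

```python
import numpy as np

def gaussian_kernel_1d(sigma):
    radius = int(np.ceil(3 * sigma))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return k / k.sum()

def blur(image, sigma):
    """Separable Gaussian blur with edge-replicated borders."""
    k = gaussian_kernel_1d(sigma)
    r = len(k) // 2
    padded = np.pad(image.astype(float), r, mode="edge")
    rows = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, rows)

def hybrid_image(low_source, high_source, sigma_low=4.0, sigma_high=2.0):
    low = blur(low_source, sigma_low)                     # seen from far away
    high = high_source - blur(high_source, sigma_high)    # seen up close
    return np.clip(low + high, 0.0, 1.0)

rng = np.random.default_rng(1)
far_image = rng.random((24, 24))   # stand-in for the low-frequency source
near_image = rng.random((24, 24))  # stand-in for the high-frequency source
hybrid = hybrid_image(far_image, near_image)
print(hybrid.shape, hybrid.min(), hybrid.max())
```

The two sigmas are usually tuned per image pair: a larger sigma_low removes more detail from the "far" image, and a larger sigma_high lets more of the "near" image through.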

Derek
Nutmeg
Derek + Nutmeg. Dutmeg, if you will.
LeBron
Denero
LeBron + Denero. Respective G.O.A.T.s
Gi-hun Happy
Gi-hun Mad
Gi-hun Expressions
Monkey Image 1
Monkey Image 2
Monkey Blend
Fourier Analysis of Hybrid Monkey

Observation: At close distance, high-frequency details dominate. At far distance, low-frequency structures dominate.
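A Fourier view makes this split visible. Below is a minimal sketch of a log-magnitude spectrum of the kind commonly used for such figures; the function name and the small epsilon that guards against log(0) are assumptions, and this is not necessarily how the Fourier analysis figure above was produced.

```python
import numpy as np

def log_magnitude_spectrum(image):
    """Log magnitude of the 2D FFT, shifted so zero frequency is at the center."""
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    return np.log(np.abs(spectrum) + 1e-8)

# A constant image has all of its energy at zero frequency (the center).
flat = np.ones((8, 8))
flat_spectrum = log_magnitude_spectrum(flat)
peak = np.unravel_index(np.argmax(flat_spectrum), flat_spectrum.shape)
print(peak)
```

In the shifted spectrum, low frequencies sit near the center and high frequencies toward the borders, so a low-passed image shows a bright central blob while a high-passed one has a dark center.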

Part 2.3 & 2.4 — Multi-resolution Blending

Gaussian and Laplacian stacks allow smooth blending, avoiding harsh seams. Here I recreate the classic “Oraple” and experiment with custom blends.
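The blending idea can be sketched as follows, under assumptions: stacks are built by repeatedly blurring with a fixed sigma (no downsampling), each Laplacian band is the difference of adjacent Gaussian levels with the coarsest Gaussian level kept as the last band, and the mask is smoothed with its own Gaussian stack. All function names, the number of levels, and sigma are illustrative choices, not the project's actual settings.

```python
import numpy as np

def gaussian_kernel_1d(sigma):
    radius = int(np.ceil(3 * sigma))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return k / k.sum()

def blur(image, sigma):
    """Separable Gaussian blur with edge-replicated borders."""
    k = gaussian_kernel_1d(sigma)
    r = len(k) // 2
    padded = np.pad(image.astype(float), r, mode="edge")
    rows = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, rows)

def gaussian_stack(image, levels, sigma=2.0):
    stack = [image.astype(float)]
    for _ in range(levels):
        stack.append(blur(stack[-1], sigma))
    return stack  # levels + 1 images, all at full resolution

def laplacian_stack(image, levels, sigma=2.0):
    g = gaussian_stack(image, levels, sigma)
    bands = [g[i] - g[i + 1] for i in range(levels)]
    bands.append(g[-1])  # keep the coarsest level so the bands sum to the image
    return bands

def blend(a, b, mask, levels=4, sigma=2.0):
    la = laplacian_stack(a, levels, sigma)
    lb = laplacian_stack(b, levels, sigma)
    gm = gaussian_stack(mask, levels, sigma)
    out = np.zeros_like(la[0])
    # Coarser bands are mixed with a softer mask, which hides the seam.
    for band_a, band_b, m in zip(la, lb, gm):
        out += m * band_a + (1.0 - m) * band_b
    return out

rng = np.random.default_rng(2)
left = rng.random((16, 16))
right = rng.random((16, 16))
mask = np.zeros((16, 16))
mask[:, :8] = 1.0  # vertical seam, as in the Oraple
blended = blend(left, right, mask, levels=3)
print(blended.shape)
```

Because the Laplacian bands sum back to the original image, blending an image with itself returns it unchanged, which is a handy sanity check before trying irregular masks such as the circular one used later.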

Apple source
Orange source
Oraple
Apple stack
Orange stack
Reference Fig. 3.42
Golf image
Face image
Golf + Face blend. Golface
Weeknd source
Drake source
The Dreekend
Laplacian stack visualization for “Dreekend”

Observation: The Golface image uses a circular mask. Notice that this is more than making the face semi-transparent and pasting it over the golf ball: the ball blends into the face, and you can see parts of the golf ball's divots fade away as the face becomes more prominent. The face is rather light, which was my attempt to match the shades. I would expect color matching to make the blend much prettier.

Observation: The Weeknd and Drake both make music that is enjoyable together. Their faces are not enjoyable together. The way that their mouths, hairlines, noses, and forehead wrinkles line up makes the image rather eerie. Notice that the sharp line above Drake is not a seam, but a remnant of the background he stands in front of.

Conclusion

This project explored convolution, edge detection, Gaussian and DoG filtering, unsharp masking, hybrid images, and multi-resolution blending. Along the way, I learned how different frequency components contribute to visual perception and how computational photography techniques can be combined to produce compelling and sometimes surprising results.

🍎 + 🍊 Thanks for checking out my project! I appreciate you taking the time to explore my work and I hope you found the results as fun and fascinating as I did while building them! -Kourosh Salahi