Lecture 23: Bioimaging

CBIO (CSCI) 4835/6835: Introduction to Computational Biology

Overview and Objectives

Today, we'll wrap up our module on image processing with some more in-depth examples of how to analyze biological images. By the end of this lecture, you should be able:

  • Explain convolutional filters and how they perform operations such as blurring, sharpening, and edge finding
  • Implement erosion and dilation and define them in the context of image processing
  • Provide an end-to-end example of how to segment an image (for example: count cells)

Part 1: Filters

What does "filter" mean to you?

Instagram?

Photoshop?

Both are technically "yes", as they use the same underlying principle.

A filter applies a convolution kernel to an image.

Although this sounds fancy, this is just a generalization of a very simple idea: rather than applying a specific function to one pixel, this uses the pixel's surrounding neighborhood to apply a function.

The kernel is represented by an $n$x$n$ matrix where the target pixel is in the center (so $n$ needs to be an odd number).

The output of the filter is the sum of the products of the matrix elements with the corresponding pixels they overlap.

Confused?

Here's what it looks like. The image is the bigger $5 \times 5$ matrix; the filter is the smaller $3 \times 3$ matrix:

$$ \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix} $$

Convolutional Kernels

A convolution is basically a multiplication: the smaller matrix is multiplied element-wise with the overlapped entries of the larger matrix. These products are then all summed together into a new value for the pixel at the very center of the filter.

Then the filter is moved and the process repeats. This is true for any and all filters (or kernels).

What creates the specific effect, then--edge finding, blurring, sharpening, and so on--is the specific numbers in the filter. As such, there are some common filters:

IdentityBlurEdge Detection

Wikipedia has a whole page#Details) on common convolutional filters:

Of course, you don't have to design the filter and code up the convolution yourself (though you could!). Most image processing packages have default versions of these filters included.

In [2]:
%matplotlib inline
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
from PIL import Image
from PIL import ImageFilter

img = Image.open("Lecture22/image1.png")
plt.imshow(img)
Out[2]:
<matplotlib.image.AxesImage at 0x10d51d4a8>

Some of the PIL built-in filters include:

  • BLUR
  • CONTOUR
  • DETAIL
  • EDGE_ENHANCE
  • EDGE_ENHANCE_MORE
  • EMBOSS
  • FIND_EDGES
  • SMOOTH
  • SMOOTH_MORE
  • SHARPEN

Here's BLUR:

In [4]:
import numpy as np

blurred = img.filter(ImageFilter.BLUR)
f = plt.figure(figsize = (12, 6))
f.add_subplot(1, 2, 1)
plt.imshow(np.array(img))
f.add_subplot(1, 2, 2)
plt.imshow(np.array(blurred))
Out[4]:
<matplotlib.image.AxesImage at 0x10ddff0f0>

Here's SHARPEN:

In [5]:
sharpened = img.filter(ImageFilter.SHARPEN)
f = plt.figure(figsize = (12, 6))
f.add_subplot(1, 2, 1)
plt.imshow(np.array(img))
f.add_subplot(1, 2, 2)
plt.imshow(np.array(sharpened))
Out[5]:
<matplotlib.image.AxesImage at 0x10e2e0f28>

And FIND_EDGES (more on this later):

In [7]:
edges = img.filter(ImageFilter.FIND_EDGES)
f = plt.figure(figsize = (12, 6))
f.add_subplot(1, 2, 1)
plt.imshow(np.array(img))
f.add_subplot(1, 2, 2)
plt.imshow(np.array(edges))
Out[7]:
<matplotlib.image.AxesImage at 0x10da77550>

There are other filters that don't quite abide by the "multiply every corresponding element and then sum the products" rule of convolution.

In deep learning parlance, these filters are known as "pooling" operators, because while you still have a filter that slides over the image, you choose one of the pixel values within that filter instead of computing a function over all the pixel values.

As common examples, we have

  • Mean and median filters: picking the average pixel value or median pixel value in the filter range
  • Max and min filters: picking the maximum (max-pool) or minimum (min-pool) pixel value in the filter range

Here's max-pooling:

In [12]:
max_pool = img.filter(ImageFilter.MaxFilter(5))  # This means a 5x5 filter
f = plt.figure(figsize = (12, 6))
f.add_subplot(1, 2, 1)
plt.imshow(np.array(img)) 
f.add_subplot(1, 2, 2)
plt.imshow(np.array(max_pool))
Out[12]:
<matplotlib.image.AxesImage at 0x10fd934e0>

(just to help things, let's convert to grayscale and try that again)

In [14]:
img = img.convert("L")

# Same code as before.
max_pool = img.filter(ImageFilter.MaxFilter(5))  # This means a 5x5 filter
f = plt.figure(figsize = (12, 6))
f.add_subplot(1, 2, 1)
plt.imshow(np.array(img), cmap = "gray")
f.add_subplot(1, 2, 2)
plt.imshow(np.array(max_pool), cmap = "gray")
Out[14]:
<matplotlib.image.AxesImage at 0x110805e10>

How about median pooling?

In [15]:
median_pool = img.filter(ImageFilter.MedianFilter(5))  # This means a 5x5 filter
f = plt.figure(figsize = (12, 6))
f.add_subplot(1, 2, 1)
plt.imshow(np.array(img), cmap = "gray")
f.add_subplot(1, 2, 2)
plt.imshow(np.array(median_pool), cmap = "gray")
Out[15]:
<matplotlib.image.AxesImage at 0x110917a20>

So...

What are the benefits of filtering and pooling? Why would something like a blur filter or a median filter be nice to use?

These filters clear out a lot of noise.

  • By considering neighborhoods of pixels, you're able to automatically dampen tiny flecks of fluorescence that don't correspond to anything, because they'll be surrounded by black.
  • On the other hand, filters like blur and median won't [strongly] affect large sources of fluorescence, since the neighborhoods of any given pixel will also be emitting lots of signal.

Protip: median filters are awesome for getting rid of tiny little specks of light you don't want, while maintaining sharp edges between objects (unlike Gaussian blur filters).

Part 2: Edges

Edge-finding is an important aspect of bioimaging and image processing in general.

Edges essentially denote "image derivatives" (and are essentially calculated as such!). Edges are formed by the image pixels changing suddenly.

Edges delineate objects. The question isn't necessarily where the edges of the object are, but which edges belong to the object you're interested in.

Canny Edge Detector

One of the most popular algorithms for finding edges in an image is the Canny Edge Detector.

The Canny edge detector works in three distinct phases (don't worry, you won't have to implement any of these):

1: It runs a filter (!) over the image, using the Gaussian formulation, to generate a filtered image. However, this Gaussian filter is a special variant--it actually represents the first derivative of a Gaussian. Running this filter over the image essentially computes the image derivatives at each pixel.

2: The filter generates a lot of "candidate" edges that have to be pruned down; the edges are reduced until they're only 1 pixel thick.

3: Finally, a threshold is applied: a pixel is considered part of an "edge" if its derivative (as computed in step 1) exceeds a certain value. The higher the threshold, the larger the pixel derivative has to be to be considered an edge.

In action, Canny edge detectors looks something like this:

In [17]:
import skimage.feature as feature

img = np.array(img)
canny_small = feature.canny(img, sigma = 1)
canny_large = feature.canny(img, sigma = 3)

f = plt.figure(figsize = (12, 6))
f.add_subplot(1, 3, 1)
plt.imshow(img, cmap = "gray")
f.add_subplot(1, 3, 2)
plt.imshow(canny_small, cmap = "gray")
f.add_subplot(1, 3, 3)
plt.imshow(canny_large, cmap = "gray")
Out[17]:
<matplotlib.image.AxesImage at 0x102e63be0>
  • Note how many fewer edges there are with sigma = 3 than sigma = 1.
  • By making this sigma larger, you're upweighting neighboring pixels in the Gaussian derivative filter.
  • This makes it harder for the middle pixel (the one under consideration as an edge, or not) to exceed the "threshold" from step 3 to be considered an edge.
  • The sigma argument essentially equates to the "width" of the filter--smaller width, smaller neighborhood; therefore, more representation by the middle pixel.

Erosion and Dilation

Another common noise-reducing step is some combination of erosion and dilation.

These don't find edges per se, but rather they modify the edges.

Erosion will move along objects and erode them by 1 pixel (or more).

This has the benefit of utterly wiping out objects that are very small--these are usually noise anyway.

Dilation is the inverse operation: it will move along objects, padding their edges by 1 pixel (or more).

This has the effect of smoothing object edges--jagged edges are filled in.

These two effects are often used in tandem to smooth out potentially rough images.

In [21]:
import skimage.morphology as morphology

de1 = morphology.dilation(morphology.erosion(img))
de2 = morphology.dilation(morphology.erosion(de1))

f = plt.figure(figsize = (18, 9))
f.add_subplot(1, 3, 1)
plt.imshow(img, cmap = "gray")
f.add_subplot(1, 3, 2)
plt.imshow(de1, cmap = "gray")
f.add_subplot(1, 3, 3)
plt.imshow(de2, cmap = "gray")
Out[21]:
<matplotlib.image.AxesImage at 0x113c55390>

Hmm, not an obvious effect in the full grayscale.

Let's try again, taking a page from yesterday's lecture--specifically, let's use a single channel and threshold the image.

In [29]:
import scipy.ndimage as ndimg

img = ndimg.imread("Lecture22/image1.png")
hsp = img[:, :, 1]  # Green channel - heat shock protein
hsp_bin = hsp > np.mean(hsp)  # Median threshold.
In [30]:
# Now we'll do the same code as before.
de1 = morphology.binary_dilation(morphology.binary_erosion(hsp_bin))
de2 = morphology.binary_dilation(morphology.binary_erosion(de1))

f = plt.figure(figsize = (18, 9))
f.add_subplot(1, 3, 1)
plt.imshow(hsp_bin, cmap = "gray")
f.add_subplot(1, 3, 2)
plt.imshow(de1, cmap = "gray")
f.add_subplot(1, 3, 3)
plt.imshow(de2, cmap = "gray")
Out[30]:
<matplotlib.image.AxesImage at 0x115612b38>

Skeletonization

You can go a step further. If you're not interested in the overall shape of the objects, but just want some measure of how many and where, you can use skeletonization.

This essentially performs the erosion operation over and over until each object is only 1 pixel wide.

In [31]:
skeleton = morphology.skeletonize(hsp_bin)

f = plt.figure(figsize = (12, 6))
f.add_subplot(1, 2, 1)
plt.imshow(hsp_bin, cmap = "gray")
f.add_subplot(1, 2, 2)
plt.imshow(skeleton, cmap = "gray")
Out[31]:
<matplotlib.image.AxesImage at 0x115a09668>

What went wrong, do you think? Why didn't we just get 1-2 lines per object?

We should try to eliminate all those specks!

In [37]:
import skimage.filters as filters

bin_median = filters.median(hsp_bin, morphology.square(5))
bin_median[bin_median == 255] = 1
skeleton = morphology.skeletonize(bin_median)

f = plt.figure(figsize = (18, 9))
f.add_subplot(1, 3, 1)
plt.imshow(hsp_bin, cmap = "gray")
f.add_subplot(1, 3, 2)
plt.imshow(bin_median, cmap = "gray")
f.add_subplot(1, 3, 3)
plt.imshow(skeleton, cmap = "gray")
Out[37]:
<matplotlib.image.AxesImage at 0x1145ede80>

Convex Hull

The inverse operation of skeletonization is the to find the convex hull of objects in the image. This is basically a fancy way of saying:

  • Draw the simplest $n$-gon you can to encompass the entire object.

This is a very good way of deriving masks for complex!

Let's give ourselves a bit of a head start by using that binarized median filtered image:

In [39]:
convex_hulls = morphology.convex_hull_object(bin_median)

f = plt.figure(figsize = (12, 6))
f.add_subplot(1, 2, 1)
plt.imshow(bin_median, cmap = "gray")
f.add_subplot(1, 2, 2)
plt.imshow(convex_hulls, cmap = "gray")
Out[39]:
<matplotlib.image.AxesImage at 0x1161060b8>

We can overlay the two plots to show how the convex hulls were drawn:

In [42]:
import skimage
img_float = skimage.img_as_float(convex_hulls.copy())
img_float[bin_median == 1] = 2

plt.imshow(img_float, cmap = "gray")
Out[42]:
<matplotlib.image.AxesImage at 0x1173eecc0>

Part 3: Segmentation

Ok--we've seen so far how to identify objects (kinda) in images. It'd be great to actually get their coordinates, extract them, and learn something about their shape and size and general morphology.

This is the process of segmentation: explicitly developing a mask that pulls out the objects you're interested in.

The convex hulls are a good start, but that one blob near the middle combined a bunch of the cells.

(yes, we're purposely hobbling ourselves by not using the DAPI channel; for illustration purposes)

We can employ a specific segmentation to try and break that blob apart into multiple objects, ideally correpsonding with the blobs we see in the grayscale image.

Watershed Segmentation

This starts with the idea of treating your image as a contour map, where pixel intensity indicates elevation.

From the "lowest" parts of your image, you start filling your image with water.

Where water from each "basin" meets, that's a segmentation boundary.

Continue raising the water level of your image until you have fully segmented it.

This will give you an image with boundaries between all the objects.

"Wait a minute," you start, "that looks great and all, but how do you know where to start adding the water?"

Excellent question. Ideally, we'd like to place a "basin" (or "seed") inside each object. But that requires knowing where the objects are ahead of time, which kind of obviates the need for watershed in the first place.

Well, I kinda gave it away earlier--we're looking for pixels that have values at the extremes. These are known as local maxima.

We'd like to think that each object has a "center" that is bright; ideally, it's the brightest part of the object.

This isn't always the case, but we can kind of force it using a distance map. If we compute the distance of any point inside an object to its edge, the center should be the farthest, right?

Distance Maps

Let's first see about computing the distance map for our binarized median filtered image.

In [52]:
distmap = ndimg.distance_transform_edt(bin_median)

f = plt.figure(figsize = (12, 6))
f.add_subplot(1, 2, 1)
plt.imshow(bin_median, cmap = "gray")
f.add_subplot(1, 2, 2)
plt.imshow(distmap, cmap = "gray")
Out[52]:
<matplotlib.image.AxesImage at 0x1133fd0f0>

(yep, you could probably use skeletonization to achieve the same effect!)

Now, we'll find the coordinates of the local maxima and use those as seeds for watershed.

In [64]:
maxima = feature.peak_local_max(distmap, indices = False, min_distance = 2, labels = bin_median)

plt.imshow(maxima, cmap = "gray")
Out[64]:
<matplotlib.image.AxesImage at 0x119ceb198>

Now we'll use these coordinates as our starting points for watershed!

In [68]:
markers = ndimg.label(maxima)[0]
labels = morphology.watershed(-distmap, markers, mask = bin_median)

plt.imshow(labels, cmap = "nipy_spectral_r")
Out[68]:
<matplotlib.image.AxesImage at 0x116e918d0>

It's certainly not perfect.

There are a lot of improvements that could be made, but this was just to illustrate how to get segmented regions.

Once you have these regions, you can ask a lot of interesting questions about them:

In [77]:
import skimage.measure as measure

regions = measure.regionprops(labels)
print(len(regions))  # 28 regions, each with some of the following properties:
28
In [81]:
region1 = labels.copy()
region1[region1 != regions[1].label] = 0
plt.imshow(region1, cmap = "gray")

print(regions[1].area)
7601
In [85]:
print(regions[1].centroid)
(92.791080121036714, 79.788975134850688)
In [84]:
print(regions[1].major_axis_length)
print(regions[1].minor_axis_length)
104.25411574839332
93.36852033227652
In [86]:
print(regions[1].solidity)
0.933210558625

Administrivia

  • Next week: visualization using matplotlib (finally), and Open Science
  • Final Project proposals due tomorrow at 11:59pm!
  • No office hours tomorrow, as I will be in a conference in Atlanta all day.

Additional Resources