HRMT
Tutorial for Hull Reconstruction Motion Tracking (HRMT)
HRMT is a way to extract the coordinates that describe a two-winged insect from flight videos.
If you’re somewhat familiar with the HRMT implementation and experienced with MATLAB, you can simply run the code in the file HRMT.m and see what our program is capable of doing. You must first put the code file in the same directory as the image files xy1.bmp, xy2.bmp, etc.
Here is a link to the HRMT code.
Note: This is a bare-bones, decently-commented version of the code. It runs only on a single frame of a high-speed movie, and we’ve included sample image files above. If this code seems appealing to you, then contact me, Leif Ristroph at lgr24@cornell.edu.
If you’re not familiar with HRMT, this tutorial will give you an overview of the technique. First, here’s the big picture: We want a way to go from high-speed movies of moving animals to (relatively few) coordinates that describe this motion. We don’t want to have to put markers on the animals or otherwise prepare them before gathering movies. And we want a method that is as automated as possible. Specifically, our code is custom-designed to extract the body and wing motions of flying fruit flies from videos taken with three orthogonal high-speed cameras. In the end, we have 18 coordinates that describe the insect’s configuration for each frame of the movie: 3 center-of-mass coordinates and 3 Euler orientation angles for each of the body, right wing, and left wing. We like to use these data to make 3D movies of the insects flying around, to figure out how insects do maneuvers, and to compute fluid forces for different wing motions.
If you’re not interested in this particular insect flight problem, you still might find our approach useful. Try modifying some of the steps of this process to tailor the method to your motion tracking problem. Contact us and we’d be glad to try to help.
The HRMT process is broken down in the above diagram. Let’s step through the 4 core steps: 1) image processing, 2) visual hull reconstruction, 3) dissection by clustering, and 4) coordinate extraction.
Image processing: This step takes in raw .bmp image files and outputs images that look like silhouettes. The raw images from our videos look like shadows, because we back-light each camera. Thresholding makes the insect completely black and the background completely white. We then use the bounding box of the insect to align the images. This corrects the slight (~5 pixel) misalignment among the 3 cameras.
Note: This step is very much tailored to our filming set-up. If your set-up is similar, you might just have to do simple adjustments such as changing the threshold parameters. If your set-up is very different, like non-orthogonal cameras, you will need more sophisticated photogrammetric techniques. Check out MATLAB’s camera calibration toolbox, for example.
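As a rough illustration of the thresholding and bounding-box ideas, here is a minimal sketch in plain Python rather than the MATLAB of HRMT.m. The function names, the cutoff of 128, and the tiny hand-made frame are all hypothetical, and the tutorial does not spell out exactly how the bounding box is used for alignment, so the sketch stops at finding the box.

```python
def threshold(image, cutoff=128):
    """Binary silhouette: True where a pixel is darker than the cutoff.

    The cutoff of 128 is an arbitrary placeholder, not a value from HRMT.m.
    """
    return [[pixel < cutoff for pixel in row] for row in image]


def bounding_box(mask):
    """Return (row_min, row_max, col_min, col_max) of the True pixels."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    if not rows:
        raise ValueError("no insect pixels found; try a different cutoff")
    return rows[0], rows[-1], cols[0], cols[-1]


# Made-up 5x6 grayscale frame: dark insect pixels on a bright background.
frame = [
    [255, 255, 255, 255, 255, 255],
    [255, 250,  40,  30, 255, 255],
    [255,  20,  10,  15,  35, 255],
    [255, 255,  25, 255, 255, 255],
    [255, 255, 255, 255, 255, 255],
]
silhouette = threshold(frame)
box = bounding_box(silhouette)
print(box)  # (1, 3, 1, 4)
```

One plausible use of the box, stated here only as an assumption, is to compare the box edges along the axis that two views share and shift one image by the difference.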
Visual hull reconstruction: This step takes in the silhouette images and constructs a 3D representation of the insect. In computer graphics or computer vision, this reconstruction is called a visual hull. It is the largest-volume shape that is consistent with the 3 images from the orthogonal views. You can think of the visual hull as being the shape of pastry you would get if you passed silhouette-shaped cookie-cutters through cookie dough. The actual data you end up with is simply a massive array (N x 3, with N large) of the pixel coordinates for all points in the hull. These 3D volume pixels are called voxels.
Note: Here, the more cameras, the better the reconstruction. The visual hull is always larger than the actual object because occluded or hidden regions are included in the hull. This leads to slight errors in the final coordinates also.
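The cookie-cutter picture can be sketched in a few lines: a voxel survives only if it projects inside the silhouette in every view. This is a plain-Python illustration, not the HRMT.m code; the indexing convention (sil_xy[x][y], sil_xz[x][z], sil_yz[y][z]) and the helper `project` are assumptions made for the example. The toy object also shows why the hull can be larger than the real shape.

```python
def visual_hull(sil_xy, sil_xz, sil_yz):
    """List the (x, y, z) voxels consistent with all three silhouettes.

    Assumed convention: sil_xy[x][y], sil_xz[x][z], sil_yz[y][z] are True
    where the insect blocks the light in that orthogonal view.
    """
    nx, ny, nz = len(sil_xy), len(sil_xy[0]), len(sil_xz[0])
    return [
        (x, y, z)
        for x in range(nx)
        for y in range(ny)
        if sil_xy[x][y]
        for z in range(nz)
        if sil_xz[x][z] and sil_yz[y][z]
    ]


def project(voxels, size, axes):
    """Hypothetical helper: silhouette of voxels on the plane of two axes."""
    a, b = axes
    sil = [[False] * size for _ in range(size)]
    for v in voxels:
        sil[v[a]][v[b]] = True
    return sil


# Toy object: 4 voxels in a 2x2x2 grid whose three shadows are all full.
true_object = [(0, 0, 0), (1, 1, 0), (0, 1, 1), (1, 0, 1)]
hull = visual_hull(
    project(true_object, 2, (0, 1)),
    project(true_object, 2, (0, 2)),
    project(true_object, 2, (1, 2)),
)
print(len(true_object), len(hull))  # 4 8
```

The hull contains every voxel of the object plus four extra voxels that no orthogonal view can rule out, which is the over-estimate described in the note above.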
Dissection by clustering: This step dissects the visual hull into distinct groups of voxels corresponding to the body, right wing, and left wing. We use k-means clustering to find the groups, and this routine is built into MATLAB. Clustering simply finds groups of points that are near one another.
Note: The key trick for this step is picking the right number of groups. An insect has 3 parts of interest: the body, right wing, and left wing. But searching for 3 clusters does not work well. It turns out that searching for 4 clusters finds two body segments and a single cluster for each wing. We then merge the 2 body clusters to form a single body cluster. I think this works for the fruit fly but may not work in general. Try varying the number of clusters and see how your dissection looks.
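The four-cluster trick can be sketched as follows. This is a plain-Python stand-in for MATLAB's k-means routine, with a deterministic farthest-point start so the example is repeatable. The tutorial does not say how the two body clusters are recognized; picking the two largest clusters is an assumption made here, as are the names `kmeans`, `dissect`, `wing_a`, and `wing_b` and the blocky toy "insect".

```python
from collections import Counter


def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


def kmeans(points, k, iterations=50):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    # Deterministic start: each new center is the point farthest from
    # the centers chosen so far.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = None
    for _ in range(iterations):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its old center
                centers[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels


def dissect(points):
    """Search for 4 clusters, then merge the 2 largest into the body (assumption)."""
    labels = kmeans(points, 4)
    body_ids = {lab for lab, _ in Counter(labels).most_common(2)}
    wing_ids = sorted(set(labels) - body_ids)
    parts = {"body": [p for p, lab in zip(points, labels) if lab in body_ids]}
    for name, wid in zip(("wing_a", "wing_b"), wing_ids):
        parts[name] = [p for p, lab in zip(points, labels) if lab == wid]
    return parts


# Toy hull: an elongated body along x and two thin, flat wings.
body = [(x, y, z) for x in range(12) for y in range(3) for z in range(3)]
wing_r = [(x, y, 1) for x in range(5, 8) for y in range(20, 24)]
wing_l = [(x, y, 1) for x in range(5, 8) for y in range(-23, -19)]
parts = dissect(body + wing_r + wing_l)
print({name: len(voxels) for name, voxels in parts.items()})
```

Deciding which wing cluster is the right wing and which is the left needs extra geometric information and is left out of this sketch.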

Coordinate extraction: This step takes the voxel clusters corresponding to the body and each wing and returns the associated centroid position and orientation angles for each. So, this is the step that gives the coordinates of interest. We first estimate the center-of-mass of each component by finding the centroid coordinates. To do this, simply take the mean of the voxel positions in each group. Then, we use principal component analysis (PCA) and geometric information about the insect to get the orientation angles. For the body: use PCA to find the long axis of the body and thus the yaw and pitch angles. For the roll vector of the body, we tried several things that all worked pretty poorly. If you figure something out, let us know. For each wing: use PCA to find the wing span and thus the stroke and stroke deviation angles. To find the pitch orientation of the wing, look at a chord-wise slice of the reconstructed wing and note that it looks like a parallelepiped. The wing chord is then the diagonal of this wing cross-section. We find the diagonal by finding the voxels in this section that are furthest apart.
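The centroid and PCA steps for the body can be sketched like this, again in plain Python rather than MATLAB. The leading principal axis is found by power iteration on the 3x3 covariance matrix, which is a simple substitute for a library PCA routine. The angle conventions (yaw measured in the x-y plane, pitch as elevation above it) and the rule for fixing the sign of the axis are assumptions for illustration; the tutorial does not define them.

```python
import math


def centroid(voxels):
    """Mean position of a voxel group."""
    n = len(voxels)
    return tuple(sum(axis) / n for axis in zip(*voxels))


def principal_axis(voxels, iterations=200):
    """Leading eigenvector of the voxel covariance, by power iteration.

    Starting from (1, 1, 1) fails only if the long axis is exactly
    perpendicular to that vector, which is ignored in this sketch.
    """
    c = centroid(voxels)
    centered = [tuple(v[i] - c[i] for i in range(3)) for v in voxels]
    n = len(centered)
    cov = [[sum(p[i] * p[j] for p in centered) / n for j in range(3)] for i in range(3)]
    axis = (1.0, 1.0, 1.0)
    for _ in range(iterations):
        axis = tuple(sum(cov[i][j] * axis[j] for j in range(3)) for i in range(3))
        norm = math.sqrt(sum(a * a for a in axis))
        axis = tuple(a / norm for a in axis)
    # PCA gives the axis only up to sign; pointing it toward +x is an
    # arbitrary choice standing in for real head/tail information.
    if axis[0] < 0:
        axis = tuple(-a for a in axis)
    return axis


def yaw_pitch_degrees(axis):
    """Assumed convention: yaw in the x-y plane, pitch as elevation."""
    x, y, z = axis
    yaw = math.degrees(math.atan2(y, x))
    pitch = math.degrees(math.atan2(z, math.hypot(x, y)))
    return yaw, pitch


# Toy body: a rod two voxels thick lying along the x = y diagonal.
body = [(t, t, z) for t in range(10) for z in range(2)]
yaw, pitch = yaw_pitch_degrees(principal_axis(body))
print(centroid(body), round(yaw, 3), round(pitch, 3))
```

The same two functions apply to a wing cluster, where the leading axis would correspond to the wing span.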
Note: We think that this array of techniques should work well in general, though different set-ups will require some tweaking. We like the methods of centroid-finding and PCA because they essentially average over all points in the hull, making the coordinate extraction rather insensitive to errors in the reconstruction.
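The wing-chord step from the coordinate extraction description, finding the two voxels in a chord-wise slice that are furthest apart, can be sketched with a brute-force search over all pairs. This is a plain-Python illustration with a hypothetical function name and a made-up slanted slice; how the slice is cut from the wing cluster and how the chord is converted to a pitch angle are not covered here.

```python
import math
from itertools import combinations


def wing_chord(slice_voxels):
    """Return (vector, length) of the longest diagonal of a slice.

    Checks every pair of voxels, which is fine for a thin slice but
    scales as the square of the number of voxels.
    """
    a, b = max(
        combinations(slice_voxels, 2),
        key=lambda pair: math.dist(pair[0], pair[1]),
    )
    vector = tuple(q - p for p, q in zip(a, b))
    return vector, math.dist(a, b)


# Toy chord-wise slice at x = 0: three rows of voxels, each shifted
# one step in y, giving a slanted cross-section.
slice_voxels = [(0, y, z) for z in range(3) for y in range(z, z + 5)]
chord, length = wing_chord(slice_voxels)
print(chord, round(length, 3))  # (0, 6, 2) 6.325
```

Unlike the centroid and PCA steps, this estimate rests on just two voxels, so it is likely to be more sensitive to errors in the reconstruction.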
Repeat: Run the code on all frames in the movie.
Animate, do science, etc.: You take it from here.