Dual-Shutter Optical Vibration Sensing | CVPR 2022 (Oral, Best Paper Honorable Mention)

Mark Sheinin, Dorian Chan, Matthew O'Toole, and Srinivasa Narasimhan


 



"Seeing" sound in a novel way

We propose a novel dual-shutter approach for sensing surface vibrations at high speeds (up to 63kHz), for multiple scene sources at once, in a bandwidth-efficient way (using "slow" 130FPS cameras). We demonstrate our method by capturing vibration caused by audio sources, such as:

speakers

indirect human voice

multiple instruments

tuning fork vibration modes

























5min narrated overview video





















Abstract
Visual vibrometry is a highly useful tool for remote capture of audio, as well as the physical properties of materials, human heart rate, and more. While visually-observable vibrations can be captured directly with a high-speed camera, minute imperceptible object vibrations can be optically amplified by imaging the displacement of a speckle pattern, created by shining a laser beam on the vibrating surface. In this project, we propose a novel method for sensing vibrations at high speeds (up to 63kHz), for multiple scene sources at once, using sensors rated for only 130Hz operation. Our method relies on simultaneously capturing the scene with two cameras equipped with rolling and global shutter sensors, respectively. The rolling shutter camera captures distorted speckle images that relate to the high-speed object vibrations. The global shutter camera captures undistorted reference images of the speckle pattern, helping to decode the source vibrations. We demonstrate our method by capturing vibration caused by audio sources (e.g. speakers, human voice, and musical instruments) and analyzing the vibration modes of a tuning fork.
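To make the role of the two shutters concrete, here is a minimal simulation sketch; it is not the authors' implementation. It assumes that each rolling-shutter row is exposed slightly later than the previous one, so each row sees the reference pattern shifted by the vibration at that instant. To keep it short, shifts are restricted to integer, horizontal, circular shifts, and a random pattern stands in for real speckle. The 480-row sensor, the 63,000 rows-per-second readout, and all function names are illustrative assumptions.

```python
import numpy as np

def simulate_rolling_shutter(reference, shifts):
    """Row y of the result is row y of `reference`, circularly shifted by shifts[y] pixels."""
    return np.stack([np.roll(row, int(s)) for row, s in zip(reference, shifts)])

def recover_row_shifts(reference, rolling):
    """Estimate each row's horizontal shift by circular cross-correlation (via FFT)."""
    f_ref = np.fft.fft(reference, axis=1)
    f_rol = np.fft.fft(rolling, axis=1)
    corr = np.fft.ifft(f_rol * np.conj(f_ref), axis=1).real
    peak = np.argmax(corr, axis=1)
    width = reference.shape[1]
    # Peaks past the midpoint correspond to negative (wrapped-around) shifts.
    return np.where(peak >= width // 2, peak - width, peak)

# Assumed toy parameters (not taken from the paper's hardware).
rows, width = 480, 256
row_rate_hz = 63_000   # assumption: one row is read out every 1/63000 s
tone_hz = 2_000        # vibration frequency to simulate

rng = np.random.default_rng(0)
reference = rng.standard_normal((rows, width))  # stand-in for the global-shutter speckle image
row_times = np.arange(rows) / row_rate_hz
true_shifts = np.round(5 * np.sin(2 * np.pi * tone_hz * row_times)).astype(int)

rolling = simulate_rolling_shutter(reference, true_shifts)
recovered = recover_row_shifts(reference, rolling)
```

In this toy model every frame yields one shift sample per row, which is how a camera running at about 130 frames per second can sample a vibration at tens of kHz.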

Why capture vibration optically?
You might be wondering: what is the advantage of capturing vibrations optically? Well, the selected experiments below aim to answer that question.

1) Simultaneous capture of multiple sound sources
In this experiment, we use our system to record the membrane vibrations of a set of two speakers. On the left, a single speaker plays a complex signal: a song that has both singing and instruments. We capture the speaker-membrane vibrations from afar and play back the song with high fidelity.

The right-side experiment is where things get interesting. Here, we record both speakers at the same time, while each speaker is playing a different audio file. The microphone has no choice but to record a mixture of both speakers. But our optical system can capture each speaker separately, unaffected by the other speaker.

Figure: the experimental setup (microphone and dual-shutter vibration camera) with sample global-shutter and rolling-shutter frames. Audio clips: microphone, recovered left, recovered right.

2) Sensing indirect sound vibrations
Back in 2014, Abe Davis along with his colleagues at MIT published an amazing paper titled "The Visual Microphone". In the paper, Davis et al. use a high-speed camera to capture and replay the indirect object surface vibrations caused by an adjacent sound source. In the experiment below, we qualitatively reproduce the chips bag setup from their paper, in which a chips bag is vibrating in reaction to the sound from a nearby speaker. In this experiment, Davis et al. image the chips bag with a high-speed camera while the bag is lit by a strong lamp (setup on the left). On the right is our setup, where we sense a chips bag with a single laser point.
 
Although the different experimental setups make a direct and fair comparison difficult, it is evident both auditorily and from the spectrograms that our system recovered the original audio with higher fidelity. Please refer to the paper for an in-depth discussion of this and other relevant prior methods.

[The Visual Microphone, Davis et al., SIGGRAPH '14]

our setup

BTW: our system senses vibrations in 2D!
Our system measures the 2D shifts of the imaged speckle pattern. These 2D image-plane shifts roughly correspond to tilts of the illuminated object surface. The plot below shows the recovered x- and y-axis shifts when the speaker (from above) is playing a single tone. It is interesting to observe that different frequencies yield different vibrational modes on the speaker's membrane, which can be seen in the Lissajous curve on the right.
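As a rough illustration of why a single tone traces a Lissajous curve, the sketch below models the x- and y-shifts as two sinusoids of the same frequency with a relative phase, and summarizes the resulting curve by the extents of its point cloud along its principal axes. The amplitudes, phases, sampling rate, and function names are hypothetical; this is not the analysis used in the paper.

```python
import numpy as np

def tone_shifts(freq_hz, amp_x, amp_y, phase_rad, sample_rate_hz=63_000, duration_s=0.01):
    """Synthetic x/y shifts for a single tone; phase_rad is the phase of y relative to x."""
    n = int(round(sample_rate_hz * duration_s))
    t = np.arange(n) / sample_rate_hz
    x = amp_x * np.sin(2 * np.pi * freq_hz * t)
    y = amp_y * np.sin(2 * np.pi * freq_hz * t + phase_rad)
    return x, y

def lissajous_axes(x, y):
    """Return (major, minor) RMS extents of the (x, y) curve along its principal axes."""
    cov = np.cov(np.vstack([x, y]))
    eigvals = np.linalg.eigvalsh(cov)      # ascending order
    eigvals = np.clip(eigvals, 0.0, None)  # guard against tiny negative round-off
    return float(np.sqrt(eigvals[1])), float(np.sqrt(eigvals[0]))

# In phase: the curve collapses to a line.
line_major, line_minor = lissajous_axes(*tone_shifts(1_000, 1.0, 1.0, 0.0))
# Quarter-cycle phase difference with equal amplitudes: the curve is a circle.
circ_major, circ_minor = lissajous_axes(*tone_shifts(1_000, 1.0, 1.0, np.pi / 2))
```

A change in the ratio of the two extents, or in the orientation of the major axis, from one tone to another would indicate that the surface tilts differently at those frequencies.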

Figure: Lissajous curves for different speaker vibration frequencies (axes: horizontal surface tilts, vertical surface tilts). Audio clips for the chips bag experiment: speaker audio, Davis et al., our recovery.

3) Analyzing the vibrational modes of a 426Hz tuning fork
In this experiment, we 'place' multiple sensing points on a single vibrating surface: a 426Hz tuning fork, which allows us to analyze the fork's vibrational modes. The fork has many known modes that arise depending on how it is struck (see example modes below).
 
Here, we analyze the fork's in-plane modes by observing the points' x-axis shifts.

First, we strike the fork with a rubber-tipped mallet. This relatively soft strike mainly excites the fork's fundamental mode. This can be seen in the vibrations along the x-axis, whose amplitude increases slightly from point 0 to point 2.

The spectrogram and recovered recording on the right also show that our strike excited the fundamental mode (click to hear).

Note that we simultaneously measure the y-axis shifts as well, although there's not much interesting going on there in this case.





Next, we strike the fork with a metal bar. This time the strike additionally induces the 'clang' mode, whose frequency is about 6.26× that of the fundamental. The clang mode vibrations are visible in the x-axis as a high-frequency component modulating the fundamental mode. The clang mode induces an opposite phase between points 0 and 2 since these surface points tilt in opposite directions, while point 1 is approximately stationary. Hit the play button and see if you can hear the difference between the two strikes.
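One simple way to check for the clang mode in a recovered x-axis signal is to look for a second spectral peak above the fundamental. The sketch below does this on a synthetic signal with NumPy's FFT. The 426Hz and 2585Hz values come from the mode labels on this page, while the 63kHz sampling rate, the amplitudes, and the helper name are assumptions made for illustration.

```python
import numpy as np

def strongest_frequencies(signal, sample_rate_hz, count=2):
    """Return the `count` strongest frequencies (Hz) in `signal`, sorted ascending.

    This simply picks the largest FFT bins, which is adequate here because
    both synthetic tones fall on exact bins; real data would need peak picking.
    """
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate_hz)
    strongest = np.argsort(spectrum)[-count:]
    return np.sort(freqs[strongest])

sample_rate_hz = 63_000                          # assumed sampling rate
t = np.arange(sample_rate_hz) / sample_rate_hz   # one second of samples
fundamental_hz, clang_hz = 426.0, 2585.0

# Synthetic stand-in for an x-axis shift signal after the metal-bar strike.
x_shift = np.sin(2 * np.pi * fundamental_hz * t) + 0.3 * np.sin(2 * np.pi * clang_hz * t)

peaks = strongest_frequencies(x_shift, sample_rate_hz)
```

For the soft mallet strike, the same check on a signal containing only the fundamental would show no comparable second peak.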



 

 



Striking the fork with the metal rod even harder yields an additional high-frequency in-plane mode as seen in point 1’s x-axis vibrations below. 


As mentioned above, the sensing point positions in this experiment allow us to measure the fork's in-plane vibrations. Out-of-plane vibrations can be analyzed by measuring the fork's arm from above.

Figure: example fork modes. In-plane modes: fundamental (426Hz) and "clang" (2585Hz). Out-of-plane modes: 457Hz and 537Hz. Fork animations source: Daniel A. Russell.

4) Remote recording of musical instruments
Our system can be used to remotely record musical instruments (e.g. guitars, violins and more). Recording is done by pointing a laser beam at the instrument's vibrating surface.

In this experiment, we record an acoustic guitar and a violin being played by a musician. Note that our approach can handle the natural motions of the instrument as it is being played. These natural motions can be seen in the plots below as low-frequency shifts spanning thousands of pixels. Conversely, the instruments' vibrations, spanning only a few pixels, can be seen as a high-frequency signal that modulates the macro motions (red lines in the inset). Click on the figure below to hear the microphone recordings and the recovered audio.
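The separation of slow instrument motion from fast vibration can be illustrated with a simple high-pass filter. This is only a sketch under idealized assumptions, not the processing used in the paper: the macro motion is modeled as a slow sinusoid, the vibration as a 440Hz tone, and the 20Hz cutoff, the amplitudes, the 63kHz sampling rate, and the function name are all chosen for illustration.

```python
import numpy as np

def highpass_fft(signal, sample_rate_hz, cutoff_hz):
    """Zero all FFT bins below `cutoff_hz` and return the filtered signal."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate_hz)
    spectrum[freqs < cutoff_hz] = 0.0
    return np.fft.irfft(spectrum, n=len(signal))

sample_rate_hz = 63_000                               # assumed sampling rate
t = np.arange(sample_rate_hz) / sample_rate_hz        # one second of samples
macro_motion = 1000.0 * np.sin(2 * np.pi * 2.0 * t)   # slow, large drift (pixels)
vibration = 2.0 * np.sin(2 * np.pi * 440.0 * t)       # fast, small vibration (pixels)
measured_shift = macro_motion + vibration

recovered_vibration = highpass_fft(measured_shift, sample_rate_hz, cutoff_hz=20.0)
```

Real instrument motion is not periodic, so zeroing FFT bins would leave edge artifacts; a properly designed filter would likely be a better choice on real recordings.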


















 
 
In the next experiment, we again leverage the power of optical vibration sensing by recording two guitars simultaneously. As in the speaker experiment, the microphone has no choice but to record a mixture of both signals, while our system records each instrument separately, giving us unmixed recordings.

BibTeX

@inproceedings{Sheinin:2022:Vibration,
  title={Dual-Shutter Optical Vibration Sensing},

  author={Sheinin, Mark and Chan, Dorian and O'Toole, Matthew and Narasimhan, Srinivasa G.},
  booktitle={Proc. IEEE CVPR},
  year={2022},
}

 

Acknowledgments

We thank Aswin Sankaranarayanan for advice on building the optical system, Jeremy Smerd for playing the guitar and violin in the musical experiments, and Dinesh Reddy and Tianyuan Zhang for help with experiments. This work was supported in part by NSF Grants IIS-1900821 and CCF-1730147.

 

Copyright © 2022 Mark Sheinin