Study On The Development Of An Automatic 3D Scanning Microphone Array
ABSTRACT
Microphone phased arrays, commonly known as acoustic cameras, are used by industries such as the aeronautical, automobile and construction industries to measure the magnitude and location of noise sources. These acoustic images are commonly obtained using beamforming and deconvolution algorithms. Traditionally, these acoustic images use 2D imaging planes. However, this can lead to errors in the nearfield for 3D objects due to incorrect beamform focusing. This paper outlines the development of a 3D scanning microphone array that automatically corrects for 3D objects and provides the correct acoustic imaging focus. New techniques for processing these 3D images are also described.
Introduction
The acoustic camera, or microphone array, was developed to find the location of sound sources. Acoustic cameras differ from optical cameras in that, instead of a 2-D array of photosensors, they use an array of microphones to listen. Acoustic camera technology is also becoming essential for underwater exploration in noisy environments with low visibility.
Phased arrays were first developed as radar antennas during World War II. They have since improved considerably, and phased arrays are now used in applications ranging from medical imaging to sound source detection. Acoustic waves travel through the medium and arrive at different microphones at different times, depending on the location of the sound source. A popular technique called beamforming is then applied to generate acoustic maps, and deconvolution techniques are used to remove sidelobes and find the true location of the sound source. This research builds on a prototype developed by Massey University engineering students supervised by Dr Mathew Legg. The acoustic camera developed by Heilmann et al. used a 3D scanner with an infrared camera to create a 3D point cloud [Dynamic beamforming using moving phased arrays with integrated 3D scanners]. The acoustic camera developed at Massey University used a Structure Sensor, but its results were found to be hard to visualise for people with no acoustic camera knowledge.
A market gap has therefore been identified, and the purpose of this research is to use a 3D camera capable of producing a coloured point cloud for better visualisation. An iterative closest point (ICP) algorithm can be used to register the 3D shapes, which are then merged and synced with the acoustic maps for display. This paper is organised as follows. The first section covers background knowledge:
Acoustic Noise Problem
An acoustic camera, also known as a microphone phased array, is used to locate acoustic noise sources on an object. In the automobile industry, low-frequency booming noise is the most annoying noise and can significantly reduce the perceived quality of the brand. When designing an automobile, a computational model is developed before the physical prototype and simulated to evaluate the structure-borne sounds that occur in the vehicle. The prototype is then manufactured and analysed using low-frequency vibro-acoustic analysis. The models currently followed by industry are complicated and require sensors to be attached to the car’s internal body, which significantly increases cost and complexity when analysing cradles, axles and bumpers [The effects of secondary components on the interior noise characteristics of automobiles].
In the aeronautical industry, noise control is considered one of the most important concerns because of the intense operation of jet airliners. Growing public resentment and the impact on local communities are pushing governments and airports to establish new international laws on noise control. In 2001, ICAO issued a new standard to reduce aircraft engine noise [Noise Control Problems of Passenger Airplanes]. Current methods require placing a car or a part of a plane, for example a wing, in a semi-anechoic wind tunnel, where acoustic devices are used to image the location and magnitude of the sound sources. This is where an acoustic camera combined with a 3D camera can significantly help to locate the noise sources so that further steps can be taken to remove them.
Acoustic noise is a complex problem that requires specially designed hardware and software to locate and identify noise sources accurately. A microphone phased array system comprises multiple microphones placed in a specific arrangement to avoid spatial aliasing, data acquisition hardware, and beamforming and deconvolution algorithms that generate acoustic maps of the sound sources.
A microphone phased array listens to the sound, and beamforming is then used to delay and average the signals from the different receivers. A deconvolution algorithm is then used to remove the sidelobe artifacts introduced by beamforming, so that the location and magnitude of the sound sources can be identified accurately. The location and magnitude of sound sources can be measured using three methods: a single microphone, a microphone array and a 3D scanning vibrometer. A single microphone is commonly used to find the magnitude of a sound source. Another device commonly used by the automobile industry is the 3D scanning vibrometer, which uses the Doppler shift to measure the surface vibration of an object. A 3D scanning vibrometer can be used to provide plane measurements and is generally used to detect structural damage.
Finally, microphone arrays have been developed to reduce the problems caused by single microphones. Phased arrays are used in applications ranging from mapping the earth’s crust to ultrasonic imaging in labs and hospitals. The arrays provide an increased signal-to-noise ratio and reduce the effects of reverberation. The main concept is that acoustic waves travelling from different directions arrive at different microphones at different times, so beamforming can be applied to locate the sound source. Microphone position has been found to influence the results, and arrays consisting of spirals or of several circles with an odd number of regularly spaced microphones appear to perform best.
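The arrival-time idea can be illustrated with a minimal sketch. The speed of sound (343 m/s), the three-microphone layout and the source position are assumptions chosen for illustration only; they are not taken from the prototype.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def arrival_times(source, mics, c=SPEED_OF_SOUND):
    """Propagation time from the source to each microphone, in seconds."""
    return [math.dist(source, mic) / c for mic in mics]


# Hypothetical geometry: three microphones on the x-axis, source 1 m in front.
mics = [(-0.1, 0.0, 0.0), (0.0, 0.0, 0.0), (0.1, 0.0, 0.0)]
source = (0.1, 0.0, 1.0)

times = arrival_times(source, mics)
# Delays relative to the first microphone reached by the wavefront.
relative = [t - min(times) for t in times]
```

The microphone directly in front of the source receives the wavefront first, and the relative delays grow with distance from it. Beamforming exploits exactly these differences.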
Array Geometry
The design of the microphone array is important for good beamforming results [Design of microphone phased arrays for acoustic beamforming]. Microphones are usually placed very close together on a 2-dimensional plane to avoid aliasing, which results in a large number of microphones. The area assigned to each microphone is determined by the number of rings, the number of microphones and the diameter of the innermost annular section [Design of microphone phased arrays for acoustic beamforming]. Several microphone arrays have been designed and developed to reduce cost and avoid spatial aliasing. Deconvolution algorithms such as DAMAS (Deconvolution Approach for the Mapping of Acoustic Sources), DAMAS 2 and DAMAS 3 are used to remove the properties of the array from the beamforming results, but in practice the array still influences the quality of the results. Several microphone array geometries are available, but since array design is outside the scope of this project, the acoustic camera developed the previous year by Massey University engineering students under the supervision of Dr Mathew Legg will be used. Its microphones are placed around the centres of equal-area segments, which provides a good dynamic range over a wide frequency range.
Beamforming
Acoustic waves propagate from the sound source and arrive at the microphones at different times. Depending on the location and arrangement of the microphones, the received signals will be out of phase with each other. Single-DFT frequency domain beamforming is used in this research, as testing multiple beamforming algorithms is out of scope. During the research it was found that the Goertzel algorithm and the sliding DFT could also be tested with acoustic cameras to see whether they provide better results. Since they have not been used before, the results are currently unknown.
Frequency Domain Beamforming
Frequency domain beamforming results from applying Fourier transform techniques to the beamforming process, and DFT beamforming is the most common method. While time domain beamforming generates acoustic maps over a wide frequency band, frequency domain beamforming generates beamformed maps at a single frequency. A time delay in the time domain corresponds to a phase shift in the frequency domain. DFT (Discrete Fourier Transform) beamforming implements this using DFT techniques, and an advantage of this method is that the sampling frequency does not affect the resolution.
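The following is a minimal sketch of single-frequency delay-and-sum beamforming, not the implementation used in this research. The array layout, tone frequency, sampling rate, scan line and the function names are all assumptions made for illustration. A Goertzel evaluation of the same DFT bin is included only to show that it yields the same value as the direct DFT.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def dft_bin(samples, k):
    """Directly evaluate DFT bin k of a sampled signal."""
    n = len(samples)
    return sum(x * cmath.exp(-2j * math.pi * k * i / n)
               for i, x in enumerate(samples))


def goertzel_bin(samples, k):
    """Same bin via the Goertzel recurrence (one real multiply per sample)."""
    w = 2.0 * math.pi * k / len(samples)
    coeff = 2.0 * math.cos(w)
    s1 = s2 = 0.0
    for x in samples:
        s1, s2 = x + coeff * s1 - s2, s1
    return cmath.exp(1j * w) * s1 - s2


def beamform_power(bins, mics, point, freq):
    """Delay-and-sum power at one scan point for a single frequency."""
    total = 0j
    for value, mic in zip(bins, mics):
        delay = math.dist(mic, point) / SPEED_OF_SOUND
        # A delay of `delay` seconds is a phase shift of -2*pi*f*delay,
        # so multiplying by the opposite phase re-aligns the channels.
        total += value * cmath.exp(2j * math.pi * freq * delay)
    return abs(total / len(mics)) ** 2


# Hypothetical setup: 8-microphone line array, 2 kHz tone, scan line at z = 1 m.
fs, n, freq = 48000, 480, 2000.0
k = round(freq * n / fs)  # DFT bin that contains the tone
mics = [(0.05 * m - 0.175, 0.0, 0.0) for m in range(8)]
scan_points = [(0.05 * i, 0.0, 1.0) for i in range(-10, 11)]
source = scan_points[14]  # simulated source sits on one of the scan points

# Simulate each channel as the tone delayed by its propagation time.
signals = []
for mic in mics:
    tau = math.dist(mic, source) / SPEED_OF_SOUND
    signals.append([math.cos(2 * math.pi * freq * (i / fs - tau))
                    for i in range(n)])

bins = [dft_bin(sig, k) for sig in signals]
powers = [beamform_power(bins, mics, p, freq) for p in scan_points]
best_index = max(range(len(powers)), key=powers.__getitem__)
```

In this idealised, noise-free case the scan point with the highest power coincides with the simulated source. The neighbouring points still show significant power, which is the main lobe and sidelobe structure that deconvolution is meant to remove.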
Deconvolution
Deconvolution uses the known properties of the array to remove sidelobe artifacts and identify the true location of the sound source. A frequency domain deconvolution algorithm such as CLEAN-SC, described below, could provide better results, as it has been tested in many previous studies, but it is not used in this research because it is outside the scope of this project.
CLEAN-SC
CLEAN based on spatial Source Coherence, or CLEAN-SC, was developed by Sijtsma to perform deconvolution in the frequency domain. CLEAN-SC removes the part of the source plot that is spatially coherent with the peak source, and it can extract sound power levels from the source plots. CLEAN-SC was investigated by Mathew and Bradley for 3D scanning surfaces, and under certain conditions the accuracy of the location and magnitude of sound sources improved for 3D scanning surfaces compared to 2D scanning surfaces.
Scanning Surface
2D Scanning Surface
Beamforming and deconvolution have traditionally been performed assuming that the acoustic sources lie on a plane. The 2D scanning surface is usually oriented perpendicular to the array’s Z-axis and located at roughly the same distance from the array as the object. According to Mathew and Bradley, 2D scanning errors appear as parallax errors if the sound sources are offset from the scanning surface. Traditional 2D methods were found to contain errors in the near field when the object being scanned is a 3D object, which leads to errors in the beamforming results. This can be resolved by using 3D scanning surfaces instead of 2D ones.
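The focusing error can be illustrated with a small sketch under assumed conditions: a 13-microphone line array, a 3 kHz source 0.3 m from the array, and a flat scan plane placed at 0.5 m. None of these values come from the experiments. The function name `focus_response` is hypothetical.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def focus_response(mics, source, focus, freq, c=SPEED_OF_SOUND):
    """Normalised delay-and-sum power when steering to `focus` while the
    sound actually comes from `source` (1.0 means perfect focus)."""
    total = 0j
    for mic in mics:
        mismatch = (math.dist(mic, focus) - math.dist(mic, source)) / c
        total += cmath.exp(2j * math.pi * freq * mismatch)
    return abs(total / len(mics)) ** 2


# Hypothetical near-field geometry.
mics = [(0.05 * m - 0.3, 0.0, 0.0) for m in range(13)]
source = (0.0, 0.0, 0.3)       # true source on the surface of the 3D object
plane_point = (0.0, 0.0, 0.5)  # same x and y, but on a flat scan plane behind it
freq = 3000.0

on_surface = focus_response(mics, source, source, freq)
on_plane = focus_response(mics, source, plane_point, freq)
```

Steering to the true surface point gives a normalised response of 1, whereas steering to the point on the flat plane gives a noticeably lower value, because the assumed delays no longer match the real ones across the aperture.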
3D Scanning Surface
To address the problems with 2D scanning surfaces, Mathew and Bradley developed a scanning surface made of multiple planes that form a 3D grid of scan points. They presented a new 3D acoustic imaging technique that investigated deconvolution of beamforming maps generated using the 3D surface geometry. The technique scans a 3D object along its surface geometry in order to locate the sound sources accurately. The surface geometry is obtained by registering point clouds with the iterative closest point (ICP) algorithm, which moves a data shape ‘P’ into best alignment with a model shape ‘X’, so objects can be reconstructed by merging point cloud data. The first point cloud is used as the reference, the second point cloud is transformed into the reference frame of the first, and the point clouds are then stitched together to produce the final result.
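The registration step can be sketched as follows. This is a deliberately simplified 2D version of iterative closest point, written with the standard library only; the actual work uses 3D point clouds, and the function names, the L-shaped test shape and the applied misalignment are all assumptions for illustration.

```python
import math


def rigid_fit_2d(src, dst):
    """Least-squares rotation angle and translation mapping paired src -> dst."""
    n = len(src)
    sx = sum(p[0] for p in src) / n
    sy = sum(p[1] for p in src) / n
    dx = sum(q[0] for q in dst) / n
    dy = sum(q[1] for q in dst) / n
    dot = cross = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        ax, ay, bx, by = px - sx, py - sy, qx - dx, qy - dy
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    return theta, dx - (c * sx - s * sy), dy - (s * sx + c * sy)


def transform(points, theta, tx, ty):
    """Rotate by theta about the origin, then translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]


def icp_2d(data, model, iterations=20, tol=1e-12):
    """Iteratively move data shape P into alignment with model shape X."""
    current = list(data)
    error = float("inf")
    for _ in range(iterations):
        # Pair every data point with its closest model point.
        matches = [min(model, key=lambda q: math.dist(p, q)) for p in current]
        theta, tx, ty = rigid_fit_2d(current, matches)
        current = transform(current, theta, tx, ty)
        new_error = sum(math.dist(p, q)
                        for p, q in zip(current, matches)) / len(current)
        if abs(error - new_error) < tol:
            break
        error = new_error
    return current, new_error


# Hypothetical model shape X (an L-shape) and a slightly misaligned copy P.
model = [(0.1 * i, 0.0) for i in range(11)] + [(0.0, 0.1 * j) for j in range(1, 6)]
data = transform(model, math.radians(5), 0.03, 0.02)
aligned, err = icp_2d(data, model)
```

With a small initial misalignment such as this one, the closest-point pairing is already correct and the loop converges quickly. Larger misalignments can trap ICP in a local minimum, which is why a reasonable starting pose matters in practice.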
Research Methodology
Related Work
A similar experiment was conducted by Chiariotti et al., who proposed a solution that uses Microsoft’s Kinect sensor and a microphone array to identify interior noise in car cabins within a single reference frame. They performed beamforming on the same focus points from different locations. This average beamforming technique samples a stationary acoustic field by moving the microphone array and obtaining each microphone position, which makes it possible to reduce pseudo-noise sources. The benefit is that the sidelobes are located differently in each measurement, so they can be easily identified and removed while the true source location is identified accurately. The Kinect 3D depth camera and the iterative closest point algorithm are then used to acquire the geometry of the cabin. Finally, the beamforming results are mapped onto the acquired geometry so that the acoustic image is displayed in the same reference frame.
Heilmann et al. also used an acoustic camera integrated with a 3D scanner. They used a 3D camera to create 3D point clouds and stitched the frames together to create a 3D object. The acoustic images were then overlaid onto the 3D object for visualisation [Dynamic beamforming using moving phased arrays with integrated 3D scanners]. The acoustic camera prototype used in this project was developed by Massey University engineering students last year as part of their capstone project, supervised by Dr Mathew Legg. During their prototype development they used a Structure Sensor 3D camera for 3D object visualisation. Although that camera served its purpose, the results were found to be hard to understand for people from a different background. A new 3D camera, the Intel RealSense D415, is therefore used in this project, as it could provide much better visualisation. The Intel RealSense D415 captures 3-dimensional images that can later be processed in MATLAB, where point cloud algorithms are used to construct a 3D model that is overlaid with the acoustic noise images for better visualisation. The Intel RealSense camera was first proposed by last year’s capstone team, but it was not used because of its drawbacks.
During this research, the newer upgraded versions of the Intel RealSense camera were found to be among the better options, as they offer a higher frame rate (FPS) and a trigger option that can be used when required. A market gap has been identified: a 3D camera with a coloured point cloud has an advantage over the alternatives, because the results are easier to identify and visualise. A literature review was carried out at the beginning of this research to find gaps in existing work, and a sufficient gap was identified. It was found that an acoustic camera combined with a 3D camera could provide better results than existing acoustic cameras.
The acoustic camera developed by Heilmann et al. used a structure sensor and point cloud methods to create 3D structures and map acoustic images onto them. Castellini et al. repeated beamforming measurements from different positions and determined the change in position using the Kinect sensor, from which they also obtained the geometry of the cabin. The acoustic camera developed by Baden and their team at Massey University used a Structure Sensor to locate the sound source. In this research, the point cloud data is collected by the Intel RealSense camera using the Intel RealSense SDK and stored as a PLY file (Polygon File Format), which can later be imported into MATLAB for post-processing. Following the MATLAB example, pcread is used to import the saved PLY file, and the point cloud is downsampled to speed up registration. The iterative closest point algorithm is then used to estimate the 3-D rigid transformation, with the first point cloud used as the reference and the transformation applied to the second point cloud.
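The downsampling step can be sketched independently of MATLAB. The example below shows one common approach, grid-average downsampling, in which all points falling in the same cubic cell are replaced by their mean. The function name, the grid step and the sample points are assumptions for illustration, and the method actually used in the MATLAB processing may differ.

```python
import math
from collections import defaultdict


def grid_average_downsample(points, step):
    """Replace all points that fall in the same cubic cell by their mean."""
    cells = defaultdict(list)
    for p in points:
        # Integer cell index along each axis for a cube of side `step`.
        key = tuple(math.floor(v / step) for v in p)
        cells[key].append(p)
    return [tuple(sum(axis) / len(group) for axis in zip(*group))
            for group in cells.values()]


# Hypothetical cloud: two nearby points and one distant point (metres).
cloud = [(0.01, 0.01, 0.01), (0.03, 0.03, 0.03), (0.52, 0.50, 0.51)]
reduced = grid_average_downsample(cloud, step=0.1)
```

Here the two nearby points share a cell and collapse into a single averaged point, while the distant point is kept. A larger step reduces the cloud further and speeds up registration at the cost of geometric detail.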
The point cloud data will then be merged and synced with the microphone data for visualisation and analysis.
Results
The point cloud data was produced using the Intel RealSense SDK, and MATLAB was used to process it. First, a frame is saved and imported into MATLAB, and the imported data is then filtered to remove unwanted points. Images 1 and 2 are the unfiltered images, and images 3 and 4 are the filtered images produced from the unfiltered data. The point cloud of image 3 is registered as the reference, and the point cloud of image 4 is transformed and merged with it to produce the merged point cloud. The merged point cloud is then rotated and stitched to produce the final result, which is overlaid and synced with the acoustic maps for visualisation.