AudioData

public class AudioData

Defines a ring buffer and some utility functions to prepare the input audio samples.

It maintains a ring buffer to hold input audio data. Clients feed input audio data via the `load` methods and access the aggregated audio samples via the `getBuffer` method.
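
The ring buffer behaviour described above can be sketched in plain Java. This is a minimal illustration only, not the library's implementation: the class `FloatRingBufferSketch` and its `snapshot` method are hypothetical names, and it assumes the buffer is zero-filled until enough samples have been loaded.

```java
import java.util.Arrays;

// Hypothetical sketch of a fixed-size ring buffer for float samples.
final class FloatRingBufferSketch {
    private final float[] buffer;
    private int nextIndex = 0; // position of the next write, which is also the oldest value

    FloatRingBufferSketch(int capacity) {
        buffer = new float[capacity];
    }

    /** Appends src, overwriting the oldest values once the buffer is full. */
    void load(float[] src) {
        for (float value : src) {
            buffer[nextIndex] = value;
            nextIndex = (nextIndex + 1) % buffer.length;
        }
    }

    /** Returns a copy of the contents ordered from oldest to newest. */
    float[] snapshot() {
        float[] out = new float[buffer.length];
        int tail = buffer.length - nextIndex;
        System.arraycopy(buffer, nextIndex, out, 0, tail);
        System.arraycopy(buffer, 0, out, tail, nextIndex);
        return out;
    }

    public static void main(String[] args) {
        FloatRingBufferSketch ring = new FloatRingBufferSketch(4);
        ring.load(new float[] {0.1f, 0.2f, 0.3f});
        ring.load(new float[] {0.4f, 0.5f, 0.6f});
        // Only the four most recent values survive.
        System.out.println(Arrays.toString(ring.snapshot())); // prints [0.3, 0.4, 0.5, 0.6]
    }
}
```

With a fixed capacity, repeated `load` calls always leave the most recent samples available, which suits a model that expects a fixed-length input window.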

Note that this class can only handle input audio in Float (in AudioFormat.ENCODING_PCM_FLOAT) or Short (in AudioFormat.ENCODING_PCM_16BIT). Internally it converts and stores all the audio samples in PCM Float encoding.
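
As a rough illustration of converting 16-bit PCM samples to PCM Float, the following Java sketch divides each short by 32768. The class name, method name, and scale factor are assumptions made for this example; the scale factor the library uses internally may differ.

```java
// Hypothetical sketch of 16-bit PCM to float PCM conversion.
final class PcmConversionSketch {
    // Assumed scale: dividing by 32768 maps the signed 16-bit range into [-1, 1).
    static final float SCALE = 32768f;

    static float[] toPcmFloat(short[] src) {
        float[] out = new float[src.length];
        for (int i = 0; i < src.length; i++) {
            out[i] = src[i] / SCALE;
        }
        return out;
    }

    public static void main(String[] args) {
        float[] converted = toPcmFloat(new short[] {0, 16384, -32768, 32767});
        // 0 maps to 0.0, 16384 to 0.5, -32768 to -1.0, and 32767 to just under 1.0.
        System.out.println(java.util.Arrays.toString(converted));
    }
}
```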

Typical usage in Kotlin

   val audioData = AudioData.create(format, modelInputLength)
   audioData.load(newData)
 

Another sample usage with AudioRecord

   val audioData = AudioData.create(format, modelInputLength)
   Timer().scheduleAtFixedRate(delay, period) {
     audioData.load(audioRecord)
   }
 

Nested Classes

class AudioData.AudioDataFormat Wraps a few constants describing the format of the incoming audio samples, namely the number of channels and the sample rate.

Public Methods

static AudioData
create(AudioData.AudioDataFormat format, int sampleCounts)
Creates an AudioData instance with a ring buffer whose size is sampleCounts * format.getNumOfChannels().
static AudioData
create(AudioFormat format, int sampleCounts)
Creates an AudioData instance with a ring buffer whose size is sampleCounts * format.getChannelCount().
float[]
getBuffer()
Returns a float array holding all the available audio samples in AudioFormat.ENCODING_PCM_FLOAT i.e. values are in the range of [-1, 1].
int
getBufferLength()
AudioData.AudioDataFormat
getFormat()
void
load(short[] src)
Converts the input audio samples src to ENCODING_PCM_FLOAT, then stores them in the ring buffer.
void
load(float[] src, int offsetInFloat, int sizeInFloat)
Stores the input audio samples src in the ring buffer.
void
load(short[] src, int offsetInShort, int sizeInShort)
Converts the input audio samples src to ENCODING_PCM_FLOAT, then stores them in the ring buffer.
int
load(AudioRecord record)
Loads latest data from the AudioRecord in a non-blocking way.
void
load(float[] src)
Stores the input audio samples src in the ring buffer.

Public Methods

public static AudioData create (AudioData.AudioDataFormat format, int sampleCounts)

Creates an AudioData instance with a ring buffer whose size is sampleCounts * format.getNumOfChannels().

Parameters
format the expected AudioData.AudioDataFormat of audio data loaded into this class.
sampleCounts the number of samples.

public static AudioData create (AudioFormat format, int sampleCounts)

Creates an AudioData instance with a ring buffer whose size is sampleCounts * format.getChannelCount().

Parameters
format the AudioFormat required by the TFLite model. It defines the number of channels and sample rate.
sampleCounts the number of samples to be fed into the model
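
The buffer size formula, sampleCounts multiplied by the channel count, can be checked with a small Java sketch. The helper `bufferLength` and the 16 kHz sample rate are hypothetical values chosen for illustration; they do not come from the library.

```java
// Hypothetical helper mirroring the documented size formula.
final class BufferSizeSketch {
    /** Ring buffer capacity in float values: one value per channel per sample. */
    static int bufferLength(int sampleCounts, int channelCount) {
        return sampleCounts * channelCount;
    }

    public static void main(String[] args) {
        int sampleRate = 16000;        // assumed model input rate, in Hz
        int sampleCounts = sampleRate; // one second of audio
        System.out.println(bufferLength(sampleCounts, 1)); // mono: 16000
        System.out.println(bufferLength(sampleCounts, 2)); // stereo: 32000
    }
}
```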

public float[] getBuffer ()

Returns a float array holding all the available audio samples in AudioFormat.ENCODING_PCM_FLOAT i.e. values are in the range of [-1, 1].

public int getBufferLength ()

public AudioData.AudioDataFormat getFormat ()

public void load (short[] src)

Converts the input audio samples src to ENCODING_PCM_FLOAT, then stores them in the ring buffer.

Parameters
src input audio samples in AudioFormat.ENCODING_PCM_16BIT. For multi-channel input, the array is interleaved.

public void load (float[] src, int offsetInFloat, int sizeInFloat)

Stores the input audio samples src in the ring buffer.

Parameters
src input audio samples in AudioFormat.ENCODING_PCM_FLOAT. For multi-channel input, the array is interleaved.
offsetInFloat starting position in the src array
sizeInFloat the number of float values to be copied
Throws
IllegalArgumentException for incompatible audio format or incorrect input size
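
To make the interleaved layout and the offset and size arguments more concrete, here is a minimal Java sketch. The helpers `interleaveStereo` and `copyFrames` are hypothetical and not part of this class; the sketch assumes that one frame holds one value per channel, so offsets and sizes are expressed as whole frames multiplied by the channel count.

```java
import java.util.Arrays;

// Hypothetical helpers illustrating interleaved audio and frame-based ranges.
final class InterleaveSketch {
    /** Builds [L0, R0, L1, R1, ...] from two equally long channel arrays. */
    static float[] interleaveStereo(float[] left, float[] right) {
        if (left.length != right.length) {
            throw new IllegalArgumentException("Channels must have the same number of samples");
        }
        float[] out = new float[left.length * 2];
        for (int i = 0; i < left.length; i++) {
            out[2 * i] = left[i];
            out[2 * i + 1] = right[i];
        }
        return out;
    }

    /** Copies frameCount frames starting at firstFrame from an interleaved array. */
    static float[] copyFrames(float[] interleaved, int channelCount, int firstFrame, int frameCount) {
        int offsetInFloat = firstFrame * channelCount; // starting position in the array
        int sizeInFloat = frameCount * channelCount;   // number of float values to copy
        if (offsetInFloat < 0 || sizeInFloat < 0 || offsetInFloat + sizeInFloat > interleaved.length) {
            throw new IllegalArgumentException("Requested range is outside the source array");
        }
        return Arrays.copyOfRange(interleaved, offsetInFloat, offsetInFloat + sizeInFloat);
    }

    public static void main(String[] args) {
        float[] left = {0.1f, 0.2f, 0.3f};
        float[] right = {-0.1f, -0.2f, -0.3f};
        float[] interleaved = interleaveStereo(left, right);
        // Frames 1 and 2 of a stereo stream start at offset 2 and span 4 floats.
        System.out.println(Arrays.toString(copyFrames(interleaved, 2, 1, 2))); // prints [0.2, -0.2, 0.3, -0.3]
    }
}
```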

public void load (short[] src, int offsetInShort, int sizeInShort)

Converts the input audio samples src to ENCODING_PCM_FLOAT, then stores them in the ring buffer.

Parameters
src input audio samples in AudioFormat.ENCODING_PCM_16BIT. For multi-channel input, the array is interleaved.
offsetInShort starting position in the src array
sizeInShort the number of short values to be copied
Throws
IllegalArgumentException if the source array can't be copied

public int load (AudioRecord record)

Loads the latest data from the AudioRecord in a non-blocking way. Only ENCODING_PCM_16BIT and ENCODING_PCM_FLOAT are supported.

Parameters
record an instance of AudioRecord
Returns
  • the number of captured audio values, which is channelCount * sampleCount. Returns 0 if there was no new data in the AudioRecord or an error occurred.
Throws
IllegalArgumentException for unsupported audio encoding format
IllegalStateException if reading from AudioRecord failed

public void load (float[] src)

Stores the input audio samples src in the ring buffer.

Parameters
src input audio samples in AudioFormat.ENCODING_PCM_FLOAT. For multi-channel input, the array is interleaved.