Demo for 'Working With Audio - ALSA'

This lab looks at how to work with audio using an open source library called ALSA (Advanced Linux Sound Architecture). Resources that will help you get started on your own include the API (and the API description), some tutorials and presentations, and the official ALSA Developer Zone.

To start, we are going to check the sound levels of the system. Run "alsamixer" from the terminal and you should see an interface like the one shown in the picture below. The "M" key toggles mute on the selected channel, the up and down arrow keys change the volume level, the left and right arrow keys move between the mixer channels, and the escape key exits the mixer program. Make sure that the sound levels on the computer are at maximum and that none of the channels is shown as muted. If the demo turns out to be too loud the first time you run it, lower the master level to a more comfortable volume.

Next, download the template project for lab 3 here. This code provides a basic, proof-of-concept starting point for the demo we are going to do today. The code compiles with the command "g++ -o alsatut1 alsatut1.cpp -lasound" and can then be run with the command "./alsatut1". When you run the code you should hear a buzzing tone from the speakers.

To see why this happens, take a look at the Init() function in alsatut1.cpp. The first few if statements in the function set all the parameters for the sound card. First the device "plughw:0,0" is chosen. To see all of the sound devices on your computer (if you are running this outside of the VR lab) you can run the command "aplay -L", which lists the available sound devices.

The next parameter that is set controls the transfer mode (how you send the audio data to the sound card). The two types of transfer mode are regular (you send the data using the snd_pcm_write* functions) and memory mapped (you write directly to the sound card using a pointer). This tutorial uses the regular transfer mode. The transfer mode also describes how the data being sent to the sound card is laid out. The main choices are interleaved and non-interleaved. With the interleaved option, data is sent to the sound card in frames, where each frame holds 2 bytes of data (a single sample) for each of the audio channels supported by the code. With the non-interleaved option, the data is transferred in periods, where each period holds a specific number of samples for a given channel. In a mono setup this detail isn't important, but the more channels you have, the more problems you are going to encounter with a non-interleaved approach. For more details on this topic look here.

  if(snd_pcm_hw_params_set_access(pcm_handle, hwparams, SND_PCM_ACCESS_RW_INTERLEAVED) < 0) {
    fprintf(stderr,"Cannot set interleaved format\n");
    return false;
  }
  fprintf(stderr,"In interleaved format\n");
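
The difference between the two layouts can be sketched without touching the sound card. The helper names below (interleaved_index, planar_index, interleave) are hypothetical and are not part of alsatut1.cpp or of ALSA; the non-interleaved data is modelled as one contiguous array purely for comparison.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Interleaved layout: frame 0 (ch0, ch1, ...), frame 1 (ch0, ch1, ...), ...
// Hypothetical helper: position of one sample in an interleaved buffer.
std::size_t interleaved_index(std::size_t frame, std::size_t channel,
                              std::size_t channels) {
    return frame * channels + channel;
}

// Non-interleaved layout: all samples of channel 0, then all of channel 1, ...
// Assumption: the channels are stored back to back in a single array.
std::size_t planar_index(std::size_t frame, std::size_t channel,
                         std::size_t frames) {
    return channel * frames + frame;
}

// Reorder non-interleaved samples into the interleaved order.
std::vector<std::int16_t> interleave(const std::vector<std::int16_t>& planar,
                                     std::size_t channels) {
    const std::size_t frames = planar.size() / channels;
    std::vector<std::int16_t> out(frames * channels);
    for (std::size_t f = 0; f < frames; ++f) {
        for (std::size_t c = 0; c < channels; ++c) {
            out[interleaved_index(f, c, channels)] =
                planar[planar_index(f, c, frames)];
        }
    }
    return out;
}
```

For example, with two channels and three frames, the non-interleaved data {1, 2, 3, 10, 20, 30} becomes {1, 10, 2, 20, 3, 30} once interleaved.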

The next parameter defines the format of the samples being sent to the audio card. In this tutorial, each sample sent to the audio card is represented as a 16-bit, little endian, signed value.

  // 16 bit, little endian, signed
  if(snd_pcm_hw_params_set_format(pcm_handle, hwparams, SND_PCM_FORMAT_S16_LE) < 0) {
    fprintf(stderr,"Cannot set 16 bit sample format\n");
    return false;
  }
  fprintf(stderr,"sample format set\n");
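
To make "16-bit, little endian, signed" concrete, here is a minimal sketch of how one sample maps to two bytes and back. The helpers put_s16_le and get_s16_le are hypothetical names introduced for illustration; they are not part of ALSA or of the template project.

```cpp
#include <cstdint>

// Hypothetical helper: store a signed 16-bit value as two bytes, low byte first.
void put_s16_le(unsigned char* dst, int value) {
    // Converting to uint16_t keeps the low 16 bits (two's complement pattern).
    const std::uint16_t bits = static_cast<std::uint16_t>(value);
    dst[0] = static_cast<unsigned char>(bits & 0xff);  // low byte
    dst[1] = static_cast<unsigned char>(bits >> 8);    // high byte
}

// Hypothetical helper: read two little endian bytes back as a signed value.
int get_s16_le(const unsigned char* src) {
    const std::uint16_t bits =
        static_cast<std::uint16_t>(src[0] | (src[1] << 8));
    // Reinterpret the 16-bit pattern as signed.
    return static_cast<std::int16_t>(bits);
}
```

The value -5000 has the 16-bit pattern 0xEC78, so it is stored as the byte 0x78 followed by the byte 0xEC.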

The following code attempts to set the sampling rate of the stream to 44100 Hz. The function "snd_pcm_hw_params_set_rate_near" tries to set the sampling rate to the requested value but may end up choosing a slightly different one. For this reason the function takes a pointer to an unsigned integer, which it updates so that the rest of the code can use the actual rate. The buffer size is also set to 4K, and then the data buffer is created.

  unsigned int rate = 44100;
  if(snd_pcm_hw_params_set_rate_near(pcm_handle, hwparams, &rate, 0) < 0) {
    fprintf(stderr,"error setting rate\n");
    return false;
  }
  fprintf(stderr,"Card likes rate of %u\n",rate);

  int n = max_channels(pcm_handle, hwparams);
  fprintf(stderr,"We have %d channels\n",n);

  /* each frame is n channels * 2 bytes */
  int buffersize = 4096;
  int nSamples = buffersize / (2 * n);
  fprintf(stderr,"Buffer holds %d samples\n",nSamples);
  /* note: ALSA interprets the size passed here as a count of frames, not bytes */
  if(snd_pcm_hw_params_set_buffer_size(pcm_handle, hwparams, buffersize) < 0) {
    fprintf(stderr,"Error setting buffer size\n");
    return false;
  }
  fprintf(stderr,"ready to apply parameters to device\n");
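
The arithmetic behind nSamples can be checked on its own. The sketch below assumes 2-byte (S16_LE) samples as in this tutorial; the function names are hypothetical and do not come from the template project.

```cpp
#include <cstddef>

// Assumption: every sample is 2 bytes (16-bit format).
constexpr std::size_t kBytesPerSample = 2;

// One frame holds one sample for each channel.
std::size_t bytes_per_frame(std::size_t channels) {
    return channels * kBytesPerSample;
}

// Number of whole frames that fit into a buffer of the given size in bytes.
std::size_t frames_in_bytes(std::size_t bytes, std::size_t channels) {
    return bytes / bytes_per_frame(channels);
}

// Playback time of a number of frames at a given sampling rate, in milliseconds.
double duration_ms(std::size_t frames, unsigned int rate) {
    return 1000.0 * static_cast<double>(frames) / rate;
}
```

With two channels, 4096 bytes hold 1024 frames, which last roughly 23 ms at 44100 Hz. If the card settles on a different rate, the duration changes accordingly.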

The code shown below writes samples to the sound buffer. Currently the code supports two channels of output, which is equivalent to the left and right channels of a stereo speaker setup. The buffer has room for nSamples 2-byte samples for each of the n channels (remember that an unsigned char is represented in one byte in the ANSI C and C++ standards). The code creates two "sawtooth" waveforms (of different amplitudes and frequencies) and sends one of them to the first audio channel and the other to the second channel. When filling the buffer, the integer value is masked and shifted so that the two unsigned char values representing the sample are stored in little endian order.

  // create a static buffer (we will play this over and over)
  unsigned char *data = (unsigned char *)malloc(nSamples * n * 2);

  for(i=0;i<nSamples;i++) {
    int level = (i % 64) * 1000 - 5000;
    int level2 = (i % 32) * 10 - 5000;

    int offset = i * n * 2;
    int j;
    for(j=0;j<n;j++) {
      switch(j) {
      case 0:
        // first channel: low byte, then high byte
        data[offset+2*j] = (unsigned char) (level & 0x0ff);
        data[offset+2*j+1] = (unsigned char) (level >> 8);
        break;
      default:
        // second channel (and any further ones)
        data[offset+2*j] = (unsigned char) (level2 & 0x0ff);
        data[offset+2*j+1] = (unsigned char) (level2 >> 8);
        break;
      }
    }
  }
  fprintf(stderr,"data created\n");
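
The pitch of each sawtooth follows from how often the ramp restarts: a waveform that repeats every P frames at a sampling rate of R Hz has a fundamental frequency of R / P. The sketch below illustrates this under the assumption that the rate is exactly 44100; the function names are hypothetical.

```cpp
#include <cstdint>
#include <limits>

// Ramp value at frame i: rises by `step` each frame, restarts every `period` frames.
int sawtooth_level(int i, int period, int step, int offset) {
    return (i % period) * step + offset;
}

// Fundamental frequency in Hz of a waveform that repeats every `period` frames.
double sawtooth_frequency_hz(unsigned int rate, int period) {
    return static_cast<double>(rate) / period;
}

// True when every ramp value fits in a signed 16-bit sample (assumes step >= 0).
bool fits_s16(int period, int step, int offset) {
    const int lowest = offset;
    const int highest = (period - 1) * step + offset;
    return lowest >= std::numeric_limits<std::int16_t>::min() &&
           highest <= std::numeric_limits<std::int16_t>::max();
}
```

At 44100 Hz, a period of 64 frames gives about 689 Hz and a period of 32 frames gives about 1378 Hz. Note that the first ramp peaks at 63 * 1000 - 5000 = 58000, which is above the 16-bit maximum of 32767, so the stored value wraps around partway through each period.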

Bonus 1

If the second channel is silenced (set level2 to 0 or any other DC value), you can easily experiment with the mono waveform on the remaining channel. Try to increase the frequency of the waveform generated for the first channel.

Bonus 2

Make one tone play out of one speaker, then have a different tone play out of the second speaker.

Solutions to bonus questions will be posted after the lab. Please ask for help during the lab with these questions.