The audio part is very different from other systems I've hacked. The audio part of the Gnome SDK consists of GStreamer, PulseAudio and Canberra. GStreamer is a multimedia framework. PulseAudio is a lower-level abstraction: it receives raw audio and ships it to the sound card. Canberra is a simple library that allows you to play short sounds. To be fair, Canberra would probably have been the way to go if all I wanted was a beep, but the long-range goal is to make an audio engine rather like the Game Boy Advance's, and Canberra isn't really built for that. So GStreamer or PulseAudio is the way forward.
So, step one for me -- my version of "hello world" -- is to create a PCM representation of a square wave and then play it. I tried GStreamer first over PulseAudio for no particular reason.
GStreamer Hates You
OK. A slight exaggeration. GStreamer is complicated because it is not an audio library; it is a multimedia framework designed to encode and decode every format under the sun, apply filters, mux, demux, encode, decode, recode. It has an underlying framework that is generic, extensible, and somewhat self-documenting. To ask it to do something as trivial as playing a raw audio wave is almost beneath its dignity, and it lets you know that by requiring you to build an entire audio framework to do anything at all.
So, I've succeeded in making a beep. I'm the greatest coder in the world, ever. Oh, you're not impressed? Then I'd like to see you try to extract wisdom from the unsorted pile of knowledge that is the GStreamer docs. I used GIO to make a stream out of a memory buffer that contained my raw PCM sound data, and then streamed that to ALSA using GStreamer.
Explore my code fumblings after the jump.
GStreamer 101
Hopefully you've looked at GStreamer's tutorial docs. They explain the basic construction of a GStreamer app. In essence, to make GStreamer make some audio, you have to create a set of _elements_ and string them together in a _pipeline_. Ideally, in the 'beep' case, you'd only have to use the minimum number of elements, which is two.
- An element that takes the in-memory representation of a beep and converts it into a GStreamer audio source
- An element that receives the GStreamer audio source and sends it to the speakers
The Sink
As with any GStreamer pipeline, it is usually easiest to start at the end. In my case, the end is ALSA, the Advanced Linux Sound Architecture. The ALSA library is the lowest level of the sound stack that any programmer would want to deal with. It pushes sound to the Linux kernel and out through the sound card. Most fairly recent Linux boxes with sound cards can be addressed using the ALSA API and library. But I'm not coding ALSA at all. I'm letting GStreamer deal with ALSA using its ALSA module. The important thing here is to see what data representations the ALSA element is expecting.
The gst-inspect command-line utility will list the pre-made elements available on your box. On my box gst-inspect shows that I have a module called alsa that has an element called alsasink. That's going to be the end of my pipeline. It will ship my sound to the soundcard.
So, let's see what formats alsasink is expecting. I type `gst-inspect alsasink` to see what it can do.
Pad Templates:
  SINK template: 'sink'
    Availability: Always
    Capabilities:
      audio/x-raw-int
        endianness: { 1234, 4321 }
        signed: { true, false }
        width: 32
        depth: 32
        rate: [ 1, 2147483647 ]
        channels: [ 1, 2147483647 ]
      audio/x-raw-int
        signed: { true, false }
        width: 8
        depth: 8
        rate: [ 1, 2147483647 ]
        channels: [ 1, 2147483647 ]
      audio/x-iec958
(Lots of extra info was removed for brevity.)
The important thing to see here is that the data the Alsa element receives (or _sinks_, in GStreamer parlance) has to be of type `audio/x-raw-int` or `audio/x-iec958`, and it can only be of certain rates and bits-per-sample. `audio/x-raw-int` is GStreamer's internal representation of raw audio, basically an integer array of uncompressed PCM data.
The other terms should be pretty obvious. `signed: { true, false }` shows that the data can be packed as either signed or unsigned integers. `width` is the number of bits used to store one sample. `depth` is the number of bits that are actually significant, and is usually the same as `width`. `rate` is the sampling rate of the data, so numbers between 8,000 and 50,000 would be expected. `channels` is the number of channels that this data contains: 1 for mono, 2 for stereo, etc.
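To make those fields concrete, here is a small sketch in plain C, with no GStreamer calls at all, of how they determine the size of a raw stream. The `raw_format` struct and the `bytes_per_second` helper are names I made up for illustration; they are not part of any GStreamer API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of the caps fields; this is not a GStreamer type. */
struct raw_format {
    int rate;      /* samples per second, per channel */
    int channels;  /* 1 = mono, 2 = stereo */
    int width;     /* bits used to store one sample */
    int depth;     /* significant bits, at most width */
    int is_signed; /* nonzero for signed samples */
};

/* Bytes needed to hold one second of audio in this format. */
size_t bytes_per_second(const struct raw_format *f)
{
    /* This sketch only handles whole-byte sample widths. */
    assert(f->width % 8 == 0 && f->depth <= f->width);
    return (size_t)f->rate * (size_t)f->channels * (size_t)(f->width / 8);
}
```

For 8-bit mono at 22000 Hz that works out to 22000 bytes per second, while 32-bit stereo at 44100 Hz needs 352800.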
The Source
Sweet. Now I need to create some `audio/x-raw-int` data to ship to the Alsa receiver. This is actually harder than it sounds.
Fundamental to GStreamer is the idea of, erm, 'streaming'. I don't just point the Alsa sink to the source data; I have to create a framework where the data is passed to Alsa as it is required. In other words, the source data has to be streamed.
Since my app is going to take full advantage of GTK+, I can use the GIO library to convert my in-memory array into a streaming source. GIO is the GTK project's abstraction of file streams. They operate similarly to the standard stdio.h libc streams (fopen, fread, fclose, etc.) except that they support a much larger range of sources and devices, including receiving data over internet protocols or (as is interesting for this discussion) accessing memory buffers as if they were files.
So, I can take my memory buffer, wrap it up as a GIO data source, and then it is ready to be streamed piece by piece to my Alsa sink.
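As a rough analogy for "a memory buffer read as if it were a file", here is a sketch that uses the POSIX `fmemopen` call rather than GIO. It is not what my program does; it only illustrates a consumer pulling a buffer piece by piece through a stream interface. The `count_chunks` helper is a name invented for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Read an in-memory buffer through a FILE stream, `chunk` bytes at a time.
   Returns the number of reads that delivered data, or -1 on failure. */
int count_chunks(unsigned char *buf, size_t len, size_t chunk)
{
    unsigned char piece[256];
    int reads = 0;
    FILE *stream;

    assert(chunk > 0 && chunk <= sizeof piece);
    stream = fmemopen(buf, len, "r");
    if (stream == NULL)
        return -1;
    /* The consumer asks for data as it needs it and never touches buf directly. */
    while (fread(piece, 1, chunk, stream) > 0)
        reads++;
    fclose(stream);
    return reads;
}
```

A 1000-byte buffer read in 256-byte pieces takes four reads: three full ones and a final short one of 232 bytes.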
Starting at the beginning
Now that you've totally been bored by a bunch of text, I'm going to start at the beginning and show some actual code that plays my beep.
First, I'll make an 8-bit unsigned array containing a square wave at 440 Hz sampled at 22000 Hz. In other words, I'll generate an array where each cycle is about 2.273 ms long and each sample is about 0.045 ms long. There are, thus, 50 samples per cycle. Since this is a simple square wave, there will be 25 'high' samples and 25 'low' samples per cycle. Since this is 8-bit unsigned, high is 255 and low is 0.
My beep will be 2 seconds long, so it will have 44000 samples.
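The arithmetic above can be sketched as a pair of standalone helpers. These are illustrative only: `samples_per_cycle` and `fill_square_wave` are names I picked for this sketch, and the integer division is exact only when the sample rate is a multiple of the frequency, as it is for 22000 and 440.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Samples in one full cycle; exact only when rate_hz is a multiple of freq_hz. */
int samples_per_cycle(int rate_hz, int freq_hz)
{
    assert(rate_hz > 0 && freq_hz > 0);
    return rate_hz / freq_hz;
}

/* Fill buf with an 8-bit unsigned square wave: high for the first half
   of each cycle, low for the second half. */
void fill_square_wave(uint8_t *buf, size_t len, int rate_hz, int freq_hz)
{
    size_t period = (size_t)samples_per_cycle(rate_hz, freq_hz);
    size_t i;

    assert(period >= 2);
    for (i = 0; i < len; i++)
        buf[i] = (i % period < period / 2) ? 255 : 0;
}
```

With a rate of 22000 and a frequency of 440 this gives the 50-sample period, 25 high and 25 low, that the program below hard-codes.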
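#include <stdint.h>
#include <gio/gio.h>
#include <gst/gst.h>

#define SAMPLE_RATE 22000
#define FREQ 440
#define DURATION 2
#define LEN (SAMPLE_RATE * DURATION)

/* Defined further down; quits the main loop when the timer fires. */
int timer_callback (const void *data);

int main (int argc, char ** argv)
{
    int i;
    uint8_t *wave;
    GMemoryInputStream *mistream;
    GstElement *source, *sink, *pipeline;
    GstPad *sourcepad;
    GMainLoop *loop;

    gst_init (&argc, &argv);

    /* SAMPLE_RATE / FREQ = 50 samples per cycle: 25 high, then 25 low. */
    wave = g_new (uint8_t, LEN);
    for (i = 0; i < LEN; i ++)
    {
        if (i % 50 < 25)
            wave[i] = 255;
        else
            wave[i] = 0;
    }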
Next, I'll package my data as a GIO stream.
    mistream = G_MEMORY_INPUT_STREAM (g_memory_input_stream_new_from_data (wave,
                                                                           LEN,
                                                                           (GDestroyNotify) g_free));
Then, I'll create my GStreamer source using this GIO stream.
    source = gst_element_factory_make ("giostreamsrc", "source");
    g_object_set (G_OBJECT (source), "stream", G_INPUT_STREAM (mistream), NULL);
It is vital that I explicitly state the properties of the data in the stream. For raw binary data, GStreamer has no way of knowing what kind of data it is unless I tell it.
    sourcepad = gst_element_get_static_pad (source, "src");
    gst_pad_set_caps (sourcepad,
                      gst_caps_new_simple ("audio/x-raw-int",
                                           "rate", G_TYPE_INT, 22000,
                                           "channels", G_TYPE_INT, 1,
                                           "width", G_TYPE_INT, 8,
                                           "depth", G_TYPE_INT, 8,
                                           "signed", G_TYPE_BOOLEAN, FALSE,
                                           NULL));
    gst_object_unref (sourcepad);
Next, I create the Alsa element.
    sink = gst_element_factory_make ("alsasink", "sink");
These get packaged into a pipeline. I already know that my `giostreamsrc` will connect with the `alsasink` because I've checked the data types that `alsasink` will receive and I know that I'm sending it one of those data types.
    pipeline = gst_pipeline_new ("beep-player");
    gst_bin_add_many (GST_BIN (pipeline),
                      source, sink, NULL);
    gst_element_link_many (source, sink, NULL);
    gst_element_set_state (pipeline, GST_STATE_PLAYING);
The GLib main loop manages all the traffic, so I have to connect my pipeline to the main loop.
    loop = g_main_loop_new (NULL, FALSE);
    g_timeout_add (2500, (GSourceFunc) timer_callback, loop);
    g_main_loop_run (loop);
    gst_element_set_state (pipeline, GST_STATE_NULL);
    gst_object_unref (GST_OBJECT (pipeline));
    g_main_loop_unref (loop);
    return 0;
}
Then, when I kick off the main loop, it plays the sound. Left alone, the main loop would never return, because nothing calls `g_main_loop_quit`. So I add a timer that calls a function, `timer_callback`, to quit the loop and end the program after 2.5 seconds.
int timer_callback (const void *data)
{
    g_main_loop_quit ((GMainLoop *) data);
    return FALSE;
}
The complete code for this test is in the Project Burro tree as experiment/gstreamer-beep.c