Interview: Nathan Charles, Khronos Group

The Khronos Group, known for its successful OpenGL standard API for graphics applications and gaming, has recently announced the OpenSL ES Sound Language for mobile devices. Developer Hayden Porter interviewed Nathan Charles, lead for the Khronos OpenSL ES working group and Software Architect at Creative Labs, about the OpenSL ES API.

Mathew Kumar, Blogger

November 13, 2007

13 Min Read

The Khronos Group, well known for its successful OpenGL standard API for graphic applications and gaming, recently announced the OpenSL ES Sound Language for mobile devices such as phones and personal media players. OpenSL ES is to provide a standardized, high-performance, low-latency method to access audio functionality for developers of native applications on embedded mobile multimedia devices, with the aim of straightforward cross-platform deployment of hardware and software audio capabilities, reducing implementation effort, and promoting the market for advanced audio.

In September 2007, Khronos issued OpenSL ES 1.0 as a provisional specification to provide the opportunity to incorporate feedback from the developer community before the specification is finalized. Khronos plans to finalize the specification towards the end of 2007, and in the meantime is seeking input from the industry.

Developer Hayden Porter conducted an interview about the OpenSL ES API with Nathan Charles, lead for the Khronos OpenSL ES working group and Software Architect at Creative Labs.

Hayden Porter: Can you provide a brief overview of the OpenSL ES API features? Are they a subset of the OpenAL (Open Audio Language) API? Are the features grouped together by different functionalities? For example is there a set of APIs for MIDI interaction, and a different set for audio?

Nathan Charles: OpenSL ES is an entirely new API developed from scratch. We looked at other open audio APIs, such as OpenAL, but these are mainly targeted for game use on PC and games consoles. Our demands were slightly different: OpenSL ES needs to support a wide range of applications beyond games and it must run on resource constrained devices. It was clear to us that a new API would be the best way of achieving our aims.

OpenSL ES supports all the features you'd expect from an advanced audio API targeted for mobile embedded devices. The feature list is fairly long, including basic digital audio and MIDI playback (with support for SP-MIDI, mDLS and mXMF) but also more advanced features such as 3D audio, virtualization, MIDI messages, and effects including reverb and EQ. Additionally OpenSL ES provides APIs for controlling the LED and vibra, and for accessing metadata included in audio files so that applications can display information such as the track name and author.

OpenSL ES uses an object-oriented architecture. The most basic objects are "players" and "recorders", so for example, you can create MIDI players and digital audio recorders. The features are then encapsulated in interfaces, with each interface containing controls for similar functionality. For example, the volume interface exposes controls for volume, balance and pan. One of the nice things about this architecture is that it's possible to use the same interface on different objects. So, for example, the volume interface can be exposed on a digital audio player, a MIDI player, an audio recorder and the global output.
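
As a rough illustration of the object-and-interface idea described above, here is a minimal C sketch. Every name in it (AudioObject, VolumeIface, get_interface, set_volume) is hypothetical and invented for this example; it models the concept of one interface type being exposed on several object types, and is not taken from the OpenSL ES headers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout: this models the idea only,
   it is not the OpenSL ES API. */

typedef enum { IFACE_VOLUME, IFACE_PLAY, IFACE_RECORD, IFACE_COUNT } IfaceId;

/* One interface groups related controls: here, volume level and pan. */
typedef struct {
    int level_mb;   /* volume in millibels, 0 = full scale */
    int pan;        /* -1000 (left) .. 1000 (right) */
} VolumeIface;

/* An object (player, recorder, output) is a table of optional interfaces. */
typedef struct {
    const char *kind;
    void *ifaces[IFACE_COUNT];   /* NULL means "not exposed on this object" */
} AudioObject;

/* Look up an interface by id; returns NULL if the object does not expose it. */
void *get_interface(AudioObject *obj, IfaceId id)
{
    assert(obj != NULL);
    return ((unsigned)id < (unsigned)IFACE_COUNT) ? obj->ifaces[id] : NULL;
}

/* Works on any object that exposes the volume interface,
   whether it is a player, a recorder or the output. */
int set_volume(AudioObject *obj, int level_mb)
{
    VolumeIface *vol = get_interface(obj, IFACE_VOLUME);
    if (vol == NULL)
        return -1;   /* feature not available on this object */
    vol->level_mb = level_mb;
    return 0;
}
```

The point of the design is that set_volume never needs to know what kind of object it was handed; it only asks whether the volume interface is present.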

HP: Can you summarize the different device profiles and their capabilities?

NC: With such a large feature set it wouldn't be reasonable or even appropriate to expect all devices to support the full range of features. One solution is to make lots of the features optional. The problem with this is that developers cannot rely on particular features being present so they either have to produce many versions of their application or only use the features guaranteed to be present on all devices. This can lead to a bad developer experience and ultimately a lower adoption of the API.

Instead, we decided to segment the functionality into profiles. Profiles often have a bad name but we think we've got the profiles in OpenSL ES just right. OpenSL ES has three profiles: Phone, Music and Game. Each profile contains the functionality most suited to its application area, with some features in more than one profile.

The Phone profile is aimed at mobile phones and supports ring tone playback, device UI sounds and audio for simple 2D games. The Music profile is for music playback applications and includes the ability to support multiple audio codecs and supports features including stereo widening. The Game profile is aimed at advanced games, with support for 3D audio, reverb and advanced MIDI.

HP: Do the OpenSL ES API profiles build upon each other, for example, does the game profile include functionality of both the phone and music profiles, or are the features of each profile independent?

NC: Each profile has been designed independently so that it contains all the functionality appropriate for the market it supports. Consequently the profiles do overlap, so there are areas of functionality (e.g. basic playback of PCM) that are present in all three profiles. Importantly though, the profiles are not levels, so it's not true that any one profile contains all the functionality of another profile. The developer doesn't need to worry about this, though: they just write an application using the profile that most suits their application and it is guaranteed to work on devices supporting that profile.

HP: Is there a way to use the API itself to identify the type of profile implemented on a given device?

NC: Yes, there's a query mechanism to determine the profiles supported but as the profiles are market-oriented it should be clear what profile to expect on a given device. For example, a game-oriented phone is likely to expose both the Phone and Game profiles, whereas a portable music player may just expose the Music profile. A smart-phone may expose all three profiles.

HP: Is the OpenSL ES API based upon or very similar to any existing mobile sound APIs such as those from JSR135/234 or perhaps BREW imedia API? Or is it a new API that has the same functionality as these other APIs but a different language structure? How portable would this API be for applications based upon already existing APIs?

NC: Although OpenSL ES is not based on any pre-existing APIs most of the concepts will be familiar to those used to other audio APIs. We've purposely kept away from reinventing the wheel, exposing features in a similar manner to other audio APIs. So, for example, those coming from a JSR135/234 or a PC audio background shouldn't have much trouble understanding OpenSL ES. Of course it's always slightly tricky porting an application from one API to another but porting to OpenSL ES shouldn't require too much work. We've got a number of members of the JSR234 expert group in the OpenSL ES working group, so that's helped us keep things consistent.

HP: Is OpenSL ES intended for a specific programming language, or can it be applied to common mobile device development platforms such as J2ME, or perhaps Flash ActionScript?

NC: OpenSL ES is a C language API; however, you can expect that bindings will start to appear for other languages once the specification is released.

HP: It seems that device manufacturers are relying upon different components to manage playback of media in phones, one for sound, one for 2D graphics, which may make it difficult to integrate sound and graphics in an effective way. As an example, if a developer intends to create a mobile device application that plays animated greeting cards supporting sound/graphic re-synchronization between SVG animation, rendered by one phone component, and a MIDI sound track, by a different component, how might the OpenSL ES API make it easier to develop this sort of application or, alternatively, offer better sound/graphic re-synchronization compared to existing technologies?

NC: The Khronos Group appreciates this problem and has been working on OpenKODE, which amongst other things, is ensuring that at least the different Khronos APIs (3D graphics, 2D graphics, audio, video) can be synchronized. We also know that OpenSL ES is likely to be used with other APIs, though, so the API exposes a couple of features that should help synchronization, such as position-related and marker callbacks that can inform the application when a certain piece of audio has been played.
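
The marker callback idea can be sketched in a few lines of C. This is a simulation with hypothetical names (Player, player_set_marker, player_advance, start_animation); the real callbacks would be driven by the audio system rather than by the application calling player_advance, and their actual signatures are not shown in this interview.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: called once when playback reaches the marker. */
typedef void (*MarkerCallback)(void *context, unsigned position_ms);

typedef struct {
    unsigned position_ms;     /* current playback position */
    unsigned marker_ms;       /* position at which to notify the application */
    int marker_fired;         /* ensures the marker fires only once */
    MarkerCallback on_marker;
    void *context;            /* passed back to the callback unchanged */
} Player;

void player_set_marker(Player *p, unsigned marker_ms,
                       MarkerCallback cb, void *context)
{
    assert(p != NULL);
    p->marker_ms = marker_ms;
    p->marker_fired = 0;
    p->on_marker = cb;
    p->context = context;
}

/* Simulates the audio system rendering elapsed_ms of audio. */
void player_advance(Player *p, unsigned elapsed_ms)
{
    assert(p != NULL);
    p->position_ms += elapsed_ms;
    if (!p->marker_fired && p->on_marker != NULL &&
        p->position_ms >= p->marker_ms) {
        p->marker_fired = 1;
        p->on_marker(p->context, p->position_ms);
    }
}

/* Example application callback: record when an animation should start. */
void start_animation(void *context, unsigned position_ms)
{
    unsigned *started_at = context;
    *started_at = position_ms;
}
```

In the greeting card scenario from the question, the callback would be the place where the application tells the graphics component to begin, or to resynchronize, the animation.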

HP: How might OpenSL ES integrate with OpenGL ES for both 3D sound and graphics? Would this require OpenKODE or can integration occur with existing OpenGL ES implementations?

NC: The OpenSL ES Game profile is an ideal companion to OpenGL ES as it's possible for applications to include both 3D graphics and 3D audio together. The main use of this will be games but there are other applications such as advanced 3D music virtualization that may make use of both these APIs. Device manufacturers have the choice of providing OpenSL ES and OpenGL ES as part of OpenKODE but this isn't necessary. The advantage of providing the APIs as part of OpenKODE is that the developer has extra assurance that the API implementations work well together and they also have access to the other OpenKODE APIs to handle things like input (e.g. joystick input). But from a developer's point of view, they will have access to OpenSL ES and OpenGL ES in both cases, and they would program to the APIs as they would normally.

HP: Does the OpenSL ES API have any specific support for radio applications (FM, digital) or streaming audio applications?

NC: OpenSL ES provides APIs for accessing content located on remote networks but this is an optional feature. The API does, however, expose a buffer queue mechanism that can be used by applications to easily and efficiently stream data directly into the audio system. As for radio, there are no plans to include radio support in the first version of OpenSL ES but OpenMAX AL, another Khronos API in development, does plan to provide support for analogue radio and RDS.
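
A buffer queue of the kind mentioned above can be modelled as a small ring of buffers with a completion callback. The C sketch below is an assumption-laden illustration: BufferQueue, queue_enqueue, queue_consume, Stream and refill are all hypothetical names, the capacity of 4 is arbitrary, and queue_consume stands in for the audio system finishing a buffer.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_CAPACITY 4   /* arbitrary size chosen for this sketch */

typedef struct { const short *data; size_t frames; } Buffer;

typedef struct BufferQueue BufferQueue;
typedef void (*BufferDoneCallback)(BufferQueue *q, void *context);

struct BufferQueue {
    Buffer slots[QUEUE_CAPACITY];
    size_t head;                 /* index of the next buffer to play */
    size_t count;                /* number of buffers currently queued */
    BufferDoneCallback on_done;  /* called after each buffer is consumed */
    void *context;
};

/* Queue a buffer for playback; returns -1 if the queue is full. */
int queue_enqueue(BufferQueue *q, const short *data, size_t frames)
{
    assert(q != NULL && data != NULL);
    if (q->count == QUEUE_CAPACITY)
        return -1;
    size_t tail = (q->head + q->count) % QUEUE_CAPACITY;
    q->slots[tail].data = data;
    q->slots[tail].frames = frames;
    q->count++;
    return 0;
}

/* Simulates the audio system finishing the buffer at the head.
   Returns the frames played, or 0 on underrun (empty queue). */
size_t queue_consume(BufferQueue *q)
{
    assert(q != NULL);
    if (q->count == 0)
        return 0;
    size_t frames = q->slots[q->head].frames;
    q->head = (q->head + 1) % QUEUE_CAPACITY;
    q->count--;
    if (q->on_done != NULL)
        q->on_done(q, q->context);   /* gives the app a chance to refill */
    return frames;
}

/* Hypothetical stream source: hands out the samples in fixed-size chunks. */
typedef struct {
    const short *samples;
    size_t total;    /* total frames available */
    size_t offset;   /* frames already queued */
    size_t chunk;    /* frames per buffer */
} Stream;

/* Example refill callback: enqueue the next chunk, if any remains. */
void refill(BufferQueue *q, void *context)
{
    Stream *s = context;
    if (s->offset >= s->total)
        return;
    size_t n = s->total - s->offset;
    if (n > s->chunk)
        n = s->chunk;
    if (queue_enqueue(q, s->samples + s->offset, n) == 0)
        s->offset += n;
}
```

The application primes the queue once and then refills from the callback, so the audio system always has the next buffer ready without the application polling.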

HP: Does Khronos have any plans for a video control API?

NC: Yes, it's called OpenMAX AL and it's an application-level multimedia API. As well as supporting video, image, camera and radio it also has some support for audio, as you'd expect from a multimedia API. OpenMAX AL is on approximately the same timeline as OpenSL ES so the two working groups have chosen to collaborate on both the API architecture and the audio features present in both APIs. This will mean the APIs will be consistent, and more importantly, compatible, making it easier for developers to switch between APIs. We expect that devices will ship with either or both the APIs, depending on the device's requirements.

HP: Besides the analog radio APIs what other sound features might we expect from OpenMAX AL?

NC: OpenMAX AL will support the playback and recording of audio files (both sampled audio and MIDI), as OpenSL ES does. Any more advanced audio features are only supported in OpenSL ES.

HP: It is clear that developers benefit from open, royalty free APIs because these APIs make cross platform development easier. Also manufacturers of devices using open platforms like Windows Mobile, Symbian and Palm OS would benefit because developers would create more software for these platforms. However, for handheld devices, it seems there are many closed and proprietary systems.

Why should a manufacturer producing digital audio players be interested in using the OpenSL ES API, especially if all programming is handled in house through a closed proprietary system that is not likely to have a community of 3rd party developers creating applications for the platform?

NC: This is a good question and one that applies not only to digital audio player devices but also to the many mobile phones that do not allow 3rd party applications. Even in these cases where all the development happens in-house there is still a lot to be gained for the manufacturer from using an open standardized API. By using an open API the manufacturer is not tightly bound to one vendor and can swap solutions without rewriting all their applications. This means the manufacturer is able to choose the best solution to meet their objectives.

It's also worth noting that it takes a significant amount of time and resources to design a good API. In OpenSL ES's case there have been at least ten active companies working for about two years on designing the API. Although a manufacturer could choose to develop a proprietary API from scratch for their in-house software they can save time using OpenSL ES instead.

HP: What is involved for a manufacturer to implement the API on a device? Is the API something that most manufacturers are likely able to develop their own software for, or does it seem more likely that manufacturers will license an existing audio engine SDK that supports the API?

NC: It depends on the manufacturer. Some may choose to implement OpenSL ES on top of their current audio system. Others may approach vendors that have specific OpenSL ES solutions. It also depends on the profile in question. Most manufacturers already have solutions to support, for example, the Phone profile. However, the Game profile is more advanced and it's likely the manufacturers will work with partners to develop their OpenSL ES solution.

HP: Are there any suggested specifications for a device to support the various API profiles?

NC: Each profile will have a set of minimum requirements that must be satisfied in order to produce a conformant implementation. These requirements are currently in the process of being finalized. Our aim is to ensure that the application developer has a good set of guaranteed functionality to rely on between different OpenSL ES implementations but we also want to make sure that OpenSL ES is at the right level to be supported by a wide range of devices.

HP: Can you list the companies and organizations that have contributed to the development of the OpenSL ES API?

NC: The full list of companies is quite large, but the core companies include: AMD, Beatnik, Coding Technologies, Creative, Ericsson, Freescale, Nokia, NVIDIA, NXP, QSound Labs, Samsung, Sonaptic, ST Microelectronics, Symbian and Texas Instruments. It's a good group as we've got a wide range of experience in both hardware and software, as well as experience from the mobile and PC industry. We admit that we're missing two groups of people: mobile network operators and application developers. Although the working group members have close contact with these groups we want to ensure we get the widest range of input so we're going to release a provisional version of the specification for public review before we make the final release. We're aiming to release a provisional version in mid-year. We'll then solicit feedback and make appropriate changes to the specification before the final release.

HP: When might we see devices coming to market that support the OpenSL ES API?

NC: It generally takes about a year between the time a specification is released and the time devices come to market. We don't have a final release date yet but there shouldn't be much of a gap between the provisional release and final release of the specification.

[Hayden Porter is a developer with a special interest in interactive sound for the web and mobile. He has written extensively on the subject including articles for Nokia, Sony Ericsson, Electronic Musician Magazine, Music Technology Magazine and DevX.com. www.aviarts.com]

About the Author

Mathew Kumar

Blogger

Mathew Kumar is a graduate of Computer Games Technology at the University of Paisley, Scotland, and is now a freelance journalist in Toronto, Canada.
