Making the switch to digital audio

Shannon Gwin Mitchell, James A. Peterson, and Serdar Kaya

Shannon Gwin Mitchell, Ph.D., Researcher, Johns Hopkins University, Bloomberg School of Public Health, Baltimore, Maryland, USA

James A. Peterson, Ed.D., Research Ethnographer / Program Manager, Johns Hopkins University, Bloomberg School of Public Health, Baltimore, Maryland, USA

Serdar Kaya, M.S., Network Systems Manager, Johns Hopkins University, Baltimore, Maryland, USA

Abstract: In this article, the authors describe the process of converting from analog to digital audio data. They address the step-by-step decisions that they made in selecting hardware and software for recording and converting digital audio, issues of system integration, and cost considerations. The authors present a brief description of how digital audio is being used in their current research project and how it has enhanced the “quality” of their qualitative research.

Keywords: digital audio, qualitative analysis software, recording equipment, system compatibility

Citation information:
Mitchell, S. G., Peterson, J. A., & Kaya, S. (2004). Making the switch to digital audio. International Journal of Qualitative Methods, 3(4), Article 6. Retrieved INSERT DATE from http://www.ualberta.ca/~iiqm/backissues/3_4/html/mitchell.html


At The Lighthouse (a community-based research unit of the Department of Health Policy and Management of the Johns Hopkins Bloomberg School of Public Health), we conduct community-based HIV/AIDS prevention research with injection drug users. We use qualitative research methods, such as individual interviews or focus groups, in all phases of our interventions, from formulation to implementation to evaluation. On previous projects, we have done things the old-fashioned way, using cassette tapes with grainy sound quality, no matter how expensive the recorder, and working with transcriptionists who charge U.S.$200 for a Word document that still needs to be “cleaned” before it can be analyzed. This process includes keeping track of multiple copies of cassettes, having limited access to the tapes (depending on how many copies were made), and making decisions about when to archive the cassettes, when to tape over them, what to have transcribed, and when to have the transcriptions done. For our newest HIV prevention project, we decided to “go digital,” an intuitively simple yet technically complex change requiring research into a variety of software, hardware, and compatibility issues. By describing our transition to digital audio, we hope to show other qualitative researchers one approach for how the process can be accomplished.

Why make the switch?

The transition to digital was prompted by the start of a new project and spurred on by the ghosts of research past, reminding us of transcription and archiving issues that were never resolved to anyone’s complete satisfaction. Earlier studies had left us with multiple copies of cassette tapes that were bulky to store, difficult to keep track of, and useful for analysis only in their transcribed form. We hoped that by switching to a digital audio format, we would solve many of our analysis and archiving problems. In addition, recent technological developments in, and the greater availability of, several good audio-coding systems inspired us to investigate digital audio. Although some of these systems were new (e.g., AnnoTape), others had evolved quite substantially in their technical capabilities (e.g., Atlas.ti) since we had collected data for our previous major research project.

Software, hardware, and cost considerations

In starting the process, our primary concern was that the digital components be compatible with one another, so we made our selections with one eye on system integration and the other on ease of use. Our “third” eye (the one that you often wish were blind) was always focused on cost. We began the transition to digital by examining and selecting data analysis software packages that would meet our needs. The programs with audio-coding capabilities that we explored included AnnoTape 1.0, HyperRESEARCH 2.6, and Atlas.ti 5.0. In the end, we chose Atlas.ti because of the variety of audio formats it accepted, its video capabilities (possibly for future projects), and its price. All three software programs are similarly priced if purchased new (i.e., an educationally priced five-pack was U.S.$1,890), but the cost of upgrading from a previous version of the software was less than half the full cost (i.e., an educationally priced upgrade five-pack was U.S.$851). Ultimately, Atlas.ti also proved more appealing than its analysis counterparts because we had used it in other research studies and knew that we would require less training to get up to speed on its layout and functioning, at least from a text-based perspective.

Our second major decision concerned the purchase of digital recording equipment. After much research, we bought two Olympus recorders (DS-330, U.S.$150; DS-2000, U.S.$240); both came with 16 megabytes of built-in memory and more than 5 hours of recording time on long-play (LP) mode, but the DS-2000 also had a SmartMedia card slot for expandable memory. The digital recorders came with USB interface capabilities for quick and easy downloading of recorded data onto a computer. However, one thing to keep in mind is that Olympus digital recorders (like most others) use a compressed recording format (called DSS) that must be converted to an alternative audio format before being imported into analysis software. Atlas.ti 5.0 accepts a number of digital audio formats, including .wav, .au, .snd, and .mp3. In selecting between audio formats, researchers must consider file size and the amount of available hard drive storage space on the computer. To keep storage use to a minimum, we opted to save everything as .mp3 (a compressed file format); this reduces the file size to approximately one tenth the size of a .wav file while still retaining the data in a standard format that is playable on most factory-installed PC audio players. Although some DSS formats can be converted directly to .mp3, the Olympus version of DSS must first be converted to .wav before being converted to other audio formats (such as .mp3). This adds a brief but necessary step to the process of transferring digital files to your computer.
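The storage trade-off described above can be made concrete with a rough calculation. The sketch below is illustrative only: the sample rate, bit depth, and channel count are assumptions (the article does not state the recording settings), and the factor of ten is the approximate ratio quoted above rather than a measured value.

```python
# Rough storage estimate for uncompressed .wav versus .mp3 audio.
# ASSUMPTIONS (not from the article): 44.1 kHz, 16-bit, mono recording.

def wav_size_bytes(duration_s, sample_rate=44_100, bit_depth=16, channels=1):
    """Size of the uncompressed PCM audio data, ignoring the small file header."""
    return duration_s * sample_rate * (bit_depth // 8) * channels


def mp3_size_bytes(wav_bytes, compression_factor=10):
    """Approximate .mp3 size, using the 'about one tenth' figure as the factor."""
    return wav_bytes / compression_factor


one_hour_s = 60 * 60
wav = wav_size_bytes(one_hour_s)
mp3 = mp3_size_bytes(wav)
print(f"1 hour as .wav: about {wav / 1e6:.0f} MB")
print(f"1 hour as .mp3: about {mp3 / 1e6:.0f} MB")
```

Under these assumptions, an hour-long interview takes a few hundred megabytes as .wav and a few tens of megabytes as .mp3. Stereo or higher-resolution recordings scale the .wav figure proportionally.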

The conversion process

The audio conversion process is quickly and easily accomplished using Audacity 1.2.3 (http://audacity.sourceforge.net). Audacity 1.2.3 is a free audio editor that allows you to record, play, edit, mix, import, and export sounds in various formats. After downloading a DSS file from the recorder and saving it to your computer’s hard drive in .wav format (using the Olympus software), you need only open the file in Audacity 1.2.3 and resave it as .mp3. Using this process, a file can be converted from a “raw” DSS file to .mp3 in less than 5 minutes from start to finish.
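For more than a handful of recordings, the .wav-to-.mp3 step could also be scripted rather than done one file at a time. The sketch below is not the procedure described above: it swaps in the ffmpeg command-line tool for Audacity, assumes ffmpeg is installed with the libmp3lame encoder, and uses a 64 kbit/s bitrate chosen only for illustration. It still relies on the Olympus software to produce the .wav files first.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(wav_path, bitrate="64k"):
    """Return the ffmpeg argument list that encodes one .wav file as .mp3."""
    wav_path = Path(wav_path)
    mp3_path = wav_path.with_suffix(".mp3")
    return [
        "ffmpeg",
        "-n",                      # never overwrite an existing output file
        "-i", str(wav_path),       # input file
        "-codec:a", "libmp3lame",  # MP3 encoder (assumed to be available)
        "-b:a", bitrate,           # audio bitrate (illustrative value)
        str(mp3_path),
    ]


def convert_folder(folder, bitrate="64k"):
    """Convert every .wav file in `folder`; return the .mp3 paths created."""
    created = []
    for wav_path in sorted(Path(folder).glob("*.wav")):
        mp3_path = wav_path.with_suffix(".mp3")
        if mp3_path.exists():
            continue  # already converted on an earlier run
        subprocess.run(build_ffmpeg_command(wav_path, bitrate), check=True)
        created.append(mp3_path)
    return created


# Example call (the folder name is hypothetical):
# convert_folder("interviews_wav")
```

Keeping the command builder separate from the loop makes it easy to inspect the exact command before running it on real recordings.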

When we decided to make the transition to digital audio, our new research project was already in its pilot phase, so we gathered data solely on standard cassette tapes that were later converted to digital format. This process involved the use of the Audacity 1.2.3 software and what we fondly referred to as “the magic cable” (purchased for less than U.S.$5 at a local computer store); this cable has two male adapters, one to plug into the external speaker port on the cassette recorder and the other to plug into the microphone port on the back of the computer. Once digitized, all audio files were saved as .mp3s. The sound quality, however, was far superior with the Olympus digital recorders, as these eliminated all of the background white noise that is common with even the best analog tape recorders.

Because multiple digital audio copies can be made in very little time, the “raw” data were readily available to any team member who had access to a computer and a link to our network (in this case, everyone working on the project). The recordings still needed to be organized, which meant the creation of a new audio database, but they were almost immediately accessible to all of our researchers. The new audio files also required storage, though not of the traditional boxes-and-file-cabinet variety. To ensure that we had sufficient storage for our new audio files, we purchased an external hard drive (approximately U.S.$250 for a 250 gigabyte system), which we linked with our network. All researchers were given read/write access to the hard drive, and when copies of files were requested, CDs or DVDs of the recordings were made. Again, we saved significant time when dubbing digital audio, which could be transferred to CD in a matter of minutes, as compared to an hour or more for making copies of analog tapes.
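One lightweight way to approach the audio database mentioned above is a manifest file that lists each recording on the shared drive. The sketch below is an assumption about how such an index could look, not a description of the system we built; the column names are hypothetical.

```python
import csv
import hashlib
from pathlib import Path


def file_checksum(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large recordings are not loaded whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(audio_dir, manifest_path):
    """Write one CSV row per .mp3 file found under `audio_dir` (recursively)."""
    audio_dir = Path(audio_dir)
    rows = []
    for mp3 in sorted(audio_dir.rglob("*.mp3")):
        rows.append({
            "file": str(mp3.relative_to(audio_dir)),
            "size_bytes": mp3.stat().st_size,
            "sha256": file_checksum(mp3),
        })
    with open(manifest_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["file", "size_bytes", "sha256"])
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

The checksum column gives a simple way to confirm that a copy burned to CD or DVD matches the file on the network drive.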

Using digital audio data

We are still experimenting with the recently released Atlas.ti 5.0 with respect to the coding and manipulation of digital audio data. For now, we have forgone complete transcription and have opted instead to code or flag the audio data and (when necessary) partially transcribe and save these files as memos linked with the recording segment. As the cost and timing of transcription no longer determine what we consider to be “useful data,” we have found that we are using our data differently and possibly even more efficiently in its digital form. For example, we can now analyze intervention sessions shortly after they take place, which allows us to examine group dynamics and other information that had previously been gleaned only from individual interviews. We have also started to record and analyze planning and update meetings on the project, information that we had never before used as data.

Perhaps the biggest change in our research has been not just in what we choose to analyze but in how the analysis itself evolves. Anyone who has ever listened to the recording of an interview and then read a transcript of the same interview can attest to the fact that it is a very different experience. Actually hearing an individual’s speech patterns (such as pauses, emphases, intensity, and tonal changes) produces a completely different “quality” of data than the written word alone. This can be a benefit for some researchers but also a disadvantage for those who find they are more visually oriented; strategies for finding a balance between the audio and the written versions are important to consider.

Overall, switching to digital audio required an initial outlay of approximately U.S.$1,500 (in software and hardware), as well as dozens of hours spent exploring the various electronic components and figuring out how they could all work together. In the end, this has been a worthwhile investment of time, energy, and money.
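As a rough check, the itemized prices quoted earlier in the article add up to about this figure. The tally below assumes that the purchases were the Atlas.ti upgrade five-pack (rather than the full-price pack), both recorders, the external hard drive, and the cable at its stated upper bound of U.S.$5. The article does not give an itemized breakdown, so this grouping is an assumption.

```python
# Prices in U.S. dollars as quoted in the article.
# ASSUMPTION: the choice of items below is a reconstruction, not a stated breakdown.
costs = {
    "Atlas.ti upgrade five-pack (educational)": 851,
    "Olympus DS-330 recorder": 150,
    "Olympus DS-2000 recorder": 240,
    "External hard drive, 250 GB (approximate)": 250,
    "Audio cable (upper bound)": 5,
}

total = sum(costs.values())

for item, price in costs.items():
    print(f"{item:<45} {price:>6}")
print(f"{'Total':<45} {total:>6}")
```

Under these assumptions the total is U.S.$1,496, which is consistent with the approximate U.S.$1,500 stated above.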