Tips for making good IVR recordings for business PBX systems.

Ideally, you should be using a professional recording company (the likes of Muzak) to record high-quality, business-class IVR messages, greetings, etc.

However, if you are determined to roll your own, here are a few tips.

Use an actual PBX phone handset. If you’re using a VoIP system like Asterisk, you can use the Record() application, for example from a dedicated internal extension.

You will get better results that way than if you try to record sound files by outside means and convert them to the necessary format. Here’s why:

  • Direct-line handsets will have a certain level of ambient noise, hum, static, etc. associated with ordinary telephone conversation, but not the level of distortion one gets from the low-bandwidth codecs on cell phones. Nor will they have the last-mile noise introduced by copper lines when you call in via the PSTN from a cheap analog handset. So, at the other extreme, don’t record by calling in from the PSTN either.
  • Digital handsets still have the essential qualities of a traditional phone handset as an electromechanical device: the spectrum and quality of the recording that the diaphragm in the receiver produces are most closely aligned with the 3.1 kHz bandwidth that PSTN voice lines are traditionally engineered to carry. Thus, your output will most closely resemble the input that the playback needs.
  • There is always information loss involved in format conversion and quantisation. Asterisk, for example, has the ability to record in the native codec in which you are going to later issue the playback toward PSTN and VoIP callers (e.g. ITU-T G.711u), requiring no lossy conversion. Secondly, converting extremely rich, high-bitrate audio data with a high sampling rate to a codec of much narrower bandwidth (e.g. the Pulse Code Modulation used in G.711u, which is designed to be a digital bearer for PSTN-grade audio) and with logarithmic companding (the u-law / A-law part) can result in heavy content loss in the wrong places. This is why it’s hard to just take a high-quality MP3, convert it to a telephony codec, and expect your hold music to be discernible and pleasant to a PSTN caller. The conversion and companding process can mangle the desired psychoacoustic profile of the clip far more than simply recording it in the right codec to begin with. The general principle is that the less the bandwidth and content of the input differ from the bandwidth of the anticipated output medium, the less mangling and distortion will be introduced.
  • The sensitivity of the receiver in good handsets usually produces a more ample and adequate level of baseline volume, less likely to require upward boosting — and therefore less likely to introduce further distortions by amplifying the noise component of the signal.
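
To make the companding point above a little more concrete, here is a rough Python sketch of u-law companding followed by 8-bit quantisation. It uses the textbook continuous u-law curve with mu = 255; real G.711 encoders use a segmented approximation of that curve, so treat the numbers as indicative only. The function names are made up for this illustration and are not part of Asterisk or any library.

```python
import math

MU = 255  # u-law parameter (assumed: the textbook value associated with G.711u)

def mulaw_compress(x: float) -> float:
    """Map a linear sample in [-1, 1] onto the companded scale in [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mulaw_expand(y: float) -> float:
    """Inverse of mulaw_compress: companded value back to a linear sample."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

def roundtrip_8bit(x: float) -> float:
    """Compress, quantise to 256 uniform levels, then expand again."""
    y = mulaw_compress(x)
    code = round((y + 1.0) / 2.0 * 255)  # integer code in 0..255
    y_quantised = code / 255 * 2.0 - 1.0
    return mulaw_expand(y_quantised)

if __name__ == "__main__":
    # Print the absolute error for a quiet-to-loud range of sample values.
    for x in (0.001, 0.01, 0.1, 0.5, 0.9):
        out = roundtrip_8bit(x)
        print(f"in={x:<6} out={out:.5f} abs_err={abs(out - x):.5f}")
```

Running it should show the absolute error growing with the sample's amplitude: the codec keeps fine detail for quiet signals and spends far fewer levels on loud ones. That suits speech picked up by a handset, and it is one reason loud, dense, full-range source material tends to come out of the conversion sounding worse than you would expect.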

Otherwise:

  • Speak clearly and distinctly into the receiver, while holding it at an angle with respect to your mouth that minimises the interception of breath.
  • Avoid subtle background noises, including sighing, chair creaking, tapping, clicking, etc.
  • Speak slowly and enunciate clearly - but not in such a caricatured way that your recording sounds like a preposterous departure from real human speech possibilities.
  • Definitely follow a prepared written statement. While this will, in most cases, alter the tone of your speech to suggest more slimy “professionalism” and less “sincerity,” it will also even out the cadence of your speech, as well as prevent conspicuous pauses and hesitation while you remember or improvise.
