Archive for the ‘Telephony’ Category

Tips for making good IVR recordings for business PBX systems.

Saturday, May 17th, 2008

Ideally, you should be using a professional recording company (the likes of Muzak) to record high-quality, business-class IVR messages, greetings, etc.

However, if you are determined to roll your own, here are a few tips.

Use an actual PBX phone handset. If you’re using a VoIP system like Asterisk, you can use the Record() application with a dedicated internal extension or something like that.

You will get better results that way than if you try to record sound files by outside means and convert them to the necessary format. Here’s why:

  • Direct-line handsets will have a certain level of ambient noise, hum, static, etc. associated with ordinary telephone conversation, but not the level of distortion one gets from the low-bandwidth codecs on cell phones, or calling in to record via the PSTN from a cheap analog handset and dealing with the realities of last-mile noise introduced by copper lines. So, at the other extreme, don’t record by calling in from the PSTN either.
  • Digital handsets still have the essential qualities of a traditional phone handset as an electromechanical device, in that the spectrum and quality of the recording that the diaphragm in the receiver produces are most closely aligned with the 3.1 kHz of bandwidth that PSTN voice lines are traditionally engineered to bear. Thus, your output will most closely resemble the input that the playback needs.
  • There is always information loss involved in format conversion and quantisation. Asterisk, for example, has the ability to record in the native codec in which you are going to later issue the playback toward PSTN and VoIP callers (i.e. ITU-T G.711u), requiring no lossy conversion. Moreover, converting extremely rich, high-bitrate audio data with a high sampling rate to a codec of much narrower bandwidth (i.e. the Pulse Code Modulation used in G.711u, which is designed to be a digital bearer for PSTN-grade audio) and with logarithmic companding (the u-law / A-law part) can result in heavy content loss in the wrong places. This is why it’s hard to just take a high-quality MP3, convert it to a telephony codec, and expect your hold music to be discernible and pleasant to a PSTN caller. The conversion and companding process can mangle the desired psychoacoustic profile of the clip far more than simply recording it in the right codec to begin with. The general principle is that the less the bandwidth and content of the input differ from the bandwidth of the anticipated output medium, the less mangling and distortion will be introduced.
  • The sensitivity of the receiver in good handsets usually produces a more ample and adequate level of baseline volume, less likely to require upward boosting — and therefore less likely to introduce further distortions by amplifying the noise component of the signal.
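
The companding point above can be illustrated with a rough sketch. It assumes the continuous u-law formula with mu = 255 and a simple uniform rounding step, not the segmented 8-bit encoder that real G.711 implementations use, and the function names are purely illustrative. It compares the round-trip error of a quiet and a loud sample under u-law against plain linear 8-bit quantisation.

```python
import math

MU = 255  # companding constant for u-law (assumed continuous form, not the segmented G.711 tables)


def mulaw_compress(x: float) -> float:
    """Map a linear sample in [-1, 1] into the companded domain [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mulaw_expand(y: float) -> float:
    """Inverse of mulaw_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def quantise(value: float, bits: int = 8) -> float:
    """Round a value in [-1, 1] to a uniform grid with 2**(bits-1) - 1 steps per side."""
    levels = 2 ** (bits - 1) - 1
    return round(value * levels) / levels


def mulaw_roundtrip(x: float) -> float:
    """Compress, quantise to 8 bits, then expand again."""
    return mulaw_expand(quantise(mulaw_compress(x)))


def linear_roundtrip(x: float) -> float:
    """Quantise to 8 bits with no companding at all."""
    return quantise(x)


def compare(x: float) -> tuple[float, float]:
    """Return (u-law error, linear error) for one sample."""
    return abs(mulaw_roundtrip(x) - x), abs(linear_roundtrip(x) - x)
```

Calling `compare(0.01)` and `compare(0.9)` shows the trade-off: the quiet sample survives u-law far better than linear quantisation, while the loud sample comes back less accurately. Resolution is spent on low amplitudes, where speech lives, which is part of why wide-range music converted after the fact tends to suffer.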

Otherwise:

  • Speak clearly and distinctly into the receiver, while holding it at an angle with respect to your mouth that minimises the interception of breath.
  • Avoid subtle background noises, including sighing, chair creaking, tapping, clicking, etc.
  • Speak slowly and enunciate clearly - but not in such a caricatured way that your recording sounds like a preposterous departure from real human speech possibilities.
  • Definitely follow a prepared written statement. While this will, in most cases, alter the tone of your speech to suggest more slimy “professionalism” and less “sincerity,” it will also even out the cadence of your speech, as well as prevent conspicuous pauses and hesitation while you remember or improvise.

GWT, EVA, and treating my NIHS.

Wednesday, January 16th, 2008

As I age and develop more and more pragmatic realisations about engineering project management and what it takes to make a task not only technologically, but economically viable, I find that I am growing more successful in my battle with a syndrome from which I suffer more than most people — Not Invented Here Syndrome, otherwise known as reinventing the wheel.

NIHS is a particularly bad disease to have concomitantly with strong purist sensibilities toward programming as an art and a discipline, not merely a tool for accomplishing certain ends in a functional, utilitarian kind of way. (The latter, with some notable exceptions, is the dominant conception of it in the world of business. It’s “programming,” you see, not “software engineering.”) This leads to a basic intuition that does make business sense: the fundamental need for frameworks, libraries, decomposition, and code reuse. The reasons may boil down to a dilution of the economic facts with considerations of elegance, artfulness, simplicity and conceptual integrity on a detailed level, but the result is the same.

However, the interaction of these sensibilities with a strong case of NIHS is particularly harmful in that it leads one to assume the task of developing such tools entirely on one’s own. I cannot recall how many ambitious open-source projects of mine have gotten bogged down in the need to fulfill a prerequisite gap with APIs, libraries, toolkits, utility routines, interfaces, hooks, etc., and thus never shipped — never saw the light of day.

So, I am currently working on EVA (Evariste Voice Arbiter), a SIP-based hosted VoIP billing / mediation solution based on OpenSER. It is most immediate in the Evariste software product pipeline. I conceived of it back in August and had spent much of the fall building the backend [1] (primarily OpenSER and PostgreSQL stored procedures) on top of my day job, taking care to provision decent business infrastructure to support it along the way, including a bug tracking/issue tracking/development workflow management system, a relatively detailed specification, and concrete, enumerated project milestones, and indeed, even some testing methodologies.

When I finally got to the part where I started working on the front-end - the GUI for the web application, and the associated web services that bind it to the backend - it was my natural inclination as an NIHS sufferer to conceive of building an in-house PHP/AJAX framework with high-level web service interface capabilities using JSON as a transport. I began to develop it slowly and aimlessly, calling it EvPHPTK, and even meticulously documented the API and developed unit tests for every component.

Earlier this year, despite entreaties from Storm to consider it, I rejected the use of Symfony as a development framework for the PHP front-end. I also generally have taken a very sceptical view of dozens of toolkits with prebuilt PHP and AJAX widgets, such as Dojo. This has less to do with aversion to frameworks as such and more with the fact that I simply found them aesthetically displeasing from the vantage point of my purism on various fine points.

Long story short, I had some time to actually consider this in detail over the Christmas holiday and came to an epiphany that led me to throw away EvPHPTK and score a major goal against my NIHS. I am going to use Google Web Toolkit to build the front-end.

It’s a rather radical decision, all things considered, but I’d like to take a moment to elucidate why I think it is meritorious from a business and technical standpoint. It was not without considerable pain that I abandoned a darling premise of my software development methodologies and sold out to the exigencies of practicality.

The reality is, however, that Evariste is not in the web application framework business. A framework is neither a product nor a core competency. And if EVA and the other ambitions I have for this business are to ever be realised, they are not going to materialise by sticking to my incumbent ways. I do not actually have time to write umpteen gazillion lines of JavaScript plumbing for XMLHttpRequest callbacks or intricate, application-specific DOM manipulation. I’m just way past any interest in that whatsoever. I don’t care.

For those not wholly familiar, GWT is a free web application development toolkit from Google that allows one to develop GUIs in Java, using API idioms and design patterns essentially familiar to those who have implemented GUIs in Swing or its legacy predecessor, AWT.

A cross-compiler then generates the appropriate JavaScript to perform all the necessary feats of “AJAXian” DHTML, DOM manipulation, and other interface mechanics. One of the key “selling points” of the framework is that this is done with a view toward the lowest common denominator of browser compatibility in this area, theoretically preventing one from worrying about the interoperation of the JavaScript with particular browsers or the handling of arcane browser quirks. To the extent that there is still room for incompatibility, it is going to be on the CSS side, as the framework requires all GUI elements to be styled outboard rather than nativising CSS manipulation via the DOM through its API routines.

The framework also implements a substantial subset of the core Java runtime environment (JRE) API, including support of most data primitives, collections, data structures, and associated manipulation classes. Thus, it is quite possible to accomplish many things programmatically within the GWT-powered application code apart from drawing the interface itself, to the extent that it is possible to accomplish anything of substance in Java on the application level.

In many ways, I loathe Java. What’s more, despite some experience with it (a large part of it in a second-level introductory CS course — the only one I ever took), I am not fundamentally a Java programmer by pedigree, and cannot claim any exceptional competency in it. However, I have come around to the persuasion that it is fundamentally good for one thing, and one thing only: building GUIs. This helps me see the value in GWT.

GWT does other neat things. It is very web-services oriented, and supports JSON readily and extensively. This makes it possible to readily couple it to a web services backend in another language; I certainly do not intend to do the web services backend in JSP (yuck!). Most likely the backend mediation will take place via a Perl dispatcher (with mod_perl). Thus, I have the satisfaction of knowing that neither the resulting generated code for the front-end nor the backend RPC callbacks will ultimately be in Java. The Java code ends up being more of a dialect - a descriptor - for the GUI than a runtime determinant of it.

Clint inquired why precisely I do not consider using one of the PHP frameworks out there that accomplish similar things. The reasons are manifold. For one, I just do not like the frameworks that I have seen. They strike me as very incomplete in their functionality and design methodology. I see one challenge of the development process handled very gracefully and artfully, and another one heavily stilted or left out in the cold entirely. A lot of them also seem very fly-by-night and whimsical, both in their essential conception and in their development history, and I am not endeared to the idea of adopting them as a dependency for a mission-critical commercial application.

Another big reason is that I am increasingly dissatisfied with PHP as a way to develop web applications. Rather unlike some of what’s been related to me in other jeremiads against PHP from folks like Brian or Jonathan, my reasons have little to do with its performance characteristics, scalability, its technical robustness, or compatibility issues.

It’s more that I just don’t see, in what is fundamentally a scripting language, a way to design enterprise-worthy applications that appeals to me. If web applications are the new fashion, and nobody does standalone GUIs anymore, that’s good and fine; please, sign me up for the revolution, rifle in one hand, constitution in the other. But in that case, I want to be able to harness the paradigmatic benefits of doing a standalone GUI inside a web application context, from the ground up, end-to-end. Why does the development process have to be reduced to scriptable particles inside HTML pages? It’s high time I had my cake and ate it too.

In other words, I really shouldn’t have to write any HTML, use any templates, or provide bindings between static content and dynamic constructs at all if I am writing an application. That’s what appeals to me about using GWT; it lets me truly build a completely dynamic interface from scratch, and achieve a more wholesome, genuine separation of interface from underlying functionality — the ideal aim of all good programming.

It’s going to be a tough learning curve, for sure. Being neither a Java programmer nor experienced with GWT, I have little sense of design patterns or best practices or various other methodological intangibles that (unlike the basic learning of a programming language’s grammar and syntax) take years to achieve competency in. But I am very excited and optimistic about choosing GWT to drive my “web 2.0” applications.

[1] Yes, it is very sad that it takes me several months to do that. I used to be able to pull off such feats in a few days of solid, sustained work, but this isn’t high school anymore. :-(

Memo to mobile handset vendors: what about MY smug, self-important delusions of entitlement and consumer empowerment?

Monday, January 7th, 2008

Somewhere along the way, I became a relatively heavy user of mobile text messages. The geek factor probably plays a heavy role in this, as does the need to furtively communicate short thoughts while occupied at work. A girlfriend — and some friends — with a proclivity for the epistolary probably doesn’t hurt either.

Even so, I’ve never had a fancy mobile handset with a built-in QWERTY keyboard except for work-related purposes. In various jobs I’ve held, I’ve had a T-Mobile Sidekick II, a Blackberry, etc. For the most part, I’ve had to contend with banging my messages out on fairly conventional low-end flip phones, the kind that come as a free upgrade on a contract renewal with most major carriers.

Despite the awkward word entry scheme and seeming user interface limitations, I’m actually fairly comfortable with it and have gotten decently efficient at it. My gripe is not with the fundamental paradigm. I know that if I really wanted to type a lot of words into a mobile device, I should just get a PDA or a sophisticated data/voice portable of the variety mentioned above.

Here’s what I’d like to see in the SMS interface implementations in common consumer-grade handsets, in my ideal world:

  1. I employ proper capitalisation, punctuation, spelling, grammar and symbols in my messages. The interface should not be designed to discourage that. I want easy access to hyphens, semicolons, percentage signs and ampersands, not to navigate through fifteen menus to introduce them and derail my train of text-thought.
  2. If you’re going to make a good predictive dictionary (“T9”/“T9Word”), make a good predictive dictionary. I cannot count the number of things I have to switch to cumbersome manual entry for. Why aren’t most proper nouns, including names of famous people, places, or things, in the dictionary? Countries like Kazakhstan, people like Mobutu Sese Seko or Kim Jong-Il or Jean-Paul Sartre, or trade names like Pyrex? Solid-state memory gets cheaper and cheaper by the day, folks.
  3. I do not like how entering a wrong letter that forks off the predictive tree leaves me having to re-enter the entire word as opposed to merely correcting the mishap in the ending, etc. It seems that if I am trying to spell “something” and accidentally enter “somethhg,” I cannot delete three characters and recover the intended ending — even if it is one that would otherwise be a reachable leaf in the predictive tree.
  4. Why does the interface fail to make the entirely reasonable assumption that the first character of the next word following a period (punctuation symbol: .) ought also to be capitalised? By default, mixed-capital entry mode forces the outcome to “She sells sea shells. by the sea shore.”
  5. As a technology professional and a business enthusiast, I employ a lot of acronyms. But you don’t have to live in an acronym-ridden day-to-day world of banality to use them. I don’t expect the dictionary to know acronyms. But please don’t try to squash or mangle them into something other than what they are actually intended to be based on some erroneous assumption that normal people don’t use acronyms. Yes they do.
  6. This latest phone I have requires the navigation of far too many menus for the simple purpose of sending a message.
  7. Do we really need a 160-character payload limit in the SMS protocol specification? Maybe it’s just that the methodologies of writing for these presidium.org people have slowly rubbed off on me, since they pay me by the word, but there is very little of communicative significance or value that I can say in 160 characters or fewer.
  8. The predictor needs to be a lot better at learning my entry habits, and knowing which word I tend to use most of the time when entering sequences that map to multiple permutations.  95% of the time that I enter 7666, I mean “soon,” not “room.”
  9. The predictor needs to meticulously assimilate every new word I manually enter into its dictionary. It should automatically know when I have switched out of predictive and into manual entry mode to compensate for its ignorance and take careful note of what I am proceeding to type. Sometimes it does this, sometimes it doesn’t.
  10. The lexicon of the predictive dictionary should be cool, hip, trendy, with-it and modernity-affirming, and come prebuilt with words like “blog.”
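
The behaviour asked for in points 3, 8 and 9 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Predictor` class and its method names are hypothetical, the dictionary is a plain in-memory map rather than whatever a real handset uses, and only the letters a-z on the standard 12-key layout are handled.

```python
from collections import defaultdict

# Standard 12-key letter layout; digits 0 and 1 carry no letters here.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_DIGIT = {
    letter: digit for digit, letters in KEYPAD.items() for letter in letters
}


def to_digits(word: str) -> str:
    """Translate a word into the key sequence that types it (a-z only)."""
    return "".join(LETTER_TO_DIGIT[ch] for ch in word.lower())


class Predictor:
    """Hypothetical predictive dictionary that ranks words by how often they are used."""

    def __init__(self, words):
        # key sequence -> {word: use count}
        self.counts = defaultdict(dict)
        for word in words:
            self.learn(word, uses=0)

    def learn(self, word: str, uses: int = 1) -> None:
        """Record a use of a word, adding it to the dictionary if it is new."""
        bucket = self.counts[to_digits(word)]
        bucket[word] = bucket.get(word, 0) + uses

    def candidates(self, digits: str) -> list[str]:
        """Words that exactly match the key sequence, most used first."""
        bucket = self.counts.get(digits, {})
        return sorted(bucket, key=lambda w: -bucket[w])

    def completions(self, prefix: str) -> list[str]:
        """Words whose key sequence starts with the given prefix."""
        return [
            word
            for digits, bucket in self.counts.items()
            if digits.startswith(prefix)
            for word in bucket
        ]
```

With a dictionary of “room” and “soon”, both map to 7666; after a few calls to `learn("soon")`, `candidates("7666")` puts “soon” first. Calling `learn("blog")` after a manual entry makes 2564 resolve from then on, and because lookups are keyed on the digit string, deleting the last three keys of a mistyped “something” just re-queries the shorter prefix via `completions("76638")`.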

That is all. k00l? thx 4 listenin, k… u 2 bye.

No, I am not the one!

Friday, December 14th, 2007

I’m not trying to call anyone out, but I got an employment solicitation from a recruiter a few days ago that may strike those of you who heard me rant about fighting with the BroadSoft in a previous line of employment as slightly ironic:

I would like to inform you that we have an opportunity to add additional personnel to our engineering staff. We are specifically trying to find a Broadsoft VoIP Expert. Are you the one?

Santa María, ¡que Dios me bendiga! No, I am not the one. I am never touching that piece of equipment again. Or hopefully anything else like it.

OpenSER 1.3: Should be interesting.

Wednesday, December 12th, 2007

Bogdan-Andrei Iancu says: “As scheduled, tomorrow will be the official release of OpenSER version 1.3.”

I’m pretty stoked. Every point release of OpenSER has brought with it new changes that make it ever more useful to me in my implementational work, additional useful modules, and other things that together conspire to make it a great SIP proxy.

Here is a pretty good rundown of the essential differences.

Incidentally, I am not the only one, after all, to have picked up on the fact that avp_db_query() - an obscure function implemented by the avpops module - may be the single most useful thing to have made its appearance in OpenSER in recent memory. Someone else has too.

Before avp_db_query(), for all the things one could do with OpenSER, direct interaction with an RDBMS via SQL and using the returned tuples in call processing logic was not one of them; the best one could do was use one of the existing database schemas, such as the table used for storing key/value pairs for avpops, or perhaps the lcr (Least Cost Routing) module. Incidentally, EVA (Evariste Voice Arbiter), the billing mediation platform, is built atop OpenSER and heavily relies on this database interface.

UPDATE 2007-12-13: Here are the complete 1.3 release notes.