I have spent lots of time at the intersection of real-time data and technology. Like 20 years, in fact. Things that absolutely amazed me on Wall Street – real-time stream processing, massive databases of well-structured and indexed data, co-location – amaze me no more. The funny thing is, however, that much of the stuff that Wall Street figured out two decades ago is making its way into the real-time web. It’s kind of like “back to the future” – we had glimmers of the future back in the 1980s as Wall Street pioneered techniques and technologies that are now being applied to the ad markets as well as other real-time environments. TIBCO? Instinet? These are the companies that should be given much of the credit for showing us the possible. And the merely possible is now a reality.
Data is accumulating at such a rapid rate that batch processing for many functions is grossly inefficient if not prohibitively costly. The amount of big iron – or cloud capacity – needed to analyze and index a day’s worth of a decent-sized ad exchange’s billions of impressions is mind-boggling. Nobody in their right mind would take this approach to data analysis. Instead, data is being analyzed on the fly with massive amounts kept in memory, eliminating the need for the scale of compute required to run terabytes of accumulated data through a batch process. Where did we learn this? Wall Street. Why? Out of necessity. The main difference between then and now is that quants who run models against historical data no longer need to run them overnight, walk away, and come back 10 hours later to get the results. They can often get real-time or near-real-time number crunching done on a massive scale by using the power of distributed computing, an approach that simply didn’t exist on a commercial basis back in the late 1980s and much of the 1990s.
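To make the "on the fly, in memory" idea concrete, here is a minimal sketch in Python. It is only an illustration, not how any particular exchange works: the event shape (a campaign name paired with a clearing price) and the names `RunningStats` and `aggregate_stream` are hypothetical. The point is that each impression updates a small running total as it arrives, so nothing has to be re-read in a nightly batch.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Per-key totals kept in memory and updated one event at a time."""
    count: int = 0
    total: float = 0.0

    def update(self, value: float) -> None:
        self.count += 1
        self.total += value

    @property
    def mean(self) -> float:
        # Guard against division by zero for a key with no events yet.
        return self.total / self.count if self.count else 0.0


def aggregate_stream(events):
    """Consume (campaign, price) pairs one at a time and return the totals.

    `events` can be any iterable, including a generator reading from a
    socket or message queue; the raw events are never accumulated.
    """
    stats = defaultdict(RunningStats)
    for campaign, price in events:
        stats[campaign].update(price)
    return dict(stats)
```

Feeding it a generator rather than a list keeps memory proportional to the number of campaigns, not the number of impressions, which is the whole appeal over running a day of accumulated data through a batch job.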
I see this trend playing out among the most sophisticated advertising technology companies as well as new entrants in the search space. Data is being treated as bits flowing across an exchange, actionable in real time while also being indexed and stored for historical analysis and algorithmic development. At infinity, the data architectures and analytic frameworks of Wall Street and the real-time web will become one and the same. The requirements of each will be identical. But it is important to remember our roots. Wall Street showed us the way. Make no mistake about that.