Introduction
Comcast Corporation is one of the largest providers of entertainment, information and communication products and services, with about 24 million cable customers, 15 million high-speed Internet customers, and 6.5 million phone customers. Its Comcast Interactive Media (CIM) division is chartered to build and grow the company’s Internet businesses. CIM’s Fancast.com, an extensive online video collection of TV shows, movies, trailers and clips, gets about 10 million unique users per month. Users can browse and search across the site’s 4M+ content items to find the entertainment they want.
Requirements/Challenges
Meet a performance goal of 20 ms per query at peak load, and scale to 1 million unique users per day
Provide a simple search interface while keeping deep customizability
Keep fixed and operational costs low
Deliver full functional search features
Challenges
Search is critical to Fancast’s business objectives: getting users to all the media content they want, as quickly and intuitively as possible. The search implementation had to meet three key challenges:
1. Provide a simple search interface, ideally one simple box, without sacrificing deep customizability, to consistently meet and exceed user needs without exposing users directly to content complexity.
2. Handle massive content scale, literally all TV and entertainment content, at response times suited to mass-market traffic and reach.
3. Achieve affordable fixed and operational costs in terms of dedicated development and support staff, and minimal additional hardware.
Functional and Performance Requirements
Fancast uses metadata from many different third-party sources such as IMDB.com (the Internet Movie Database) and Tribune Media Services. Each of these third-party sources has its own specific format, as well as differing content refresh schedules, and none includes a comprehensive metadata store with consistent data and descriptions. For example, the official Spider-Man movie titles from Marvel Entertainment use two hyphenated words, but most users enter them as one word, with no hyphen.
The ability to present an authoritative index was not only essential to the user experience, but also a key differentiator for the best search experience. Users searching for Jessica Simpson probably don’t want to end up with Homer Simpson.
In terms of performance, the goal was to grow from 50,000 to 1 million peak unique visitors per day in about 16 months. To ensure candidate search technologies could meet this goal, CIM defined a clear scaling metric: search query response under 20 ms per query at peak load, the same order of magnitude as for website interactions. Scaling and capacity targets were also set at the application server level so that a single physical application server could host multiple server instances, each with a similar scaling profile. This also simplified sizing for the operations team, who could calculate how many servers would be needed for a given number of users.
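The Spider-Man mismatch described above can be illustrated with a small sketch. It is not CIM's implementation; the source does not describe how the index was actually configured, and a search server such as Solr would normally handle this in its text-analysis settings rather than in application code. The function names and sample titles below are hypothetical.

```python
import re

def normalize_title(text: str) -> str:
    """Lowercase the text and drop hyphens, spaces and other punctuation."""
    return re.sub(r"[^a-z0-9]+", "", text.lower())

# Hypothetical sample titles, standing in for third-party metadata.
TITLES = ["Spider-Man", "Spider-Man 2", "Superman Returns"]

# Map each normalized form to the official titles that produce it.
INDEX = {}
for title in TITLES:
    INDEX.setdefault(normalize_title(title), []).append(title)

def lookup(query: str) -> list:
    """Return official titles whose normalized form equals the normalized query."""
    return INDEX.get(normalize_title(query), [])

if __name__ == "__main__":
    for q in ("spiderman", "Spider Man 2", "spider-man"):
        print(q, "->", lookup(q))
```

Because both the stored title and the query pass through the same normalization, the hyphenated official title and the unhyphenated user query meet at the same key. Exact-key matching like this is deliberately crude; it only shows why consistent normalization matters when sources disagree on spelling.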
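The sizing calculation mentioned above can be sketched as simple arithmetic. Only the 1 million users per day and the 20 ms target come from the case study; the searches per user, the peak-hour share, and the instances per server are made-up illustrative assumptions, and the per-instance throughput assumes a single worker answering queries one after another (1000 ms / 20 ms = 50 queries per second).

```python
import math

def servers_needed(daily_users, searches_per_user, peak_hour_share,
                   qps_per_instance, instances_per_server):
    """Estimate peak queries per second and the physical servers required.

    peak_hour_share is the fraction of a day's searches that fall in the
    busiest hour (an assumption, not a figure from the case study).
    """
    peak_qps = daily_users * searches_per_user * peak_hour_share / 3600
    qps_per_server = qps_per_instance * instances_per_server
    return peak_qps, math.ceil(peak_qps / qps_per_server)

# One serial worker at the 20 ms target completes at most 50 queries per second.
QPS_PER_INSTANCE = 1000 // 20

peak_qps, servers = servers_needed(
    daily_users=1_000_000,     # target from the case study
    searches_per_user=5,       # hypothetical
    peak_hour_share=0.10,      # hypothetical
    qps_per_instance=QPS_PER_INSTANCE,
    instances_per_server=2,    # hypothetical
)

if __name__ == "__main__":
    print(f"peak load ~{peak_qps:.0f} queries/s, servers needed: {servers}")
```

Fixing a per-instance scaling profile is what makes this a one-line calculation: once each instance is known to sustain a given rate within the latency target, capacity planning reduces to a division and a round-up.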
Testing & Evaluation
CIM shortlisted two search alternatives: Solr, the Lucene search server; and a large, well-known commercial search product. To pick the finalist, they created a test bed with indexes of both two million and four million documents deployed on each of the Sun x64 servers running Red Hat Linux. To review the results and optimize the Solr/Lucene search infrastructure, CIM hired Lucid Imagination. Consultants from the commercial vendor did the same with their solution. The CIM team benchmarked query response rates at different load levels, ranging from 100 to 1500 requests per second, and ran stress tests at failure envelope points.
The result: Solr outperformed the commercial search solution both in response rates and in failure-handling characteristics. There was no question that Solr could meet the performance targets.
CIM also compiled a list of 180 functional features for comparison. In addition to its superior performance, Solr also came out ahead on features and cost of ownership in meeting CIM’s business objectives.
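A check of benchmark latencies against the 20 ms target might look like the sketch below. The case study does not say which statistic CIM used, so the 95th percentile here is an assumption, and the latency samples are invented placeholders rather than measured results.

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% at or below it."""
    ordered = sorted(samples)
    rank = max(math.ceil(pct / 100 * len(ordered)), 1)
    return ordered[rank - 1]

def meets_target(samples, target_ms=20, pct=95):
    """True if the chosen percentile of latencies is under the target."""
    return percentile(samples, pct) < target_ms

# Invented latencies in milliseconds, keyed by requests per second.
# These are placeholders, not figures from the CIM benchmark.
RUNS = {
    100: [8, 9, 9, 10, 10, 11, 11, 12, 12, 13],
    1500: [12, 14, 15, 16, 17, 18, 18, 19, 19, 25],
}

if __name__ == "__main__":
    for load, latencies in RUNS.items():
        status = "pass" if meets_target(latencies) else "fail"
        print(f"{load} req/s: p95 = {percentile(latencies, 95)} ms ({status})")
```

Reporting a high percentile rather than the mean keeps a few slow queries from being hidden by many fast ones, which matters when the target is stated per query at peak load.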
The Choice For Solr
Solr made the final cut based on:
Performance and scalability advantages
Required search features
Organizational fit
Total Cost of Ownership
Active Solr/Lucene open source development community
Other large organizations that “bet the company” successfully on Solr (CNET, Netflix, MySpace, Orbitz)
In addition to the availability of community and commercial support, CIM benefited from Lucid Imagination’s deep search expertise to configure its Solr implementation in accordance with best practices and to optimize scalability.
“Hiring Lucid Imagination took a high-potential platform that our people liked, and turned it into a reliable, high-performance platform that really satisfied our business leadership.” Ranga Muvavarirwa, Director Product Planning, Comcast Interactive Media