10/7/2005
12:55 PM
Darrell Dunn

In HAL's Footsteps

Real progress is being made in developing IT systems that do a better job of monitoring, analyzing, and fixing problems without human intervention.

Much of the underlying architecture is based on the Information Technology Infrastructure Library, a set of common practices used across areas such as change management, configuration management, and release management.

The deployment of autonomic-computing capabilities over the past year has let Carey Capaldi cut by 40% the time he spends manually digging through system-failure logs to understand why a problem happened. It also has let the product manager for the content-management system at Technicolor Creative Services create an automatic way to redeploy jobs that otherwise would be stalled for hours.

Technicolor Creative Services provides content-management capabilities to other business units within Technicolor, a major manufacturer and distributor of video tapes and DVDs, and for resale externally. Technicolor Creative Services--a subsidiary of Thomson, which provides technology and services to the entertainment and media industries--offers services such as the management of media files, including reels of film; encoding pay-per-view movies; and the creation of DVDs.

When Capaldi assigns jobs, a variety of events can trigger a failure and, historically, that has resulted in suspension of the job. In the majority of cases, once the failure is detected, the job can be restarted manually from the suspended queue and finished without further incident. However, many of the jobs had to run overnight, and if there was a disruption then, they could remain suspended until the problem was discovered the following day.

IBM contacted Capaldi and asked him to be a guinea pig in its autonomic-computing effort, using IBM's Autonomic Management Engine framework and Common Base Event, which monitors system resources, correlates information from various infrastructure components concurrently, and automatically determines the root causes of failures.

Using a log and trace analyzer tool, Capaldi can instantly gain access to custom logs that provide a detailed look at why failures happened. Taking such a look across his jobs to see specific points of failure saves time, he says. The real autonomic feature, however, is that the system can now resubmit the stalled job under specific criteria without Capaldi or his staff intervening.
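
The article doesn't spell out the criteria Technicolor uses, but the general idea of resubmitting a suspended job only when certain conditions hold can be sketched in a few lines of Python. Everything named here (the `Job` fields, the `RETRYABLE_CAUSES` set, the `MAX_ATTEMPTS` limit) is a hypothetical stand-in for illustration, not part of IBM's tools or Technicolor's system.

```python
from dataclasses import dataclass

# Hypothetical failure categories; the real criteria are not described in the article.
RETRYABLE_CAUSES = {"network_timeout", "storage_unavailable"}
# Assumed cap so a job that keeps failing is eventually left for a person.
MAX_ATTEMPTS = 3


@dataclass
class Job:
    name: str
    failure_cause: str
    attempts: int = 1
    status: str = "suspended"


def should_resubmit(job: Job) -> bool:
    """Return True when a suspended job meets the assumed retry criteria."""
    return (
        job.status == "suspended"
        and job.failure_cause in RETRYABLE_CAUSES
        and job.attempts < MAX_ATTEMPTS
    )


def process_suspended_queue(queue: list[Job]) -> list[Job]:
    """Requeue eligible jobs in place; return the ones left for manual review."""
    needs_review = []
    for job in queue:
        if should_resubmit(job):
            job.attempts += 1
            job.status = "queued"
        else:
            needs_review.append(job)
    return needs_review
```

Run against an overnight queue, a routine like this would restart the jobs whose failures look transient and leave the rest suspended for the staff to examine the next day.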

Technicolor Creative Services traditionally has written a lot of in-house software to aid in its effort to archive and manage the large amounts of digital content it handles, Capaldi says. In the future, he plans to create specific log files in new software that can be optimized to work with IBM's autonomic tools.

"Right now, a lot of this has to be tailored to exactly how you work as a company, and it would be nice if it was more off-the-shelf," he says.

Capaldi is ready to move further down the autonomic path. "In a heartbeat," he says. "I think there's a ton of potential that hasn't been tapped yet. Over the years, I've worked with a lot of bleeding-edge technology that eventually just didn't go anywhere, but this is an industry where you need to push the envelope."

The president and chief executive of LAN Solutions Inc., Victor Kellan, agrees. The company, which provides network-management services, saw growing opportunities to provide remote monitoring to customers as a managed service. Through trial and error, it built a network operations center. But as LAN Solutions grew, it experienced difficulties in quickly scaling the center to handle growing amounts of data going through the system.

When a problem happened, depending on its type, location, and complexity, it could take experts from several different areas to parse through thousands of log entries from databases, applications, Web servers, operating systems, or other network devices to find the problem's starting point and then determine a course of action. Typically, problem resolution was a time-consuming task accomplished by several people, each familiar with a specific type of log file.
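
The manual process described above, with several specialists each reading one kind of log, amounts to merging many time-ordered streams and looking for the earliest sign of trouble. A minimal Python sketch of that idea follows. It assumes each source has already been parsed into entries sorted by time; the `LogEntry` fields and the `"ERROR"` level are illustrative assumptions, not a format used by LAN Solutions or IBM.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple, Optional


class LogEntry(NamedTuple):
    timestamp: float  # seconds since the epoch (assumed common clock)
    source: str       # e.g. "database", "web", "os"
    level: str        # e.g. "INFO", "ERROR"
    message: str


def merge_logs(*sources: Iterable[LogEntry]) -> Iterator[LogEntry]:
    """Lazily merge per-source logs, each already sorted by timestamp."""
    return heapq.merge(*sources, key=lambda entry: entry.timestamp)


def first_error(*sources: Iterable[LogEntry]) -> Optional[LogEntry]:
    """Return the earliest ERROR entry across all sources, or None."""
    for entry in merge_logs(*sources):
        if entry.level == "ERROR":
            return entry
    return None
```

The earliest error is only a candidate starting point, not a confirmed root cause. Real correlation tools also have to cope with differing log formats and unsynchronized clocks, which this sketch ignores.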
