The benefits of automating IT operational processes are well understood and hard to ignore. These include improving application service levels, increasing operational efficiencies, incorporating best practices, improving IT agility and proving IT accountability to the business. However, until recently, implementing automation to achieve these benefits has proven costly and difficult to maintain.

 

Today, IT organizations have access to a better method of making essential tasks and processes more efficient - an IT automation platform.

 

Technologies that automate data center functions are garnering a significant amount of attention from the industry. Given the recent high-profile acquisitions in the space and the perspectives among the analyst community, it’s becoming clear that IT automation technology will play a major role in IT and data center strategy going forward. In fact, Gartner believes that through 2012, of all IT operations management tools, investments in automation technology will have the highest return on investment (ROI) [1].

 

Current State of the IT Management Model

 

In any midsized or large enterprise environment, IT has expanded to serve as the backbone for most business services, and has become increasingly complex as a result. Combating this complexity, IT decision-makers have struck back by adding multiple management tools and systems within specific organizational silos to solve the problem du jour. This medley of tools, each with its own infrastructure and little or no interoperability between them, has compounded the management issue.

 

Today, many enterprise IT departments are forced to deal with a myriad of point tools that manage very narrowly defined functions - a costly and inefficient scenario. Ironically, this mix of tools has tended to increase the management burden instead of alleviating it.

 

Further complicating matters, most of today’s mission-critical applications operate on multiple tiers, touching a range of servers and operating systems, and in many cases, multiple databases. The operational processes for supporting these applications typically span multiple organizational boundaries. Additionally, most mission-critical applications are highly customized and unique to each individual environment. As a result, there are no dedicated point tools to automate the processes that support these key applications.

 

Because the existing tool infrastructure was not conceived with these complex applications and processes in mind, administrators have written scores of unmanaged shell scripts in an attempt to make tools more efficient or fill the gaps where point tools don’t operate. Typically, there is no institutional policy regarding the writing and proliferation of scripts (small applications developed in interpreted languages such as Perl, Korn shell, VBScript, etc.), so in most organizations, they are created ad hoc by an administrator or engineer and then placed onto the servers where they need to run. It’s not uncommon to see dozens of scripts on every server in an enterprise.

 

For many organizations, this adds up to tens of thousands of unmanaged script files distributed throughout the environment. Script failures are daily occurrences, and tracking down the cause of each failure and then performing workarounds consumes even more resources. Additionally, since many of these scripts contain embedded permissions in clear text, the organization is potentially exposed to security and compliance vulnerabilities.

 

Legions of staff perform innumerable human-to-machine interactions orchestrating scripts and tools across database, operating system and organizational boundaries. Each of these interactions represents an opportunity for error.

 

Most IT executives concur that the traditional “systems management” model for IT has run its course and has actually become a barrier to scale with today’s complex composite applications. The current model is built on the premise of people reacting to incidents or problems. Tools have been purchased that predominantly serve to provide these operational support staffs with data that indicates operational anomalies, but these staffs are now inundated by so much data from these tools that they spend most of their time trying to determine what’s relevant and important.

 

The current IT management model is rife with error resulting from the excessive number of human touchpoints involved as well as the proliferation of rogue scripts. And the costs associated with this model are astounding - typically millions of dollars in software licenses, an ever-increasing budget tax from annual maintenance fees, and the personnel costs related to large support staffs.

 

Autonomic Management: A New Way to Manage IT

 

The emerging state of the art for IT process automation enables enterprise IT organizations to completely transform from the current reactive management model built upon legions of tools, staff and scripts to a model built upon autonomics and tailored to the needs of individual business applications. When an IT automation platform is used to “marry” autonomics to the business applications, each application and its underlying computing infrastructure become self-healing, self-configuring and self-optimizing for all but the rarest of conditions. The support model in this case is reduced to a lean staff of talented individuals who analyze and diagnose conditions that appear for the first time, then model those conditions into the autonomic system so they don’t recur. In the end, an autonomic approach not only improves application service levels significantly, but does so while simplifying and consolidating support staff, tools and scripts.
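
The support model described above, in which known conditions are remedied automatically and first-time conditions go to staff who then model them, can be sketched roughly as follows. This is only an illustration: the class name AutonomicHandler, its methods and the "expired_password" condition are hypothetical and do not come from any particular automation product.

```python
from typing import Callable, Dict, List


class AutonomicHandler:
    """Maps known conditions to remedies; unknown conditions go to staff.

    Hypothetical illustration, not the API of any real platform.
    """

    def __init__(self) -> None:
        self.remedies: Dict[str, Callable[[], None]] = {}
        self.escalated: List[str] = []

    def model(self, condition: str, remedy: Callable[[], None]) -> None:
        """Record a remedy once staff have diagnosed a new condition."""
        self.remedies[condition] = remedy

    def handle(self, condition: str) -> str:
        """Apply the modeled remedy, or escalate a first-time condition."""
        remedy = self.remedies.get(condition)
        if remedy is None:
            self.escalated.append(condition)
            return "escalated"
        remedy()
        return "healed"


handler = AutonomicHandler()
actions: List[str] = []

# First occurrence: nothing is modeled yet, so the condition is escalated.
first = handler.handle("expired_password")

# Staff diagnose it and model a remedy into the system.
handler.model("expired_password", lambda: actions.append("rotate password"))

# Next occurrence: the system applies the remedy without human involvement.
second = handler.handle("expired_password")
```

The point of the design is that human effort is spent once per kind of condition rather than once per incident.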

 

Consider the case of a data warehouse (DW)/business intelligence (BI) process at a major communications service provider. In this particular case, the DW/BI process was built around billing data - more specifically, call records. The process involved numerous steps beginning with moving data from several sources into a data staging area, then performing quality assurance (QA) to ensure that the data was complete. The next step in the process was to initiate extract, transform and load (ETL) processing to transform and normalize the data. After the data was transformed, it was moved into a single data warehouse where additional ETL processing took place in order to move subsets of the normalized data into purpose-built data marts so that business units could perform their analysis. These included data marts for bill generation, fraud analysis, network capacity planning and marketing programs.

 

This end-to-end cycle involved hundreds of steps across several databases and about twenty servers. While the ETL and BI steps were automated, much of the remaining overall cycle was performed manually. Unfortunately, this process invariably failed, making the data in the data marts obsolete or completely inaccurate. There were many causes of these failures: human error; missing execute permissions in a script; expired passwords; an extract process fired off against an unavailable data source; and insufficient processing capacity to perform the ETL crunching in the allotted time, to name a few.

 

On any given day, the twenty-person support staff spent the entire day orchestrating the overall process, locating the most recent breakdown and then recovering from it in time to ensure that the data didn’t become stale.

 

By deploying an IT automation platform, this company was able to implement an autonomic orchestration of the entire end-to-end DW/BI process and significantly reduce failures. Experience had taught them that one of the leading causes of failure was source data that was unavailable when the ETL process was fired off, so they modeled into their process an automated routine to verify the presence and accuracy of source data prior to beginning the ETL. Similarly, they built an autonomic routine to assess the available computing capacity and then provision appropriate capacity to ensure that the ETL process could complete in the allotted time frame.
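
A minimal sketch of those two routines might look like the following. It assumes a much simpler setting than the one described: every name here (SourceStatus, preflight, workers_needed, run_cycle), the row-count test for completeness and the linear capacity model are illustrative assumptions, not details of the company's actual implementation.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SourceStatus:
    """Hypothetical snapshot of one source feeding the staging area."""
    name: str
    available: bool
    row_count: int
    expected_min_rows: int


def preflight(sources: List[SourceStatus]) -> List[str]:
    """Return a list of problems; an empty list means the ETL may start."""
    problems = []
    for s in sources:
        if not s.available:
            problems.append(f"{s.name}: source unavailable")
        elif s.row_count < s.expected_min_rows:
            problems.append(
                f"{s.name}: {s.row_count} rows, expected at least {s.expected_min_rows}"
            )
    return problems


def workers_needed(total_rows: int, rows_per_worker_hour: int, window_hours: float) -> int:
    """Smallest worker count that can process total_rows within the window.

    Assumes throughput scales linearly with workers, which is a simplification.
    """
    capacity_per_worker = rows_per_worker_hour * window_hours
    return max(1, math.ceil(total_rows / capacity_per_worker))


def run_cycle(
    sources: List[SourceStatus],
    rows_per_worker_hour: int,
    window_hours: float,
    start_etl: Callable[[int], None],
) -> Dict[str, object]:
    """Verify sources, size the capacity, and only then start the ETL."""
    problems = preflight(sources)
    if problems:
        return {"started": False, "problems": problems}
    total_rows = sum(s.row_count for s in sources)
    workers = workers_needed(total_rows, rows_per_worker_hour, window_hours)
    start_etl(workers)
    return {"started": True, "workers": workers}


started_with: List[int] = []
good = [
    SourceStatus("switch_a", True, 1_000_000, 900_000),
    SourceStatus("switch_b", True, 500_000, 400_000),
]
bad = [SourceStatus("switch_a", False, 0, 900_000)]

ok_result = run_cycle(good, 200_000, 2.0, started_with.append)
blocked_result = run_cycle(bad, 200_000, 2.0, started_with.append)
```

In the first call the ETL starts with the computed number of workers; in the second it never starts, which is the behavior that prevents an extract from firing against an unavailable source.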

 

This enterprise used a sophisticated IT process automation platform to completely automate the end-to-end DW/BI cycle and embed autonomics, thereby making the process self-healing and self-optimizing in all known scenarios. The end result is that the data in the data marts is now timely and accurate, and the support staff around this process was reduced from 20 to three people.

 

The Value Hierarchy

 

An IT automation platform can provide an IT organization with three distinct value propositions: integration (also referred to as run book automation, or RBA), orchestration and autonomics. RBA has emerged as a tactical solution to deliver automated IT process workflows and integrate disparate tools across IT processes. RBA solutions are seen as an integration layer between current management tools and have been predominantly sold as out-of-the-box automation for common problem and change-management processes. This corresponds nicely with the Information Technology Infrastructure Library (ITIL) movement, as it is primarily concerned with the common practices around supporting IT infrastructure.

 

Beyond integration is orchestration. Orchestration implies processes that cross technological or organizational boundaries. One example is a daily checklist process used to ensure that the core computing environment is properly prepared before employees arrive at work each morning. This process might involve verifying the performance conditions of the computing hardware, checking data files for availability and accuracy and verifying the presence of certain application processes. Typically such a checklist would involve a few staff members closely coordinating with each other and kicking off scripts within their own organizational silos in the necessary sequence. When an IT automation platform is used for orchestration, entire processes can be automated regardless of technology or organizational boundaries. This tends to dramatically reduce instances of human error that would otherwise be associated with the typical cross-silo coordination process.
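
As a rough illustration, the morning checklist could be expressed as one ordered sequence of checks that a single process runs across silos. The function name run_checklist, the silo labels and the stand-in probes below are hypothetical; in practice each probe would call whatever tool the owning team already uses.

```python
from typing import Callable, List, Tuple

# Each check is (owning silo, description, probe returning True when healthy).
Check = Tuple[str, str, Callable[[], bool]]


def run_checklist(checks: List[Check]) -> List[Tuple[str, str, str]]:
    """Run checks in order and stop at the first one that does not pass.

    Later checks are skipped because they are assumed to depend on earlier ones.
    """
    results = []
    for silo, description, probe in checks:
        try:
            ok = probe()
        except Exception as exc:  # a probe that crashes counts as a failure
            results.append((silo, description, f"ERROR: {exc}"))
            break
        results.append((silo, description, "OK" if ok else "FAILED"))
        if not ok:
            break
    return results


# Stand-in probes; the second one simulates a missing data file.
morning_checks: List[Check] = [
    ("hardware", "server performance within limits", lambda: True),
    ("data", "nightly data files present and accurate", lambda: False),
    ("applications", "required application processes running", lambda: True),
]

report = run_checklist(morning_checks)
```

Here the report ends at the failed data check, so nobody has to relay that result by hand to the next team before its step is held back.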

 

The highest value of IT process automation is to completely transform a reactive management model into one based on autonomics. The concept of autonomic computing, made possible through an IT automation platform, is that the systems themselves should identify anomalies and be self-healing and self-optimizing.

 

Autonomic management driven by the needs of business applications is the model of the future; it can reduce inefficiency, increase application uptime and allow IT staff to concentrate on tasks and projects that directly impact the bottom line.

 

To realize this future of autonomic management, organizations need to view IT process automation technology as a “fabric”: a platform that spans organizational silos and is tightly coupled to business applications and the underlying computing infrastructure. This fabric provides a common method for performing all kinds of tasks across multiple machines - or applications - eventually reducing organizations’ reliance on a myriad of tools.

 

Required Elements of an IT Automation Platform

 

Essentially, an IT automation platform is an environment within which IT organizations develop, administer and execute autonomic operational processes. This automation model is a superset of what RBA provides and can address the very complex production processes generated by core applications.

 

One example of a related concept is WebSphere or WebLogic. These are infrastructures or environments used to develop and implement Web services applications in the enterprise. Similarly, an IT automation platform is an infrastructure or environment for developing and implementing autonomic services or automated IT operations and production processes across the entire enterprise. Specifically, such a platform must include a rich development environment to create autonomic services, a central repository and administrative environment to allow best practices around automation, and finally, a sophisticated run-time environment for executing autonomic processes.

 

Rich Development Environment

 

A robust autonomic model or solution should include a set of prebuilt automated tasks and open interfaces that allow for rapid development of complex automated processes and easy integration into virtually any existing system or tool in an enterprise IT environment. Most organizations using an IT automation platform can automate complex processes with a 70 to 80 percent reduction in the actual coding it would take to automate those processes through traditional scripting.

 

Another important element of a complete autonomic solution is a flexible graphic designer. This makes it easy to implement sophisticated logic, including variable passing, conditional testing and multithreaded parallelism, enabling straightforward modeling of even the most complex IT operational and production processes. This is an important difference relative to the more traditional approach of RBA. Many RBA tools focus on out-of-the-box automation for the most common IT support processes, such as server administration and related maintenance tasks, but they do this at the expense of the agility necessary to automate more complex processes that are unique to any organization’s mission-critical applications.
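
The three kinds of logic named above (variable passing, conditional testing and parallelism) can be illustrated in code, even though a graphic designer would express them visually. The sketch below is an assumption-laden stand-in: run_flow, the disk-usage scenario, the server names and the threshold value are all invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def run_flow(
    servers: List[str],
    check_disk: Callable[[str], int],
    cleanup: Callable[[str], None],
    threshold: int = 90,
) -> Dict[str, object]:
    """A small flow: parallel checks, then a conditional cleanup step."""
    # Variable passing: one context dictionary carries values between steps.
    context: Dict[str, object] = {"threshold": threshold}

    # Parallelism: query every server at once; map() keeps the input order.
    with ThreadPoolExecutor(max_workers=4) as pool:
        usage = dict(zip(servers, pool.map(check_disk, servers)))
    context["usage"] = usage

    # Conditional testing: only servers above the threshold are cleaned up.
    to_clean = [name for name in servers if usage[name] > threshold]
    for name in to_clean:
        cleanup(name)
    context["cleaned"] = to_clean
    return context


# Stand-in data and actions so the sketch runs without real servers.
fake_usage = {"db01": 95, "web01": 40, "app01": 91}
cleaned_log: List[str] = []

outcome = run_flow(
    ["db01", "web01", "app01"],
    check_disk=lambda name: fake_usage[name],
    cleanup=cleaned_log.append,
)
```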

 

Best Practices Administration Environment

 

Another key element in a complete autonomic solution is a central repository for all automated processes. This central repository eliminates the need to install scripts on each individual machine. Instead, the system performs run-time distribution, blasting appropriate actions out to target machines and deleting them after they have completed their jobs.
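
The idea of run-time distribution can be sketched on a single machine: the action lives only in a central store, is written out just before it runs and is removed as soon as it finishes. In this sketch the REPOSITORY dictionary stands in for the central repository and a local temporary file stands in for the target machine; both are simplifying assumptions, as is the function name run_from_repository.

```python
import os
import subprocess
import sys
import tempfile
from typing import Dict, Tuple

# Hypothetical central repository: action name -> script source.
REPOSITORY: Dict[str, str] = {
    "hello": "print('hello from the repository')\n",
}


def run_from_repository(name: str, repository: Dict[str, str] = REPOSITORY) -> Tuple[int, str, str]:
    """Place an action on the 'target', run it, and delete it afterwards.

    Returns (exit code, captured output, path the script occupied while running).
    """
    source = repository[name]
    fd, path = tempfile.mkstemp(suffix=".py")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(source)
        completed = subprocess.run(
            [sys.executable, path],
            capture_output=True,
            text=True,
            check=False,
        )
    finally:
        # The script never stays behind, whether it succeeded or failed.
        os.remove(path)
    return completed.returncode, completed.stdout, path


exit_code, output, used_path = run_from_repository("hello")
```

Because the only lasting copy is the one in the repository, changing an action means editing it in one place.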

 

A central repository makes it easy to initiate changes. IT managers modify only one reference flow, rather than logging on to dozens of machines and editing the same script on each machine. This makes it easy to organize and search for existing flows and minimizes the chance of building duplicate processes. What’s more, over time the IT staff can build up a library of reusable templates, further shortening the design process. This repository is also helpful for improving organizational knowledge of the various processes - no more having the knowledge just reside in the head of the person who wrote the script.

 

Using a common platform to automate all operations and production tasks also allows IT managers to centrally control and change all environmental variables, including permissions. This greatly reduces errors associated with change processes and helps ensure that security and compliance measures are not compromised.

 

Sophisticated Run-Time Environment

 

Since organizations use IT automation for processes crucial to the well-being of their mission-critical applications, it’s important that the automation process services be highly recoverable, because they will inevitably fail from time to time - usually because of changes in the environment.

 

A good autonomics solution provides sophisticated run-time monitoring of automated processes, informing operators of the precise state of execution of any run-time instance. If a particular step fails to complete as expected, the operator has an immediate understanding of where and why it failed and can take remedial action on that particular process instance by creating workarounds or recoveries. Ironically, most of the classic RBA products give little insight into run-time status and provide no ability for run-time intervention. Additionally, an autonomics solution must provide post-execution audit trails to allow back-in-time views of everything that occurred in the environment. This is particularly useful for organizational learning and in regulatory compliance.
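
A bare-bones version of step-level monitoring with an audit trail might look like this. It is a sketch under simple assumptions: run_with_audit and the record fields are hypothetical, steps are plain Python callables, and the trail is kept in memory rather than in a durable store.

```python
from datetime import datetime, timezone
from typing import Callable, Dict, List, Tuple


def run_with_audit(steps: List[Tuple[str, Callable[[], None]]]) -> List[Dict[str, str]]:
    """Run steps in order, recording where and why execution stopped."""
    audit: List[Dict[str, str]] = []
    for name, action in steps:
        started = datetime.now(timezone.utc).isoformat()
        try:
            action()
            status, detail = "completed", ""
        except Exception as exc:
            status, detail = "failed", f"{type(exc).__name__}: {exc}"
        audit.append(
            {
                "step": name,
                "started": started,
                "finished": datetime.now(timezone.utc).isoformat(),
                "status": status,
                "detail": detail,
            }
        )
        if status == "failed":
            break  # leave the remaining steps for the operator to resume
    return audit


def failing_load() -> None:
    raise ValueError("target table is locked")


trail = run_with_audit(
    [
        ("extract", lambda: None),
        ("load", failing_load),
        ("publish", lambda: None),
    ]
)
```

The last record in the trail names the failed step and the reason, which is the information an operator needs to intervene on that particular run, and the full list is what a back-in-time view would read.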

 

Benefits of an IT Automation Platform

 

Rather than applying IT automation technology at a level of abstraction to other tools, implementing an automation platform closer to the applications and their underlying systems allows IT managers and administrators to experience autonomic capabilities for many complex IT scenarios. The net effect is that for everyday anomalies that are merely nuances in how applications perform, these autonomic services would proactively make the necessary adjustments to return systems to an optimized state without burdening reactive systems-management tools and their associated staff. In fact, over time, with an automation platform, most enterprises will be able to decrease their reliance on many of the current point tools, thereby saving a significant amount of annual maintenance dollars and eliminating the superfluous staff necessary to administer these tools.

 

To achieve competency and best practices around automation, enterprises will be best served by selecting an automation technology that’s applicable as a platform across the entire enterprise. While there are opportunities for automation contained within each functional silo of IT, the ultimate value of automation is achieved when operations processes that span functional silos are automated. IT managers may initially justify and implement an automation investment around solving a silo-specific problem, but their technology evaluation should ensure that the technology remains applicable across the enterprise as the organization’s creativity and awareness regarding automation grow.

 

Reference:

  1. David Williams. "IT Operations Run Book Automation Experiences Continued Market Growth." Gartner, Inc., 2007.
