"Compuware Acquires Proxima Technology. In January 2007, Compuware acquired Proxima Technology. Compuware IT Service Management provides an end-to-end view of application performance while helping communicate the business value of IT services, and proactively identifying and resolving problems."
Source : Compuware
Reducing Operational Risk of IT Service in Finance
Implications for CIOs of the New Basel Accord
Key Concepts Discussed
- Impact of Basel II on IT system services.
- Reducing operational risk through quality management.
- Quantifying and measuring risk with Six Sigma.
- Providing an IT risk scorecard.
Contents
- Version 2.1: August 2002 (Revised March 2003)
- About this Document
- Management Summary
- Factors Affecting Risk
- Risk Management and the New Capital Accord
- Basel II and Financial services providers (FSP)
- The Impact of Electronic Banking
- Implementing Basel II With Centauri
- Step One: Understand the Current Operational Risk
- Step Two: Report Risk Through the IT Scorecard
- Step Three: Manage Risk
- Step Four: Track Internal Data
- Step Five: Allocate Capital For IT Risk
- Summary
- Appendix
- Failure Modes and Effects Analysis
- FMEA Example: Electronic Banking
About this Document
This white paper is intended for people who are accountable for IT system service in financial institutions of
G10 countries affected by the Basel Committee on Banking Supervision's New Capital Accord (Basel II).
This includes CIOs, chief information security officers (CISOs), and IT risk managers from banks, as well as
client executives, solution architects, and operations managers from financial services providers (FSPs). This
paper describes the risk management and capital requirement issues raised by Basel II, which is expected to
take effect during 2006 and to be tested throughout 2005. Proxima Technology provides a real-time IT
service measurement and management capability that utilizes Six Sigma to quantify and reduce the risk of loss
resulting from inadequate or failed internal services of IT applications.
This paper deals with the following issues:
- The risk management issues raised by Basel II, for which solutions need to be in place during 2004
(allowing sufficient time for testing).
- The challenges of managing e-banking environments, many of which have been highlighted by the
Basel Committee's Electronic Banking Group (EBG).
- Improving quality of IT service in critical-to-quality business areas to lower operational risk and
eliminate the cost of poor quality.
- Providing an adequate management and reporting mechanism; a real-time IT risk scorecard with
quantitative risk indicators and metrics.
Management Summary
Introduction
An estimated $12 billion has been lost in the financial markets since 1992 through poor risk management
and fraud. In 2001, the Basel Committee on Banking Supervision issued a second set of proposals - the
New Capital Accord (Basel II) - to handle risk management throughout financial services. Basel II factors
operational risk into the calculation of total capital requirements, widening the scope of the first Accord
dramatically and impacting IT and operations departments within financial services providers (FSPs). Thus,
operational risk exposure will become central to financial businesses and FSPs, with the potential to
adversely affect earnings and shareholder value.
This document outlines a strategy for addressing the operational risk management requirements of Basel II
as they relate to IT service provisioning. While management processes are commonly established in
financial institutions to manage credit risk, market risk, and security risk, little is available about quantifying
and managing risks arising from IT service disruption. As the implementation deadline for Basel II looms,
this is cause for concern.
Proxima Technology addresses this issue by utilizing standard risk management and quality improvement
techniques from the Six Sigma quality management process. These techniques are supported in Proxima
Technology's software system for measuring and improving IT service quality - Centauri Business Service
Manager (Centauri). With Centauri, the IT services that support critical to quality (CTQ) business processes
are defined. The operational risk of each service is declared using failure modes and effects analysis (FMEA)
and the associated risk is indicated by cost of poor quality (COPQ). Since these techniques are fully
supported by Centauri, operational risk measurements are provided in real-time and on a historical basis.
Clearly, with such knowledge, the financial institution is able to take the necessary steps to reduce risk
exposure and better comply with Basel II.
As Six Sigma is already an established industry standard for quality management in finance, the benefits of
using Six Sigma over a proprietary method are significant: there is a ready supply of people in the market
who can use these techniques, a standard approach is more likely to gain the acceptance of the Basel
Committee and other related industry bodies, and there exist tools (such as Centauri) that utilize these
techniques in day-to-day operations.
Proxima Technology was the first IT vendor to realize the significance of Six Sigma in IT systems
management. Consequently, Proxima leads the field in utilizing this approach to dramatically improve the
quality of IT system service and lower operating risk, including:
- Establishing the current risk of service defects, specified in cost-to-the-business terms;
- Identifying areas of adverse exposure to risk that require attention;
- Carrying out improvement (with Six Sigma quality management techniques);
- Providing a real-time IT risk scorecard with quantitative risk indicators and metrics.
Centauri is not a rip-and-replace solution. Rather, it retrieves data from the existing IT systems management
infrastructure, analyzes it to show current service, alerts management to problems that represent risk to the
business operation, and supports a process of improvement necessary to minimize risk. Centauri is a tool
already used by leading financial institutions for service level management and reporting. These
organizations will be able to utilize Centauri to address Basel II with little (or no) additional investment.
Factors Affecting Risk
Risk Management and the New Capital Accord
Basel II will ensure that sufficient risk aversion strategies are in place to increase the soundness of the
local financial system by aligning regulatory capital requirements to the underlying risks in the banking
business. Since the Accord will be adopted by the central banks of each participating G10 country, the code
is mandatory. The Capital Accord currently in effect was released in 1988 and sets out provisions for risk
associated with credit and fraud. A substantially revised version, issued in 2001, accommodates the
significant changes in the banking industry brought about by deregulation, increased competition,
technological innovation, and other factors, all of which are collectively referred to as operational risk.
Although the new Accord is still undergoing change, full implementation by financial institutions of G10
countries is expected by 2006. For a CIO, this means that the key principles must be in place during the
latter part of 2003 to allow a sufficient period of testing before Basel II goes into effect.
Basel II has been greatly simplified over the previous version and is more flexible to give incentive for better
risk management. Basel II defines operational risk as:
"The risk of loss resulting from inadequate or failed internal processes, people, and systems or from
external events."
Although much of Basel II is concerned with the calculation of the ratio of capital to risk weighted assets,
the management of capital allocation techniques, and the market disclosure requirements, the impact on IT
is undeniable: IT system service is a significant factor in this calculation. Thus, severe complications arise
for a bank where IT failure occurs, since the resulting state of the business is not aligned with the risk
provision. On the business side, Basel II specifically cites business disruption, data loss, and security
breaches arising from system failure as event types to be covered. Consequently, a CIO must ensure
adequate tools and procedures are in place that:
- Provide an accurate assessment of the operational risk arising from IT system service (both
internally and externally provided);
- Measure and report such risk in real-time (IT risk scorecard);
- Identify technical problems affecting service and provide information, allowing technicians to deal
with problems quickly and efficiently, prioritized by risk to the institution;
- Highlight IT system service defects that lead to unacceptable risk to the business;
- Reduce risk through ongoing service improvement.
As a tool for measuring and reporting on the IT system service quality, Centauri is able to address Basel II
by facilitating the definition of risk, establishing exposure degrees, and then presenting critical operational
risk metrics in a manner that better aligns IT operational risk assessment with the institution's overall
operational risk management program. Clearly, there is a direct link between the quality of IT service and
exposure to risk: the better the quality, the lower the likelihood of loss. The relationship between Basel II
and the quality management practices supported by Centauri is shown in Figure 1. Centauri uses the Six
Sigma quality management method - fast becoming a standard in the finance sector - to bring about
necessary improvements and eliminate recurring defects from IT services. Six Sigma provides a
standardized measure and target for quality, a management process of improvement (called DMAIC), and a
series of techniques that help practitioners isolate problems and investigate solutions. By supporting Six
Sigma, Centauri addresses both strategic improvements and tactical problem resolution. Through its
problem identification and alert notification capabilities, Centauri provides a reactive solution to tactical
problem solving, but with Six Sigma, also supports a proactive approach. That is, rather than simply reacting
to faults, IT services are also identified where service shortfalls negatively impact the business, allowing
steps to be taken to address these problems.
Service improvement is carried out by practitioners who use Centauri throughout the Six Sigma DMAIC life
cycle. DMAIC corresponds to the stages of the Six Sigma project, each with defined tasks and deliverables,
and is an acronym for define, measure, analyze, improve and control. The same measurement data that is
displayed in the dashboard is now used to understand the cause of recurring defects and form the basis of
hypothesis testing. Because this data is collected in real-time, the speed with which practitioners can initiate
improvement is greatly increased. This overcomes a common frustration with other Six Sigma tools that
require data to be manually loaded into an external statistical analysis package.
Basel II and Financial services providers (FSP)
The impact of Basel II on FSPs is not insignificant, but will likely be greeted with a mixed reaction depending
on the vision of the FSP. Leading FSPs will see this as an opportunity for new business. Stragglers will view
it as a burden full of unwelcome control.
Operationally, the FSP is required to put in place a management process that includes risk definition,
measurement, and reporting as described in the previous section. As a supplier of service, the FSP's own
capabilities will be factored into the evaluation of risk. As such, Gartner, Inc. advises its clients to review
the previous three years of records showing FSP downtime and other internal risks, the assumption
being that past performance provides an indication of future capability. This will create a challenge for those
FSPs whose reporting process is poorly executed. Worse, an unfavorable outcome
of such a review may result in expense to the FSP to address quality of service and reporting issues. Where
the FSP's performance requires a significant amount of funds to be held in reserve, the FSP may have its
contract terminated altogether.
However, for leading FSPs, Basel II brings with it significant opportunities:
- A new advisory business service that opens the door to new clients;
- An additional business service that enhances a current relationship;
- A chance to demonstrate business value and enhance the partnering relationship with the client;
- An industry shake up that will eliminate weak FSPs who create noise in the market place.
The Impact of Electronic Banking
Because of the proliferation of Internet banking, special attention has been given by the Basel Committee to
e-banking. A sub-committee, the Electronic Banking Group (EBG), was formed in November 1999, and their
initial findings on risk management and supervisory issues arising from this were published in October 2000.
This report outlined and assessed the major risks associated with e-banking: strategic risk, risk to
reputation, operational risk (including security), and credit, market, and liquidity risks.
EBG noted the key characteristics of e-banking, citing the following specific risk management challenges:
- The speed of change of technological innovation in e-banking has hit unprecedented levels and is
in dramatic contrast to application development and release over the previous decades.
Competition in the banking sector has intensified the management challenge to ensure that
adequate strategic assessment, risk analysis and security reviews are conducted prior to releasing
new e-banking applications.
- Web-based e-banking applications are typically integrated with legacy systems to allow direct
straight-through processing of electronic transactions. Although this significantly reduces the
opportunities for human error and fraud that are inherent in manual processing, it increases the
dependence on quality design, system interoperability, and operational scalability.
- E-banking significantly increases banks' dependence on IT. This puts additional pressure on IT to
maintain adequate quality of service, as well as raising a myriad of issues when this service is
outsourced to 3rd parties - particularly those that may not be regulated.
- Routing secure transactions across the Internet, an inherently insecure environment compared to
private networks, raises many issues relating to security controls, customer authentication, data
protection, audit trails and customer privacy.
Implementing Basel II With Centauri
Centauri provides an IT focused solution for the Business Disruption and System Failure events that are
described in Basel II. Additionally, it can be integrated with tools that address the security management
issues to provide a complete solution to Basel II from an IT operational perspective. Such integration would
include retrieving security breach information from a 3rd party tool, as well as integrating with the
institution's current reporting system.
This section presents an overview of the steps that can be taken with Centauri to minimize risk from IT
service breaches and shortfalls. The steps are:
- Understand the current operational risk.
- Report risk through the IT scorecard.
- Manage risk.
- Track internal data.
- Allocate capital for IT risk.
Step One: Understand the Current Operational Risk
The first step is to understand the current operational risk of IT service provisioning. To achieve this,
Centauri uses a quality management technique called failure mode and effects analysis (FMEA). (Refer to
the appendix to this paper for additional detail on FMEA). FMEA facilitates an understanding of risk in terms
of what problems inhibit IT service (failure modes) and what their consequences are (effects). Failure
modes are prioritized using a risk priority number (RPN): the product of severity, probability of
occurrence, and detectability. FMEA is conducted, at least initially, on critical to quality (CTQ) business
processes: those that have the greatest impact on the end-user customer. To achieve IT/business alignment
and to provide a proper context for prioritization, each failure mode is cross-referenced against the
business processes affected. The consequences of risk (or exposure) are defined as the cost of poor quality
(COPQ). Like FMEA, COPQ is a well-understood concept to the Six Sigma practitioner. It consists of
the total cost to the business arising from the failure and includes both visible (e.g., staff overtime
payments) and invisible costs (e.g., damaged reputation). Within Centauri, FMEA explicitly states from
where Centauri will get its measurement data. This is usually from existing systems management tools, such
as IBM Tivoli or Microsoft WMI.
By the end of step one, Centauri users have a fully prioritized list of failure modes that provides an
accurate and detailed definition of the risk potential. Centauri has all that it needs to start automatically
measuring this service to establish the exposure to risk at any given moment.
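The prioritization described in step one can be sketched in a few lines of Python. This is a minimal illustration, not Centauri's implementation: the FailureMode class, its field names, and all of the ratings below are hypothetical, chosen only to show how the RPN is multiplied out and how each failure mode is cross-referenced against the business processes it affects.

```python
from dataclasses import dataclass, field


@dataclass
class FailureMode:
    """One FMEA row. All names and ratings here are hypothetical."""
    name: str
    severity: int        # rated 1 (low) to 10 (very high)
    occurrence: int      # probability of occurrence, rated 1 to 10
    detectability: int   # rated 1 to 10 by the FMEA team
    processes: list = field(default_factory=list)  # business processes affected

    @property
    def rpn(self) -> int:
        # Risk priority number: severity x occurrence x detectability
        return self.severity * self.occurrence * self.detectability


def prioritize(modes):
    """Return failure modes ordered from highest to lowest RPN."""
    return sorted(modes, key=lambda m: m.rpn, reverse=True)


modes = [
    FailureMode("database slow", 5, 4, 3, ["statement printing"]),
    FailureMode("payment gateway down", 9, 3, 4, ["online bill payment"]),
    FailureMode("batch job overrun", 6, 5, 3, ["end-of-day settlement"]),
]

ranked = prioritize(modes)
for m in ranked:
    print(f"{m.rpn:4d}  {m.name}  -> {', '.join(m.processes)}")
```

With these made-up ratings the gateway outage ranks first, so it would be the first candidate for measurement and improvement.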
Step Two: Report Risk Through the IT Scorecard
Measurements are taken automatically by Centauri using the FMEA definitions created during step one.
Information is retrieved for each component of the FMEA, correlated to establish the quality of service, and
then reported to interested parties through the IT scorecard - typically an HTML digital dashboard. The
scorecard provides both a real-time and historical perspective of quality. The historical perspective - risk
over time - provides the meaningful basis for setting the actual capital reserves. The real-time perspective
provides insight into the exposure at any given moment and thus gives technicians ample opportunity to
take steps to deal with any problem. In itself, this is risk mitigation, though it has little to do with Six Sigma
or strategic quality improvement.
The digital dashboard addresses the different interests of the users. At the highest level, failure modes are
aggregated to show exposure for the entire financial institution. This corporate view progressively breaks
down to show exposure by department, business function, individual failure mode, and, at the lowest level,
a specific IT component. The dashboard is complemented with other information forwarding mechanisms
within Centauri. These include summary Acrobat PDF formatted reports, automatically created and
dispatched as email attachments to appropriate parties. Problems representing severe operational risk can
be dispatched immediately as an email or pager message, as well as cause some visual change to a
dashboard, such as a red flashing display.
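The drill-down from corporate view to individual failure mode can be pictured as a simple roll-up. The sketch below is illustrative only: the record layout, department names, and cost figures are hypothetical, and summing cost of poor quality is an assumed aggregation rule rather than a description of how Centauri combines exposure.

```python
from collections import defaultdict

# (department, business function, failure mode, COPQ exposure) - hypothetical data
exposures = [
    ("Retail", "Online bill payment", "website unavailable", 40_000),
    ("Retail", "Online bill payment", "slow response", 15_000),
    ("Retail", "Account opening", "user not recognized", 10_000),
    ("Treasury", "Settlement", "batch overrun", 25_000),
]


def roll_up(rows, depth):
    """Sum exposure over the first `depth` key columns (0 = whole institution)."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[:depth]] += row[-1]
    return dict(totals)


corporate = roll_up(exposures, 0)       # single total for the institution
by_department = roll_up(exposures, 1)   # one total per department
by_function = roll_up(exposures, 2)     # one total per business function

for key, total in by_department.items():
    print(key[0], total)
```

Each deeper level of `depth` corresponds to one step further down the dashboard hierarchy.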
The dashboard itself can also be used to monitor the service provided by an external source - a key area of
risk cited by Basel II because the external source is not bound by the new Accord. Since Centauri requires
no software to be installed in the outsourced computing environment, monitoring can be achieved with
minimal disruption, thus overcoming a common objection that the monitoring process becomes intrusive. By
providing tangible measurement data about the service provided over time, a CIO is well informed when
dealing with problems or negotiating service contracts.
The measurement data is collected automatically from the available sources. This includes ERP applications
for business key performance indicators and systems management tools for IT metrics. Such data sources
include SAP ERP, Microsoft WMI, and Tivoli T/EC.
Step Three: Manage Risk
As stated previously: the higher the quality, the lower the exposure. This creates a tremendous opportunity
for the FSP as they have direct control over IT service quality. This is in contrast to other risk management
techniques, such as market management, that have little control over the causes of risk. Such management
approaches only mitigate the consequences of risk. In almost all instances, exposure to risk in IT operations
is a result of poor management and reporting practices.
Risk management with Centauri is achieved both on a tactical and strategic level. Tactically, Centauri
identifies exceptional exposure to risk and alerts the appropriate people to the problem, highlighting the
cause of this risk. On a strategic level, Centauri automates a simple Six Sigma process that is more proactive
than simply alerting people to problems. Over time, Centauri measures service quality, highlights areas that
need attention, identifies the significant problem causes (Pareto charts), and then provides detailed
information that will help technicians bring about the appropriate improvement. The value of "6 sigma" (as
opposed to the Six Sigma process) refers to a measure of quality that equates to 3.4 defects per million
opportunities (DPMO). According to research conducted by Harry and Schroeder, the industry average in
commerce is 4-sigma - 6,210 DPMO. Obviously, the capital allocation required of a financial institution
operating at 4-sigma will be very different from that required of one operating at 6-sigma.
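The relationship between a sigma level and DPMO can be checked with a short calculation. The sketch below assumes the conventional 1.5-sigma long-term shift used in standard Six Sigma conversion tables; the function name and the default shift are illustrative choices and are not taken from Centauri.

```python
from statistics import NormalDist


def dpmo(sigma_level: float, shift: float = 1.5) -> float:
    """Defects per million opportunities for a given sigma level.

    Assumes the conventional 1.5-sigma long-term shift, so the defect
    rate is the upper tail of a standard normal beyond (sigma_level - shift).
    """
    tail = 1.0 - NormalDist().cdf(sigma_level - shift)
    return tail * 1_000_000


for level in (4, 6):
    print(f"{level}-sigma: {dpmo(level):,.1f} DPMO")
```

Under that assumption the calculation comes out at roughly 6,210 DPMO for 4-sigma and 3.4 DPMO for 6-sigma, in line with the figures quoted above.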
Step Four: Track Internal Data
In step with managing risk, Centauri provides a mechanism for tracking internal loss event data, which helps
an institution tie its risk estimates to its actual loss experiences. This is a critical component of Basel II, and
is a method for validating a risk measurement system in relation to actual operational loss events.
According to Basel II, "Internal loss data is most relevant when they are clearly linked to a bank's current
business activities, technological processes and risk management procedures." Centauri creates these
important links, tracks the data as it emerges from disparate sources, and integrates that data into the
institution's reporting systems. This creates a clear record of historical data that is vital to the success of a
Basel II compliant risk management program.
When Basel II is fully implemented, internal operational risk measures must be based on a minimum five-
year observation period of internal data. However, Basel II does allow a three year "data window" for when
an institution first moves to an Advanced Measurement Approach. Given that many large banks will have to
comply with Basel II by late 2006, it is important that those institutions implement a proven data tracking
system today.
Step Five: Allocate Capital For IT Risk
At this point, the financial institution has a clear understanding of the current risk, expressed in business
cost terms. A real-time digital dashboard has been set up to improve decision-making and reduce the time
taken to make said decisions so that their impact is quickly seen. Recurring problem areas can be shown in
reports that set the scope, priority, and objectives of service improvement programs - the objectives being
defined in cost reduction and business improvement terms.
Centauri also has a detailed database with all of the qualitative and quantitative information required to
establish the capital funds required against this risk. The final calculation for capital allocation is then
performed according to the institution's prescribed methodology.
Note: Centauri has the flexibility to allow an institution to input its prescribed measurement methodology,
and allow users to input firm specific calculations under the Advanced Measurement Approach. Proxima
Technology is in discussion with its business partners to create templates for statistically based
methodologies, such as Loss Distribution Analysis.
Summary
By aligning regulatory capital requirements to the underlying risks in the banking business, the New Capital
Accord ensures sufficient risk management strategies are in place to increase the soundness of the
international financial system. A key factor in this new definition arises from business disruption and system
failure. Proxima Technology provides a software system - Centauri - that addresses this operational risk
from an IT perspective. It does so by utilizing Six Sigma practices to quantify, measure, and report on risk.
Identified risk is then reduced through a step-wise quality improvement process. Consequently, financial
institutions that use Centauri will experience a reduction in operational losses related to IT system failures.
By utilizing a standard like Six Sigma instead of relying on a proprietary method, Proxima Technology's
approach to risk management is proven. Furthermore, since Six Sigma is a standard in the finance industry,
there will be a ready supply of practitioners. This not only greatly reduces Basel II implementation costs, but
also significantly decreases the timescales associated with its rollout.
Basel II has been adopted by the central banks of each participating country. Consequently, the code will be
mandatory for the majority of internationally active financial institutions. It is expected to be fully enforced
in large financial institutions of G10 countries by late 2006. For a CIO, this means that the key principles
must be addressed during 2003 in order to have a sufficient period of testing and data collection before the
New Capital Accord goes into effect.
Proxima Technology utilizes proven risk management and quality improvement methods, such as Six Sigma,
to reduce losses associated with operational risk and inefficiencies. Proxima Technology's Centauri Business
Service Manager works within distributed computing environments to provide the appropriate service level
measurement, reporting, problem notification, and data tracking to help financial institutions monitor and
reduce operational inefficiencies, and in doing so, make themselves more ready to face Basel II.
Appendix
Failure Modes and Effects Analysis
Failure modes and effects analysis (FMEA) is a technique that identifies and then eliminates the risks
inherent in the execution of a process. FMEA can be used to analyze any process and is ideally suited to
defining the operational risk of IT system services. Failure modes are analyzed to determine the potential
effects on the process and its causes for failure. The potential problems are then prioritized using a risk
priority number (RPN) before developing an action plan to reduce this risk.
In brief, the FMEA works using a five-step process:
- Identify the process to analyze.
- List potential failure modes that could arise in the process.
- Rate each failure mode using a scale from 1-10 for severity, probability of occurrence, and detectability.
- Multiply severity, probability of occurrence, and detectability to show risk priority number (RPN).
- Develop action plans to reduce risk according to RPN.
Centauri uses FMEA as the means to define an IT system service catalog. That is, when using Centauri, not
only does FMEA provide a statement of risk, but it also establishes a fully working measurement, root cause
analysis, and reporting mechanism that greatly limit the effects of the failure through alert notification.
FMEA Example: Electronic Banking
The following simplified example illustrates FMEA for a bank's online bill payment process. See Figure 2 for
illustration. The FMEA steps required to construct this model are as follows:
- Identify a process or item to analyze, typically a business process:
- In this case, the bank's online bill payment process.
- List all potential things that can go wrong - the failure modes:
- For example: the website is unavailable, the website is slow, or the user is not recognized.
- List the potential consequences of each failure - the effects. In this example, the process immediately
affects the customers:
- Website is down: a client who cannot access his/her bank account would become frustrated.
Or potential customers looking to apply for a credit card or open a savings account can instead
access a competitor with a click of the mouse.
- Website is slow to respond: time plays a big factor in client satisfaction; if a transaction takes
over 10 seconds to complete, the customer generally becomes agitated and dissatisfied.
Statistically, 30% of visitors will leave the website if it exceeds 8 seconds to load a page and
70% will leave dissatisfied if it exceeds 12 seconds.
- User not found: if a registered client's user name and password are not found while trying to
access their account details, the client will complain. Moreover, the bank will incur additional
operating costs if the client chooses to go into the branch instead of using the website.
- Rate the severity for each potential failure. This is based on a scale from 1-10 where 1 is low severity
and 10 is very high, resulting in serious impact on the bank's profitability. The same rating is applied for
probability of occurrence and detectability. Detectability measures the existing system's capability to
detect failure. This method of rating is influenced by historical data, test data, or the past experience of
the user. Potential causes are evaluated based on current systems in the process.
- In this example, if the website is not available, the cause could be a web server crash, a failed
router or firewall, or a failed alert to the appropriate administrator.
- The current design controls describe the current systems used to avoid potential failure modes.
- For example, a poor enterprise systems management tool or an unreliable alert notification.
- RPN is the risk priority number used to assist the process owner or manager to prioritize problems;
energy and resources should be focused on reducing risks that have greater effects on the bank's
bottom line. Therefore, the higher the risk value, the more priority should be given to it.
- For the bank, the RPN for the unavailable website is 270, user authentication problem is 224,
and slow response time is 180. Clearly, the unavailable website should get priority because the
impact on the business is the greatest. The next risk to manage would be the authentication
problem and then, lastly, slow response time.
- Recommended Actions is a plan of action to reduce the risk of the failures occurring.
- For example, unavailable website:
- Find an alternative network and systems management (NSM) solution to reduce the
frequency of server crash or network components failure.
- Implement a reliable alert notification tool to minimize impact from web application
availability.
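The ordering in this example follows directly from sorting the failure modes by RPN. The short sketch below uses the three RPN values stated in the example; the dictionary layout and variable names are illustrative only, and the individual severity, occurrence, and detectability ratings behind each value are not reproduced here.

```python
# RPN values as stated in the example; the dictionary layout is illustrative.
rpn_by_failure_mode = {
    "slow response time": 180,
    "unavailable website": 270,
    "user authentication problem": 224,
}

# Highest RPN first: this is the order in which the risks would be addressed.
action_order = sorted(rpn_by_failure_mode, key=rpn_by_failure_mode.get, reverse=True)

for rank, mode in enumerate(action_order, start=1):
    print(rank, mode, rpn_by_failure_mode[mode])
```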
About Proxima Technology
Proxima Technology, Inc. provides software and services to improve
business service and accountability through service-level measurement,
reporting, and problem notification in distributed computing environments.
www.proxima-tech.com
Telephone
Australia 02 9458 1700
Germany 040 32005-405
United Kingdom 0870 870 0732
United States 720 946 7200
©1998-2003 Proxima Technology, Inc., Centauri, and Centauri Business
Service Manager are trademarks of Proxima Technology. Specifications are
subject to change without notice. All other brand or product names are
trademarks of their respective owners.
Tim Young
Proxima Technology.
© 2002 Proxima Technology Inc.
Version 2.2: August 2002 (Revised June 2003)