Autonomic computing systems that are self-healing will not only
cut costs, but also ensure maximum system uptime and automate
the management of increasingly complex systems. Brian Pereira
brings you an update on autonomic computing and its immediate
benefits.
Autonomic computing is an approach to self-managed computing
systems that will work independently.
With autonomic computing, applications like server load
balancing, process allocation, power supply monitoring and
automatic updating of software will become possible.
Once autonomous computing is adopted by enterprises, will
service engineers and network administrators become redundant?
According to Van Symons of IBM, while autonomic computing
won’t put sys admins out of jobs, it will minimise the
number of people needed to do the more mundane tasks.
The human body is self-healing. Broken bones mend, cuts heal,
and a child’s immune system grows stronger. The body’s
autonomic nervous system, which controls involuntary actions
without conscious awareness or involvement, has fascinated
the world of medicine. So why can’t it be the same with
computers? Must a computer engineer or a systems administrator
monitor a server round-the-clock to ensure normal operation?
The solution is autonomic computing: systems that will have
the ability to configure, tune and even repair themselves.
Autonomic computing is an approach to self-managed computing
systems that will work independently, without human intervention.
Engineers in research labs around the world are now creating
autonomic systems, and it’s not just for computers. Take
the automobile industry, for instance. Engineers at DaimlerChrysler
AG (manufacturer of the highly reputed Mercedes-Benz cars)
have been working for many years on autonomic systems to ensure
driver and passenger safety. The fruits of their efforts are ABS
(Anti-lock Braking System) and other safety innovations. (You
can read more about these innovations at www.mercedes-benz.com/e/innovation/rd/).
ABS prevents the wheels from locking under hard braking, so
the car does not go into a skid. This ensures the car can still
be steered and thus helps prevent accidents. ABS comprises
electronic wheel-speed sensors and solenoid valves that
regulate brake pressure.
Scientists at Goodyear and Michelin (both tyre manufacturers)
have created “run-flat” tyres that let you drive
safely for a few more miles (to the nearest repair shop) after
a tyre puncture. Run-flat tyres have a reinforced sidewall
that maintains some thickness in the tyre and thus keeps the
chassis level when the tyre is deflated. In addition, there
are sensors in all four tyres that relay information about
the air pressure to the dashboard of the car. The driver can
monitor the pressure levels in the tyres and take corrective
action when necessary. This feature could be automated in
the future. (You can read more about this at www.orbweb.net/autoshop247/tyre_talk/11_latest_trends.html).
Self-healing technology in computers is not a new concept.
Notable examples of this technology are ECC (Error-Correcting
Code) memory, SMART (Self-Monitoring, Analysis and Reporting
Technology) for hard disks, and fault-tolerant servers. Research
institutions are working towards making such technologies
more autonomous. The aim is minimal human intervention, with
computing sub-systems able to proactively detect and rectify
potential faults before any failure occurs.
A fully autonomous computing system does not exist today,
but such systems could make round-the-clock operation, or
99.999 percent uptime (roughly five minutes of downtime a
year), possible.
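ECC memory is a concrete case of self-healing hardware: extra check bits stored alongside the data let the memory locate a single corrupted bit and flip it back. The sketch below illustrates the idea with the textbook Hamming(7,4) code on four data bits. It is only an illustration: real ECC modules use wider codes over longer words, and the function names are made up for this example.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits (a list of 0/1 values) into a 7-bit codeword."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over codeword positions 4, 5, 6, 7
    # Codeword layout, positions 1..7: p1 p2 d1 p3 d2 d3 d4
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct at most one flipped bit.

    Returns (data_bits, error_position); error_position is 0 when the
    codeword was already consistent.
    """
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s3  # the syndrome names the faulty position
    if position:
        c[position - 1] ^= 1  # the "self-healing" step: flip the bit back
    return [c[2], c[4], c[5], c[6]], position


word = [1, 0, 1, 1]
stored = hamming74_encode(word)
stored[4] ^= 1  # simulate a single-bit memory fault at position 5
recovered, position = hamming74_decode(stored)
print(recovered == word, position)  # prints: True 5
```

The correction happens without any outside intervention, which is the property the article's broader notion of self-healing tries to extend to whole systems.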
Initiatives
M Ganesh of IBM India says the ultimate goal of autonomic
computing is to give businesses the ability to manage
systems and technology infrastructures that are hundreds
of times more complex than those in existence today.
Various computer vendors and research institutions are involved
in autonomic computing, which is also referred to as self-healing
technology, “holistic computing” or “introspective
computing”. The technology is not only applicable to
servers, but also extends to databases, software applications,
and Grid Computing networks.
Perhaps the first elements of autonomous computing were the
software agents that made waves around 1999. A prime example
is Computer Associates’ Neugents. According to CA, Neugents
look for patterns in data and can extrapolate from the patterns
to predict future events. Neugents, which are included in
CA’s Unicenter systems management software and its Jasmine
object-oriented database, can look at up to 1,200 variables
and make sense of them all. Business data is one area in which
CA is pushing this technology.
Spiders, or software agents from search engines, are another
example. Also called “bots”, these agents scour the Web looking
for new websites and then return to the search engine and
update its database with the new URLs.
Windows XP also incorporates self-healing technology. When
an application crashes, the user can shut it down cleanly,
thereby preventing the entire system from freezing or hanging.
This operating system also offers to report program errors
to the Microsoft Support team. Further, Windows XP looks out
for updates and automatically downloads these when available.
Recent versions of Microsoft Office include a Repair feature.
So if a key program file (such as Winword.exe) gets corrupted
or accidentally deleted, the software can reinstall it. Such
features will soon be present in other desktop software.
Plug-and-play is another element of autonomous computing.
Plug in a new device to your PC and the system will automatically
detect it. The operating system will then fire up its hardware
wizard, which guides you through the process of installing
the appropriate drivers for the new device.
One company that is actively working towards fully autonomous
systems is IBM, which has an initiative named Project eLiza
(See box: An update on IBM’s Project eLiza). IBM has
incorporated some elements of eLiza (now in phase 2 of development)
in its servers. The company is demonstrating software called
Enterprise Workload Manager, which monitors groups of servers,
managing the machines and moving work between them without
the aid of human administrators.
An initiative similar to eLiza is Project Oceano. It will
enable a group of Linux servers to share jobs, and reassign
jobs when servers are added to or removed from the cluster.
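To make the idea of sharing and reassigning jobs concrete, here is a minimal sketch of a cluster that recomputes its job plan whenever its membership changes. It is not Project Oceano's actual mechanism, which the article does not describe; the round-robin policy, the `assign_jobs` function and the server names are all assumptions made for illustration.

```python
def assign_jobs(jobs, servers):
    """Spread jobs evenly (round-robin) over the servers currently in the cluster."""
    if not servers:
        raise ValueError("cluster has no servers")
    plan = {server: [] for server in servers}
    for index, job in enumerate(jobs):
        plan[servers[index % len(servers)]].append(job)
    return plan


jobs = ["job-%d" % i for i in range(6)]

# Two servers share the work...
print(assign_jobs(jobs, ["linux-a", "linux-b"]))
# ...and the plan is simply recomputed when a third server joins.
print(assign_jobs(jobs, ["linux-a", "linux-b", "linux-c"]))
```

Recomputing the whole plan is the simplest possible policy. A production cluster manager would more likely try to move as few running jobs as possible when membership changes.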
Compaq is also pursuing autonomic computing. It is offering
a suite of tools collectively called ProLiant Essentials.
The tool with autonomic characteristics is Compaq Insight
Manager. This software delivers pre-failure alerts for Compaq
ProLiant servers, thereby proactively detecting potential
server failures before they result in unplanned system downtime.
Another tool in the suite is ActiveUpdate, an advanced Web-based
application that provides proactive notification and automatic
download of software updates for all Compaq systems that range
from handhelds to servers.
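A pre-failure alert of the kind described above can be thought of as a threshold on recent warning signs, raised before the component actually fails. The sketch below assumes the monitored signal is a count of corrected errors per sampling interval; the class name, window size and threshold are hypothetical and do not describe how Compaq Insight Manager works internally.

```python
from collections import deque


class PreFailureMonitor:
    """Flag a component when corrected errors pile up within a sliding window."""

    def __init__(self, window=5, threshold=10):
        self.samples = deque(maxlen=window)  # keeps only the most recent samples
        self.threshold = threshold

    def record(self, corrected_errors):
        """Add one sample; return True if a pre-failure alert should be raised."""
        self.samples.append(corrected_errors)
        return sum(self.samples) >= self.threshold


monitor = PreFailureMonitor(window=3, threshold=5)
for count in [1, 1, 2, 3]:
    if monitor.record(count):
        print("pre-failure alert: schedule a replacement before the part fails")
```

The component is still working when the alert fires, which is what lets an administrator (or an autonomic system) act before any unplanned downtime.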
Another example of an autonomic system is the adaptive data
flow engine, a technology used to scan “Deep Web”
databases and collect information. Deep Web refers to information
on the Internet that cannot be found using traditional search
engines. The technology was developed by University of California,
Berkeley associate professors Joseph Hellerstein and Michael
Franklin. They (along with a team of six students) have designed
Telegraph software, a data retrieval system designed to harness
streams of live data coming out of the Internet as well as
from networked sensors, software, and smart devices. Telegraph
does much more than traditional search engines. It fetches
data from Web-accessible databases, analyses it, does cross-referencing,
collates data and presents it all on one screen.
(You can read more about this development at telegraph.cs.berkeley.edu/).
Dr Manoj Kumar of IBM India Research Lab says the most
immediate benefit of autonomic computing will be reduced
deployment and maintenance cost and increased stability
of IT systems.
Research in autonomic computing is also taking place in labs
at MIT, the University of Texas, the University of Michigan
and other universities.
Objectives
There has to be a compelling reason for institutions to invest
millions of dollars in autonomic computing research. Why do we
need such systems? What do researchers hope to achieve and
what would an autonomic system be capable of in the future?
A key reason for development is the management of complex
and disparate systems. Says M Ganesh, country manager, Enterprise
Systems Group, IBM India, “Due to rapid advances, technology
is becoming accessible to more people every day. Ironically,
this has led to a growing problem in the industry: as we get
better technology with increasing price/performance, managing
those technologies has become a customer’s number one
problem.”
According to Ganesh, the ultimate goal of autonomic computing
initiatives like Project eLiza is to give businesses the
ability to manage systems and technology infrastructures that
are hundreds of times more complex than those in existence
today. “This means computer systems should be able to
self-optimise, self-configure, self-protect and self-heal,”
says Ganesh.
As enterprises expand IT infrastructure, they will require
skilled manpower to manage the complexity of interconnected
systems, each sophisticated in its own way. Organisations
that frequently expand IT infrastructure are confronted with
a shortage of skilled manpower. Autonomic computing
can address this requirement because systems will self-manage,
and adapt to new situations without the need for administrator
intervention.
But once autonomous computing is adopted by enterprises,
will service engineers and network administrators become redundant?
“Autonomic computing won’t put people out of jobs, but it will
minimise the number of people needed to do the more mundane tasks,”
says Van Symons, IBM’s global executive for Project eLiza.
“Autonomic computing will raise the level of their positions
so that they will be setting the policy, and not just being
the equivalent of cable guys.”
It may be some time before people feel threatened, but for
the moment it would be appropriate to consider autonomic computing
as a solution to the management of complex systems and the
answer to the shortage of skilled administrators.
Says Dr Manoj Kumar, director, IBM India Research Lab, “Autonomic
computing has risen to the top of the IT agenda because of
the immediate need to solve the skills shortage, and the rapidly
increasing size and complexity of the world’s computing
infrastructure.”
Kumar feels the goal of autonomic computing is to realise
the promise of IT: increasing productivity while minimising
complexity for users. “It’s time to design and build
computing systems capable of running themselves, adjusting
to varying circumstances, and preparing their resources to
handle most efficiently the workloads we put upon them,”
he says.
The other driver for creating autonomic systems is the potential
benefits they present.
Benefits
The main benefit of autonomic computing is reduced TCO (Total
Cost of Ownership). Breakdowns will be less frequent, thereby
drastically reducing maintenance costs. Fewer personnel will
be required to manage the systems.
According to IBM studies, hardware and software procurement
once accounted for approximately 80 percent of the cost of
major computer systems. Now, the cost of trained personnel
required to manage these systems is roughly equivalent to
the equipment costs. IBM feels the cost of personnel will
be double that of equipment in the next five to six years.
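The IBM figures can be read as a steady rise in the share of total cost that goes to people rather than equipment. The short calculation below works this out under a simplifying assumption that is not part of the IBM studies quoted here: total cost is treated as equipment plus personnel only, so the earlier 80 percent procurement figure implies a personnel-to-equipment ratio of 0.25.

```python
def personnel_share(personnel_to_equipment_ratio):
    """Fraction of total cost spent on personnel, given personnel cost / equipment cost."""
    ratio = personnel_to_equipment_ratio
    return ratio / (1 + ratio)


# Ratios are assumptions derived from the figures in the text.
stages = [
    ("earlier (80 percent procurement)", 0.25),
    ("now (personnel equals equipment)", 1.0),
    ("projected (personnel double equipment)", 2.0),
]
for label, ratio in stages:
    print("%s: %.0f%% of total cost is personnel" % (label, 100 * personnel_share(ratio)))
```

On these assumptions the personnel share climbs from about a fifth of total cost to a half, and then to about two-thirds, which is the trend that autonomic computing is meant to reverse.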
“The most immediate benefit of autonomic computing will be reduced
deployment and maintenance cost and increased stability of
IT systems through automation,” says Dr Kumar. “Higher
order benefits will include allowing companies to better manage
their business through IT systems that are able to adopt and
implement directives based on business policy, and are able
to make modifications based on changing environments.”
IBM also says that autonomic systems can reduce the time for
deploying new systems.
“The challenge for a customer today is that his IT infrastructure
is most likely heterogeneous, meaning it’s comprised
of hardware from multiple vendors. This makes it increasingly
difficult to add systems and manage them autonomically,”
says Symons. “Customers spend three-fourths of their
application deployment time and costs on the integration equation.
We need autonomic capabilities so that IT infrastructures
can be self-configuring, self-healing, self-optimising and
self-protecting.”
Another benefit of this technology is that it provides server
consolidation to maximise system availability, and minimises
cost and human effort to manage large server farms.
Applications
Autonomic computing promises to simplify the management of
computing systems. But that capability will provide the basis
for much more: from seamless e-sourcing and Grid Computing,
to dynamic e-business and the ability to translate business
decisions that managers make to the IT processes and policies
that make those decisions a reality.
IBM’s Dr Kumar says one of the first applications of
autonomic computing is e-sourcing, which is gaining traction
now. E-sourcing is the ability to deliver IT as a utility,
when you need it, in the amount you must have to accomplish
the task at hand. Autonomic computing will create huge opportunities
for these kinds of services, feels Dr Kumar.
Other applications include server load balancing, process
allocation, monitoring power supply, automatic updating of
software and drivers, pre-failure warning, memory error-correction,
automated system backup and recovery, etc.
One area where autonomic computing can contribute significantly
is Grid Computing. Grids, empowered with the self-managing
capabilities of autonomics, can revolutionise computing. And
the applications are not restricted to the IT industry
alone.
There are several Grid Computing initiatives underway. The
University of Pennsylvania, for instance, is building a powerful
grid that aims to bring advanced methods of breast cancer
diagnosis and screening to patients, while reducing costs.
The Grid is a utility-like service delivered over the Internet,
enabling thousands of hospitals to store mammograms in digital
form. The Grid will provide analytical tools that help physicians
diagnose individual cases and identify “cancer clusters”
in the population.
Another example is the North Carolina Bioinformatics Grid, accessible
to thousands of researchers and educators to help accelerate
the pace of genomic research that could lead to new medicines
to combat diseases and to more nutritious foods to feed
the world’s population.
But autonomic computing development faces some challenges
too and it may be some time before we see its implementation
in applications like Grid Computing.
Challenges
Analysts say the days of widespread autonomic computing usage
are still way off. However, we are beginning to see elements
of it in business systems (See box: An update on IBM’s Project
eLiza). Meanwhile, the challenges faced in developing such
systems are mainly those dealing with the management of complex
systems operating in heterogeneous environments.
The other challenge is to convince customers that autonomic
computing actually simplifies systems management and can cut
costs in a manner described earlier in this article. IT managers
and administrators may be reluctant to give up control of
the systems they manage.
The transition to the new self-healing systems must cause
minimal or no disruption. Teething problems will only shatter
an IT manager’s faith in these systems.
Systems with autonomic capabilities (such as IBM’s servers)
are already available in the Indian market. The next few months
will determine the acceptance of autonomic computing as such
systems begin to be deployed in enterprises. But we can look
forward to the days of highly simplified network management
and rapid systems deployment, thanks to the self-healing,
self-configuring, and self-tuning characteristics of the next
generation of computing systems.
An update on IBM’s Project eLiza
Project eLiza is IBM’s autonomic computing initiative.
The ultimate objective of this project is to develop
hardware, software, and networks (total solutions)
that will be able to allocate computing resources
as needed, safeguard data, and ensure business
continuity in case of a disaster. Much of this
technology stems from IBM mainframes and mainframe
class servers (zSeries).
So serious is IBM about autonomic computing that
earlier this year it created an Autonomic Group
and appointed a vice president to head it. IBM
is spending approximately 25 percent of its server
R&D costs on developing Project eLiza and
related autonomic capabilities. Some of its R&D
efforts have already been incorporated into products
that are in the market.
We asked Van Symons, IBM’s global executive
for Project eLiza, for an update.
“When we announced Project eLiza a little more than
a year ago, we started incorporating autonomic
capabilities at the infrastructure level. That
was Phase 1. We’re now moving to Phase 2,
which is building autonomic capabilities on the
enterprise level,” informs Symons.
And here’s what IBM has achieved to date:
- eLiza technology is incorporated in the IBM
  pSeries, xSeries and iSeries servers. It manages
  heterogeneous server integration and maintains
  cross-server security.
- eLiza e-business-management services match IT
  resource availability with business requirements
  to make sure business-performance levels are
  met.
- IBM’s Intelligent Resource Director (IRD),
  a self-managing operating system for the eServer
  z900, allows the server to dynamically reallocate
  processing power to a given application as workload
  demands increase.
- Chipkill technology, which is used in IBM mainframes
  to virtually eliminate memory failures, is now
  available on IBM eServer p620 and p660.
- Software rejuvenation enables IBM’s Windows
  servers to reboot automatically when they sense
  an impending problem that could crash the server.
- eLiza technology is also incorporated in IBM’s
  Tivoli systems management software. It enables
  Tivoli to use the information it gathers to
  recognise its optimal configuration and fine-tune
  parameters accordingly.
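The software rejuvenation item in the list above hinges on sensing an impending problem early enough to restart cleanly. One simple way to sketch that is to extrapolate a resource trend and reboot when exhaustion is projected to be close. The code below is only an assumption about how such a check could look, not IBM's implementation; the free-memory signal, the function names and the `horizon` parameter are hypothetical.

```python
def samples_until_exhaustion(free_memory_mb):
    """Estimate how many sampling intervals remain before free memory reaches zero.

    Uses the average slope between the first and last sample. Returns None
    when there is too little data or memory is not shrinking.
    """
    if len(free_memory_mb) < 2:
        return None
    slope = (free_memory_mb[-1] - free_memory_mb[0]) / (len(free_memory_mb) - 1)
    if slope >= 0:
        return None
    return free_memory_mb[-1] / -slope


def should_rejuvenate(free_memory_mb, horizon=10):
    """Recommend a planned restart if exhaustion is projected within `horizon` samples."""
    remaining = samples_until_exhaustion(free_memory_mb)
    return remaining is not None and remaining <= horizon


# A steady leak of 100 MB per sample leaves about 7 samples of headroom.
print(should_rejuvenate([1000, 900, 800, 700]))
# A slow leak of 10 MB per sample is not yet urgent.
print(should_rejuvenate([1000, 990, 980]))
```

A planned restart at a quiet moment is far less disruptive than a crash under load, which is the trade-off rejuvenation relies on.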
Still to come
“This year we’re incorporating an autonomic security
feature called Enterprise Identity Mapping (EIM),
which will allow a single log-on across an infrastructure.
We’re also deploying a feature called Enterprise
Workload Manager, or eWLM, which makes a heterogeneous
infrastructure self-optimising. It’s the
industry’s first heterogeneous workload manager,”
adds Symons.
Here are some Project eLiza objectives for
Phase 2:
- The availability of eWLM by year-end. eWLM monitors
  groups of servers, managing the machines and
  moving work between them without the aid of
  human administrators. The system operates on
  UNIX, Windows and Linux server platforms, as
  well as on competing servers from HP and Compaq.
  Workload management, which is available for
  IBM’s mainframes, is being extended to
  heterogeneous platforms.
- The self-healing cellular architecture of Blue
  Gene, a high-speed machine now under construction
  at IBM Research, will detect failed processors
  and redistribute work to compensate for their
  loss without interruption. Blue Gene, a supercomputer
  that will be used in genetic and molecular research,
  will have many processors.
- Tivoli Intrusion Manager, an integrated approach
  to security that reduces the overall complexity
  of security management.
- IBM will form a consortium of experts from the
  IT industry to help guide and shape the future
  of autonomic computing. It will collaborate
  with its thousands of partners in all areas
  of the industry, as well as with leading national
  and university labs around the world. IBM is
  focusing its R&D efforts around the challenges
  of autonomic computing.
- IBM will distribute 75,000 copies of the autonomic
  manifesto to the top technical and scientific
  minds in the world.
- IBM organised a scientific conference focusing
  on autonomic computing in early 2002. It will
  continue to host several conferences and symposiums
  on progress and challenges, and continue to
  encourage the best minds to actively participate
  in research outside of IBM.
- Company-wide, IBM is developing products that
  contain key autonomic features through Project
  eLiza.
Intel builds ‘Autonomic Hooks’ into Itanium 2
The recently launched Intel Itanium 2 processor is
future-ready for autonomic computing. Intel officials
explain that the new server processor has built-in
“Autonomic Hooks”. Intel calls the autonomic
capabilities Machine Check Architecture (MCA).
It allows the system to continue executing transactions
as it recovers from several error conditions.
A key feature of MCA is its ability to detect
and correct errors in a way that the operating
system software and other important elements of
the server system can recognise.
MCA is capable of analysing data and responding
to it in a way that enables higher overall system
reliability and availability. Here are a few highlights
of this technology:
- Firmware and OS involvement in correcting and
  recovering from complex platform errors.
- Well-defined flow for reporting and logging
  errors to the operating system.
- Extensive hardware error detection and correction
  on all major data structures.
- Extensive error management in hardware, firmware
  and OS.
- Prevents loss, corruption and downtime.
- Built on an open and extensible framework.
- High reliability, availability, serviceability
  and manageability.
Machine Check Architecture operates at three levels:
- Operating System: The OS logs errors and initiates
  recovery.
- Firmware: Seamlessly handles errors.
- Hardware: CPU and chipset offer extensive ECC
  coverage and parity protection.
Since MCA is an open architecture, software developers
and OEMs can leverage this to build complete autonomic
systems. Developers can customise this to meet
the reliability requirements of systems that businesses
and organisations rely upon.
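The three levels described above can be pictured as an escalation chain: an error is offered to the hardware first, then to the firmware, then to the operating system, and each attempt is recorded. The sketch below is a conceptual model only. It does not use or describe Intel's actual MCA interfaces, and the error kinds and the capability table are invented for illustration.

```python
# Order in which levels get a chance to deal with an error.
LEVELS = ["hardware", "firmware", "os"]

# Hypothetical table of which error kinds each level can deal with.
CAPABILITIES = {
    "hardware": {"single_bit_memory_error"},
    "firmware": {"single_bit_memory_error", "bus_parity_error"},
    "os": {"single_bit_memory_error", "bus_parity_error",
           "uncorrectable_memory_error"},
}


def escalate(error_kind):
    """Offer the error to each level in turn.

    Returns (handling_level, levels_tried); handling_level is None when
    no level can deal with the error.
    """
    tried = []
    for level in LEVELS:
        tried.append(level)
        if error_kind in CAPABILITIES[level]:
            return level, tried
    return None, tried


print(escalate("single_bit_memory_error"))
print(escalate("uncorrectable_memory_error"))
```

Because the framework is described as open, an OEM could in principle extend a table like this with its own error kinds and recovery steps.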