Grid Computing 101

Written By

Steven Vaughan-Nichols

Dec 29, 2004

Grid computing is a set of old ideas with new technology and business faces.

Technically speaking, grid computing enables programs to be spread out over multiple computers via a network so that massive jobs can be done as efficiently as possible.

Does that sound a lot like a traditional cluster to you? If it does, you’re right—grid computing is essentially clustering writ large.

There are, however, several important differences. Instead of multiple processors bound together by a system bus or by a high-speed fabric such as iSCSI or Fibre Channel, a grid’s computers can be thousands of miles apart and tied together with conventional Internet networking technologies such as OC-3 (Optical Carrier-3, a 155.52 Mbps network technology) or even a lowly T1 (1.544 Mbps).
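To put those link speeds in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes nominal line rates (155.52 Mbps for OC-3, 1.544 Mbps for T1) and ignores protocol overhead and latency; the 100 MB payload and the function name are illustrative assumptions only.

```python
# Nominal line rates in megabits per second (assumed, not measured).
LINK_MBPS = {"OC-3": 155.52, "T1": 1.544}


def transfer_seconds(size_megabytes: float, link_mbps: float) -> float:
    """Ideal time to move a payload, ignoring protocol overhead and latency."""
    megabits = size_megabytes * 8  # 1 byte = 8 bits
    return megabits / link_mbps


if __name__ == "__main__":
    # Hypothetical 100 MB input file shipped to a remote grid node.
    for name, rate in LINK_MBPS.items():
        print(f"{name}: {transfer_seconds(100, rate):.1f} seconds")
```

Under these assumptions the same file takes seconds over OC-3 and several minutes over a T1, which is why grid jobs tend to favor work that needs much computing per byte moved.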

In addition, in a cluster, the systems tend to all be the same. For example, the clusters I cut my teeth on in the early ’80s, DEC VAX-11/785 minicomputers running VMS 4.1, had identical hardware.

Today, it’s much the same. Typical Linux-based Beowulf clusters, for example, are built inexpensively from commodity hardware, such as Pentium chips, tied together with standard network technologies like Fast Ethernet.

Usually people use clusters for one of two purposes: HA (high availability) for greater reliability or HPC (high-performance computing) for faster processing. Indeed, according to the latest Top500 directory of top supercomputers, most of the fastest supercomputers—such as the current leader, IBM’s BlueGene/L—are actually clusters.

While a grid may have the same goals of HA and HPC, the component systems do not have to share the same architecture or operating systems. For example, with United Devices Inc.’s Grid MP Enterprise, users can run grid applications across heterogeneous systems running 32-bit Windows and Linux on x86, AIX on POWER, and Solaris on SPARC.

Again, in some ways, this may sound like old hat to you. Distributed computing projects, such as SETI@home, have long enabled users running everything from OS/2 to HP-UX to Windows 95 to tackle small parts of huge jobs.

With SETI and similar projects, the machines are dedicated to a single task. In a grid, resources can be shared dynamically to address multiple problems.

“The goal is to create the illusion of a simple yet large and powerful, self-managing, virtual computer out of a large collection of connected heterogeneous systems sharing various combinations of resources,” Viktors Berstis, an IBM software engineer, said in the IBM Redbook Fundamentals of Grid Computing.

To make this happen, a grid uses a program that works in concert with the various operating systems to coordinate the efforts of various machines. Typically, the program enforces a set of standards and protocols to establish how a system shares resources. IBM, for example, uses those of the OGSA (Open Grid Services Architecture).

The heart of a grid is its job scheduler. With a scheduler, the system allocates resources to the various jobs. A system with good scalability will divide up jobs between a grid’s systems so that its computers will be used efficiently and won’t sit around idle.

There are several ways to do this. One is “scavenging”: idle machines signal the scheduler that they’re available for more work. Another is “reservations”: systems are preassigned to a schedule for efficient workflow. In practice, some grids use both, combined with dynamic resource allocation.
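The difference between scavenging and reservations can be sketched in a few lines of Python. This is a toy model, not how any particular grid product works; the `Scheduler` class, its method names, and the job and machine names are all hypothetical.

```python
from collections import deque


class Scheduler:
    """Toy scheduler: reserved jobs go to their machine first; idle machines scavenge the rest."""

    def __init__(self):
        self.queue = deque()      # jobs waiting for any idle machine
        self.reservations = {}    # machine name -> deque of jobs preassigned to it
        self.assignments = []     # (machine, job) pairs in the order they were handed out

    def submit(self, job):
        """Add a job that any machine may pick up."""
        self.queue.append(job)

    def reserve(self, machine, job):
        """Preassign a job to one specific machine."""
        self.reservations.setdefault(machine, deque()).append(job)

    def report_idle(self, machine):
        """Called by a machine with spare cycles. Returns a job, or None if there is no work."""
        reserved = self.reservations.get(machine)
        if reserved:                      # reservation takes priority
            job = reserved.popleft()
        elif self.queue:                  # otherwise scavenge from the shared queue
            job = self.queue.popleft()
        else:
            return None
        self.assignments.append((machine, job))
        return job


if __name__ == "__main__":
    s = Scheduler()
    s.submit("render-1")
    s.submit("render-2")
    s.reserve("node-b", "nightly-report")
    for node in ("node-a", "node-b", "node-b", "node-a"):
        print(node, "->", s.report_idle(node))
```

In this sketch a machine always works off its own reservations before taking shared work, which is one simple way to combine the two approaches.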

With dynamic resource allocation, more systems are brought in to deal with a problem as the workload increases. For example, if a credit card system starts to be overwhelmed during the holiday rush, more systems can be called in to make certain that the charges keep flowing.
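A minimal sketch of that kind of dynamic allocation follows, again in Python. It assumes a simple policy of sizing the pool to the backlog; the function names and the figure of 100 transactions per machine are illustrative assumptions, not taken from any real system.

```python
import math


def machines_needed(pending_jobs: int, jobs_per_machine: int, minimum: int = 1) -> int:
    """Smallest pool size that covers the backlog, never below `minimum`."""
    if jobs_per_machine <= 0:
        raise ValueError("jobs_per_machine must be positive")
    return max(minimum, math.ceil(pending_jobs / jobs_per_machine))


def rebalance(active: int, pending_jobs: int, jobs_per_machine: int) -> int:
    """How many machines to add (positive) or release (negative)."""
    return machines_needed(pending_jobs, jobs_per_machine) - active


if __name__ == "__main__":
    # Holiday rush: 1,000 pending charges, 4 machines active, 100 charges per machine.
    print(rebalance(active=4, pending_jobs=1000, jobs_per_machine=100))
    # After the rush: the backlog shrinks, so machines can be released.
    print(rebalance(active=10, pending_jobs=250, jobs_per_machine=100))
```

A real grid would also weigh how long it takes to bring a machine online and what else that machine is doing, which this sketch leaves out.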

Grids can share more than just processors; they also can share drive space for both greater storage room and more robust data availability. Usually this is done with mountable networked file systems, such as AFS (Andrew File System), that old Unix distributed storage favorite; NFS (Network File System); or DFS (Distributed File System). The grid software, in turn, can provide virtual storage with an overarching file system that spans both drives and file systems.

A grid also can be used to set multiple computers to work on a single problem. In such cases, grids must support IPC (interprocess communication) between programs running on different systems. Typically, grids that support such activity borrow MPC (massively parallel computing) message-passing models.

Common supercomputer message-passing models that have been borrowed by grids are MPI (Message Passing Interface) and PVM (Parallel Virtual Machine).

Another advantage of this borrowing is that application developers don’t have to reinvent programs that can make use of a grid’s parallel computing resources.
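The message-passing style itself can be illustrated without any grid software. The Python sketch below is a standard-library stand-in, not MPI or PVM: threads play the part of remote systems and queues play the part of the network, and all names are hypothetical. A coordinator scatters slices of the input to workers, each worker sends back a partial result, and the coordinator gathers them.

```python
import queue
import threading


def worker(inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Receive chunks of numbers, send back partial sums, stop on None."""
    while True:
        chunk = inbox.get()                      # blocking receive
        if chunk is None:                        # sentinel: no more work
            break
        outbox.put(sum(x * x for x in chunk))    # send the partial result back


def sum_of_squares(numbers, n_workers: int = 3) -> int:
    """Split the work across workers that communicate only through messages."""
    inboxes = [queue.Queue() for _ in range(n_workers)]
    results = queue.Queue()
    threads = [threading.Thread(target=worker, args=(box, results)) for box in inboxes]
    for t in threads:
        t.start()
    # Scatter: one strided slice per worker, followed by the stop sentinel.
    for i, box in enumerate(inboxes):
        box.put(numbers[i::n_workers])
        box.put(None)
    for t in threads:
        t.join()
    # Gather: exactly one partial sum arrives from each worker.
    return sum(results.get() for _ in range(n_workers))


if __name__ == "__main__":
    print(sum_of_squares(list(range(10))))
```

The workers share no variables with the coordinator; everything they know arrives as a message, which is the property that lets the same program structure run across separate machines.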

To date, most grid systems, such as Sun Microsystems Inc.’s N1 Grid Engine and DataSynapse Inc.’s GridServer, have been proprietary designs. Recently, however, open-source approaches based on Linux systems have gained popularity.

The most influential of these efforts, which has IBM’s backing, is The Globus Alliance. This consortium creates open-source tools, the Globus Toolkit 3.2, for building grids. Globus uses Java and Web services to help developers create grid-capable applications.

Globus supporters are not the only ones working toward an open source-based grid. Dell, EMC, Oracle and Intel are working on the Linux-based “Project MegaGrid,” which will run on Dell’s PowerEdge servers.

The business case driving all of these efforts is the same: Provide customers with a utility model for their computing needs. Hewlett-Packard’s Adaptive Enterprise, IBM’s On Demand Business and Sun’s N1 offer different takes on grid’s central themes, but are all designed to provide low-cost computing power to customers.

This commercial aspect to grid is also relatively new. Traditionally, grids have been used in scientific and academic environments, where they shared the same jobs as their cousins, supercomputers and clusters.

Now, however, as the technology has matured and open source has brought the price of grid development down, companies are taking it to the marketplace. In particular, financial companies—with their vast need for real-time processing—have become important grid customers.

So it is that as Microsoft prepares to launch Bigtop, its grid computing initiative, the Redmond giant will face several opponents with mature technologies.

IBM is currently the grid leader, according to financial services magazine Waters, with Oracle and DataSynapse following. Thus, this is one market battle where Microsoft will face a stern test.
