
Tuesday, February 9, 2010

What is Clustering?



-> What is clustering?
     In computing, clustering is the use of multiple computers (typically PCs or workstations), multiple storage devices, and redundant interconnections to form what appears to users as a single, highly available system. Cluster computing can be used for load balancing and parallel processing as well as for high availability. In some cases, a clustering approach can help an enterprise achieve 99.999% availability. A central idea of cluster computing is that, to the outside world, the cluster appears to be a single system.
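
To put the 99.999% figure in perspective, the short Python sketch below converts an availability percentage into the downtime it permits per year. The function name `allowed_downtime_minutes` and the 365-day year are assumptions made for this illustration only.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: a 365-day year


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the minutes of downtime per year permitted by an availability target."""
    unavailable_fraction = 1 - availability_percent / 100
    return MINUTES_PER_YEAR * unavailable_fraction


for target in (99.0, 99.9, 99.999):
    minutes = allowed_downtime_minutes(target)
    print(f"{target}% available -> {minutes:.2f} minutes of downtime per year")
```

Under these assumptions, 99.999% ("five nines") leaves only about five minutes of downtime per year, which is why redundancy across several machines is usually needed to reach it.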
       

       Clustering can be implemented at different levels of the system, including hardware, operating systems, middleware, systems management and applications. The more layers that incorporate clustering technology, the more complex the whole system is to manage. Implementing a successful clustering solution requires specialists in all of the technologies involved (i.e. hardware, networking, software).
     
-> History of Clustering
     Looking back at the history of clustering, the first commodity clustering product was ARCnet, developed by Datapoint in 1977. ARCnet wasn't a commercial success, and clustering didn't really take off until DEC released its VAXcluster product in the 1980s for the VAX/VMS operating system. The ARCnet and VAXcluster products supported not only parallel computing but also shared file systems and peripheral devices. They were meant to give you the advantages of parallel processing while maintaining data reliability and uniqueness. VAXcluster, now VMScluster, is still available on OpenVMS systems from HP running on Alpha and Itanium systems. The history of cluster computing is closely tied to the evolution of networking technology. As networking technology has become cheaper and faster, cluster computers have become significantly more attractive.

-> Typical Architecture of Clustering
      A cluster is a type of parallel or distributed processing system that consists of a collection of interconnected stand-alone computers working cooperatively as a single, integrated computing resource.

A node:
  •   A single-processor or multiprocessor system with memory, I/O facilities, and an OS

A cluster:
  •   Generally two or more computers (nodes) connected together, either in a single cabinet or physically separated and connected via a LAN
  •   Appears as a single system to users and applications
  •   Provides a cost-effective way to gain features and benefits



Three principal features usually provided by cluster computing are:
1) Availability
       Availability comes from the cluster operating as a single system that continues to provide services even when one of its individual computers is lost to a hardware failure or another cause.
2) Scalability
       Scalability comes from the inherent ability of the overall system to allow new components, such as computers, to be added as the overall system's load increases.
3) Simplification
       Simplification comes from the ability of the cluster to let administrators manage the entire group as a single system.

-> Types of Clusters
     Clusters can be categorized in many ways. I would classify them into four basic categories.
a) High-availability (HA) clusters
     High-availability clusters (also known as failover clusters) are implemented primarily to improve the availability of the services that the cluster provides. They operate by having redundant nodes, which are then used to provide service when system components fail. The most common size for an HA cluster is two nodes, which is the minimum required to provide redundancy. HA cluster implementations attempt to use redundancy of cluster components to eliminate single points of failure.
     There are commercial implementations of high-availability clusters for many operating systems. The Linux-HA project is one commonly used free software HA package for the Linux operating system.
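
The failover idea described above can be sketched in a few lines of Python. This is a toy model, not how Linux-HA or any real product works: the class `FailoverCluster`, its methods, and the node names are all hypothetical, and health is set by hand rather than detected by heartbeats.

```python
class FailoverCluster:
    """Toy model of an HA cluster: the first healthy node serves requests."""

    def __init__(self, node_names):
        if len(node_names) < 2:
            raise ValueError("redundancy needs at least two nodes")
        # Insertion order is the preference order: earlier nodes are preferred.
        self.healthy = {name: True for name in node_names}

    def mark_failed(self, name):
        self.healthy[name] = False

    def mark_recovered(self, name):
        self.healthy[name] = True

    def active_node(self):
        """Return the preferred healthy node, or None if every node is down."""
        for name, is_up in self.healthy.items():
            if is_up:
                return name
        return None


cluster = FailoverCluster(["node-a", "node-b"])
print(cluster.active_node())   # node-a serves while it is healthy
cluster.mark_failed("node-a")
print(cluster.active_node())   # service fails over to node-b
```

With two nodes, the service survives the loss of either one, but not both at once; that is the sense in which two nodes is the minimum for redundancy.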
b) Load-balancing clusters
     Load balancing is when multiple computers are linked together to share a computational workload or to function as a single virtual computer. Physically they are multiple machines, but from the user's side they function as a single virtual machine. Requests initiated by users are managed by, and distributed among, all the standalone computers that form the cluster. This balances the computational work among the different machines, improving the performance of the cluster system.
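
One simple way to distribute requests is round-robin, where each new request goes to the next node in turn. The sketch below assumes that policy purely for illustration; real load balancers may also weigh node load or health, and the class name `RoundRobinBalancer` and the node names are hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand each incoming request to the next node in a fixed rotation."""

    def __init__(self, nodes):
        if not nodes:
            raise ValueError("at least one node is required")
        self._next_node = cycle(nodes)

    def dispatch(self, request):
        """Pick the node that should handle this request."""
        node = next(self._next_node)
        return node, request


balancer = RoundRobinBalancer(["node-1", "node-2", "node-3"])
assignments = [balancer.dispatch(f"request-{i}") for i in range(6)]
for node, request in assignments:
    print(f"{request} -> {node}")
```

Six requests across three nodes end up as two requests per node, which is the "balanced computational work" the paragraph above describes.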
c) Compute clusters
     Often clusters are used primarily for computational purposes, rather than for handling I/O-oriented operations such as web services or databases. For instance, a cluster might support computational simulations of weather or vehicle crashes. The primary distinction within compute clusters is how tightly coupled the individual nodes are. At one extreme, a single compute job may require frequent communication among nodes, which implies that the cluster shares a dedicated network, is densely located, and probably has homogeneous nodes. This cluster design is usually referred to as a Beowulf cluster. At the other extreme, a compute job uses one or a few nodes and needs little or no inter-node communication. This latter category is sometimes called "Grid" computing. Tightly coupled compute clusters are designed for work that might traditionally have been called "supercomputing". Middleware such as MPI (Message Passing Interface) or PVM (Parallel Virtual Machine) permits compute clustering programs to be portable to a wide variety of clusters.
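
As a rough illustration of the loosely coupled case, the sketch below splits one job into chunks, processes each chunk independently, and combines the partial results. It is only a stand-in: worker threads on a single machine play the role of nodes, whereas a real cluster would use middleware such as MPI or PVM. The function names and the sum-of-squares workload are assumptions for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(data, num_nodes):
    """Divide data into num_nodes nearly equal contiguous chunks."""
    chunk_size, remainder = divmod(len(data), num_nodes)
    chunks, start = [], 0
    for i in range(num_nodes):
        # The first `remainder` chunks each take one extra element.
        end = start + chunk_size + (1 if i < remainder else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def node_task(chunk):
    """Work done independently on one 'node': here, a sum of squares."""
    return sum(x * x for x in chunk)


def run_job(data, num_nodes=4):
    """Scatter the data, compute partial results in parallel, then combine."""
    chunks = split_into_chunks(data, num_nodes)
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partial_results = list(pool.map(node_task, chunks))
    return sum(partial_results)


print(run_job(list(range(1000))))
```

Because the chunks never need to talk to each other, this job would tolerate slow or distant nodes; a tightly coupled simulation that exchanges data at every step would not.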
d) Grid computing
     Grids are usually computer clusters, but they are more focused on throughput, like a computing utility, than on running fewer, tightly coupled jobs. Grids often incorporate heterogeneous collections of computers, possibly distributed geographically and sometimes administered by unrelated organizations.

-> Some business applications of clusters

  •   Google search engine
  •   Petroleum reservoir simulation
  •   Protein Explorer
  •   Earthquake simulation
  •   Image rendering, and more

Cheers,
All the Best ..... :) 
