An Introduction to Distributed Systems

Keet Malin Sugathadasa
Sep 25, 2017

With the growing number of workloads and processes running inside a typical system today, it is hard for a single machine to handle them all and still have capacity to spare. This is where distributed systems come into play. Distributed systems now underpin much of the computing industry, and there are many reasons why they are so important and in such demand in today's applications. This blog covers the basic information required to understand what distributed systems are and how they work. Some of the basic concepts are described separately for the reader's understanding. The following topics are addressed in this blog.

What is a Distributed System
Why do we need Distributed Systems
Advantages and Disadvantages
Distributed System Communication
Basis of Distributed Systems
Parallel vs. Distributed System
Centralized vs. Distributed Systems
Fallacies of Distributed Systems

What is a Distributed System

A distributed system is a software system that interconnects a collection of heterogeneous, independent computers, where coordination and communication between the computers happen only through message passing, with the intention of working towards a common goal. The idea behind distributed systems is to present a single coherent system to the outside world. The set of independent computers, or nodes, is interconnected through a Local Area Network (LAN) or a Wide Area Network (WAN), as depicted below. A local area network might be connected through twisted pair, coaxial cable or fiber optics. Wide area networks span longer distances and may use technologies such as leased lines, long-haul fiber or satellite links. The underlying networking technologies these systems use could be Asynchronous Transfer Mode (ATM), Ethernet and so on.

Now let's have a look at some differences between general multi-core computers and distributed systems. Unlike multi-core computers, distributed systems are said to be more loosely coupled in terms of hardware. Each node in a distributed system is a complete computer, with a full complement of peripherals. Also, the nodes of a distributed system may be spread all around the world and are not necessarily located in close proximity. Another important factor is that distributed system nodes may run different operating systems, and each can have its own file system.

According to Leslie Lamport, the definition of a distributed system is that, "A system is distributed if the message transmission time T(m) is NOT negligible compared to the time between events T(e) in a single process." The meaning of this is that every process takes some time to process an event, so the time between two events, T(e), is roughly the time taken to execute one event. What Lamport is saying is that the time taken to pass a message from one node to another cannot be ignored next to the event execution time; in practice it is often significantly greater. This is depicted in the image given below.

The time T(e) depends on the processing power of the nodes. The time T(m) depends on the network connectivity within the LAN or WAN.

So, why is this definition given by Lamport important? Its importance lies in the design of algorithms that span the nodes in a network. Since the message transmission time is often significantly larger than the event computation time on a single machine, the distributed applications we develop need algorithms structured so that the computation done between messages takes longer than passing the messages themselves. Only then can we reap the benefits of parallelism in distributed computing, because the system won't waste most of its time passing messages back and forth.
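To make this trade-off concrete, here is a rough back-of-the-envelope sketch in Python. It is a simplified model rather than a measurement: it assumes the work splits evenly across the nodes and that every message costs a fixed T(m). The function names and all the numbers are illustrative assumptions.

```python
def parallel_time(total_work_s, nodes, messages_per_node, t_m):
    """Estimated wall-clock time when work is split evenly across nodes.

    Simplified model: each node computes its share of the work and
    additionally pays t_m seconds for every message it exchanges.
    """
    compute = total_work_s / nodes
    communicate = messages_per_node * t_m
    return compute + communicate


def speedup(total_work_s, nodes, messages_per_node, t_m):
    """How many times faster the distributed run is than a single machine."""
    return total_work_s / parallel_time(total_work_s, nodes, messages_per_node, t_m)


# Coarse-grained: few messages, lots of computation between them.
coarse = speedup(total_work_s=100.0, nodes=10, messages_per_node=10, t_m=0.01)
# Fine-grained: many small messages, little computation between them.
fine = speedup(total_work_s=100.0, nodes=10, messages_per_node=10_000, t_m=0.01)

print(f"coarse-grained speedup: {coarse:.2f}")
print(f"fine-grained speedup:   {fine:.2f}")
```

Under these assumed numbers, the coarse-grained design gets close to the ideal 10x speedup, while the fine-grained one ends up slower than a single machine because messaging dominates.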

The applications that run on different nodes in a distributed system are known as distributed applications or distributed programs. Distributed programming is the process of writing such distributed applications. Distributed computing is where we use distributed systems to solve complex tasks.

Why do we need Distributed Systems?

You have all used Facebook, right? If not, consider a system like Google. These systems provide services to users every day, and there are millions and millions of requests coming in every hour. So, what if it was all handled by a single machine? Just imagine the limitations that Google and Facebook would have to face.

One solution is to use a custom machine with very high specs, but the problem is that it will be very expensive, it will still have hard limits, and it will be difficult to scale. So, the best option is to divide the work among several computers that can communicate with each other and get the work handled as needed. If a machine fails, the others will redistribute the work accordingly, to ensure that every request is processed. This is a major advantage of distributed systems.

Some of the major reasons why distributed systems are important are given below.

1) Some systems are inherently distributed

Current systems that provide services like file sharing, crowdsourcing, mobile networks and IoT devices are inherently distributed, because their users, devices and data are spread across many locations.

2) Some systems are too big for a single machine

As in the example given above, when applications and systems get bigger, it is hard for a single machine to manipulate everything and manage all the workload. So it is required to go for a bigger system or a distributed approach to cater to all the incoming workload.

3) For Scalability

Applications like Google, Amazon and YouTube need to expand over time, and scaling up and down is important at times. Distributed systems make it easier to scale the system.

4) For better Quality of Service and Quality of Experience

When the workload increases on a single machine, it becomes important to manage the network and the performance of the system. If users get poor service and poor performance, that is a major issue. For example, if YouTube videos start stuttering due to jitter, users will find them uncomfortable to watch. Distributing the load across multiple machines helps keep the service responsive.

5) For Reliability

In computer science, the availability of machines can never be fully guaranteed. If we have a system made up of a single machine and that machine goes down, the entire system fails. For this reason, moving to a distributed environment is very important. In distributed systems, even if a node fails, the system can still manage and balance the workload as needed.

6) Economic Reasons

Computers get more powerful every year, but the most powerful machines carry a steep price. So, if we are to manage a high workload on a single machine, we need a powerful machine, which in turn will cost a lot. Going for a distributed system can be very economical, since you can have multiple cheaper nodes perform a given task.

7) System Evolution

During the early days, computers were very similar and connecting them to each other was not a big issue. But now, many kinds of computing devices have come into play, and heterogeneous computers are already a part of the ecosystem. Distributed systems provide the capability to connect machines running heterogeneous operating systems, allowing all of them to function as a single unit.
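The reliability argument in reason 5 can be sketched in a few lines of Python. This is only a toy illustration: the "nodes" here are plain functions inside one program, and the names NodeDown, make_node and dispatch are hypothetical. A real system would talk to other machines over the network and would need timeouts and health checks.

```python
class NodeDown(Exception):
    """Raised by a hypothetical node that cannot serve a request."""


def make_node(name, healthy=True):
    """Build a stand-in for a node: a function that serves requests or fails."""
    def handle(request):
        if not healthy:
            raise NodeDown(name)
        return f"{name} handled {request}"
    return handle


def dispatch(request, nodes):
    """Try each node in turn and return the first successful response."""
    for node in nodes:
        try:
            return node(request)
        except NodeDown:
            continue  # this node failed; fall through to the next one
    raise RuntimeError("all nodes are down")


# node-a is down, so the request is served by node-b instead.
nodes = [make_node("node-a", healthy=False), make_node("node-b"), make_node("node-c")]
result = dispatch("GET /home", nodes)
print(result)
```

With a single machine there would be nothing to fall back to. Here the request is still served as long as at least one node is healthy.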

Advantages and Disadvantages

Advantages:

Resource sharing: Sharing hardware and software resources
Openness: Flexibility of using hardware and software of different vendors
Concurrency: Concurrent Processing to enhance performance
Scalability: Increased throughput by adding new resources
Fault Tolerance: The ability to continue in operation after a fault has occurred

Disadvantages:

Complexity: They are more complex than centralized systems
Security: More susceptible to external attack
Manageability: More effort required for system management
Unpredictability: Unpredictable responses depending on the system organization and network load

Distributed System Communication

If we look at a normal computer, the programs in concurrent applications communicate via shared state or shared memory. But the nodes of a distributed system have no shared memory between them, so the only communication mechanism is message passing. If a node wants to communicate with another, it has to pass messages through the LAN or WAN. Some of the well-known message passing techniques are plain HTTP, RPC-like connectors, and message queues.

What is a Message Queue

A message is a piece of information sent by one process to another in a network. In distributed systems, this means passing a message from one node to another node. A message queue is a platform that allows senders and receivers to exchange information. A sender posts messages to a queue and the receiver retrieves them from the queue. This is one of the main platforms that allow distributed systems to carry out their communication and coordination.

Message queues are used to transfer data between components in a distributed system. Each data segment is broken down into chunks, and the sender submits each chunk to the message queue. This is called "pushing". When the receiver wants to request information or chunks, it "pulls" them from the queue.
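The push and pull pattern described above can be sketched with Python's standard queue module. This is a minimal, single-process illustration: the in-memory queue.Queue stands in for a real message queue that would sit between machines on a network, and CHUNK_SIZE, DONE, sender and receiver are names invented for this example.

```python
import queue
import threading

CHUNK_SIZE = 4   # illustrative chunk size, in bytes
DONE = None      # sentinel telling the receiver that no more chunks follow


def sender(data, mq):
    """Break the data into chunks and push each one onto the queue."""
    for start in range(0, len(data), CHUNK_SIZE):
        mq.put(data[start:start + CHUNK_SIZE])
    mq.put(DONE)


def receiver(mq, out):
    """Pull chunks from the queue until the sentinel arrives."""
    while True:
        chunk = mq.get()
        if chunk is DONE:
            break
        out.append(chunk)


mq = queue.Queue()
received = []
consumer = threading.Thread(target=receiver, args=(mq, received))
consumer.start()
sender(b"hello distributed world", mq)
consumer.join()

# Reassemble the original data from the chunks that were pulled.
message = b"".join(received)
print(message)
```

The sender and receiver never call each other directly; they only share the queue. That decoupling is what lets the two sides run at different speeds, or on different machines once the queue is a networked service.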

Basis of Distributed Systems

The basis of Distributed Architecture is its transparency, reliability, and availability. In this section let us focus more on transparency issues.

Parallel vs. Distributed System

The terms parallel computing, concurrent computing and distributed computing overlap considerably and cannot always be clearly distinguished from each other. For example, a single node in a distributed system might run several tasks concurrently, while many nodes in the distributed network run in parallel. The two terms, parallel systems and distributed systems, can be related to each other in terms of how tightly the machines are coupled.

The major difference between parallel computing and distributed computing is given in the two points below.

In parallel computing, all processors may have access to a shared memory to exchange information between processors.
In distributed computing, each processor has its own private memory (distributed memory). Information is exchanged by passing messages between the processors.
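The contrast between the two points above can be illustrated with a small Python sketch. It is only an analogy under stated assumptions: threads in one process stand in for processors, a lock-protected variable stands in for shared memory, and a queue.Queue stands in for the network channel that real distributed nodes would use. All names are illustrative.

```python
import queue
import threading

numbers = list(range(1, 101))
halves = [numbers[:50], numbers[50:]]

# Shared-memory style: workers update one shared variable, guarded by a lock.
shared = {"total": 0}
lock = threading.Lock()


def add_to_shared(part):
    partial = sum(part)
    with lock:
        shared["total"] += partial


# Message-passing style: workers keep private state and send results as messages.
mailbox = queue.Queue()


def send_partial(part):
    mailbox.put(sum(part))


def run(worker):
    """Run one worker thread per half of the data and wait for both to finish."""
    threads = [threading.Thread(target=worker, args=(half,)) for half in halves]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


run(add_to_shared)
run(send_partial)
message_total = mailbox.get() + mailbox.get()

print(shared["total"], message_total)
```

Both approaches compute the same sum. In the first the workers must coordinate access to common memory, while in the second nothing is shared and the result is assembled purely from the messages received.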

Centralized vs. Distributed Systems

Fallacies of Distributed Systems

These are false assumptions that people commonly make regarding distributed systems. Because of them, many people drift away from the real concept of distributed computing and don't see its true potential.