Good morning. I am Divyesh Tailor, and today we will talk about our final year project, which is Clustering with Raspberry Pi. We will cover why we chose this project, a brief explanation, how everything works, and some limitations.
We chose this project for two main reasons: to learn about and experiment with parallelization, and to explore the effectiveness of SBCs, or single-board computers, like the Raspberry Pi, Odroid XU4 and Asus Tinker Board.
So what is a supercomputer?
The general definition states that a supercomputer is a computer with a high level of performance compared to a normal computer.
The speed of a supercomputer is measured in FLOPS, or floating-point operations per second.
One example of a supercomputer is Titan.
Titan is equipped with 18,688 CPUs and an equal number of GPUs, which give it a speed of 17.59 petaflops and a theoretical peak of 27 petaflops.
A normal computer runs at about 80 gigaflops.
Clustering basically means connecting a group of computers over a LAN so that they can share and process data at high speed using parallelization.
Parallelization is the act of designing a computer program or system to process data in parallel. Normally, computer programs compute data serially: they solve one problem, and then the next, then the next. If a computer program or system is parallelized, it breaks a problem down into smaller segments that can each independently be solved at the same time by discrete computing resources. When optimized for this type of computation, parallelized programs can arrive at a solution much faster than programs executing processes in serial.
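To make the serial versus parallel idea concrete, here is a minimal sketch in Python. It is not the code of this project: the function names and the sum-of-squares workload are made up for illustration, and it uses only the standard library on a single machine. It breaks one problem into segments, lets each worker solve its segment independently, and then combines the results.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def split_range(n, parts):
    """Split the numbers 0..n-1 into `parts` contiguous (start, stop) segments."""
    step, extra = divmod(n, parts)
    segments, start = [], 0
    for i in range(parts):
        # The first `extra` segments take one more item so nothing is left over.
        stop = start + step + (1 if i < extra else 0)
        segments.append((start, stop))
        start = stop
    return segments


def sum_of_squares(segment):
    """Solve one segment on its own: add up k*k for k in [start, stop)."""
    start, stop = segment
    return sum(k * k for k in range(start, stop))


def serial_total(n):
    """Serial version: a single worker walks through the whole range."""
    return sum_of_squares((0, n))


def parallel_total(n, workers=4, executor_cls=ProcessPoolExecutor):
    """Parallel version: one segment per worker, partial results added at the end.

    Pass executor_cls=ThreadPoolExecutor for a quick check without
    starting extra processes (threads will not speed up this workload).
    """
    with executor_cls(max_workers=workers) as pool:
        return sum(pool.map(sum_of_squares, split_range(n, workers)))
```

Both versions must return the same number. With the default process pool, call `parallel_total` from under an `if __name__ == "__main__":` guard; on a quad-core board the four segments can then run on four cores at once.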
This project uses 4 Raspberry Pis, which makes it a 4-node cluster.
After reading the name of the project, a question may come to your mind: "What is a cluster?"
The definition of a cluster states the following: "A cluster is a group of computers that are connected with each other and operate closely to act as a single computer. Speedy local area networks enhance a cluster of computers' abilities to operate at an exceptionally rapid pace."
In this project we used the Raspberry Pi 3 Model B+ because it has a 64-bit quad-core ARM-based processor running at 1.4 GHz, 1 GB of LPDDR2 RAM, onboard Bluetooth and WiFi, a Gigabit Ethernet port (limited to about 300 Mbps because it sits on the USB 2.0 bus) and much more for just 35 USD, which is about 3300 INR.
It is also as small as a credit card and consumes little power.
Here are some detailed specifications of the Model B+.
Now, the operating system of the Raspberry Pi.
The Raspberry Pi has its own OS called Raspbian.
Raspbian is a free operating system based on Debian optimized for the Raspberry Pi hardware. An operating system is the set of basic programs and utilities that make your Raspberry Pi run. However, Raspbian provides more than a pure OS: it comes with over 35,000 packages, pre-compiled software bundled in a nice format for easy installation on your Raspberry Pi.
External network, i.e. the LAN.
MPICH is an implementation of MPI, the Message Passing Interface, which enables nodes to send and receive messages and data as well as run programs as parallel processes. MPICH supports C, C++ and Fortran, and Python can use it through bindings such as mpi4py.
MPICH is the way for the nodes to communicate and share data between them.
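As a rough illustration of message passing, here is a small sketch using the mpi4py Python bindings. It is an assumption that mpi4py is installed on top of MPICH on every node; this is not the program we benchmarked, and the `chunk` helper and the 1 to 100 workload are made up for the example. Rank 0 plays the master: it scatters one chunk of numbers to every process, each process sums its own chunk, and the partial sums are gathered back on rank 0.

```python
def chunk(items, parts):
    """Deal `items` into `parts` lists, one list per MPI process."""
    return [items[i::parts] for i in range(parts)]


def main():
    # Assumption: the mpi4py package is available (it is not part of MPICH itself).
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()  # ID of this process: 0 .. size-1
    size = comm.Get_size()  # number of processes started by mpiexec

    # Only rank 0 (the master) prepares the work, one chunk per process.
    chunks = chunk(list(range(1, 101)), size) if rank == 0 else None

    # scatter: chunk i is sent to rank i, so every rank receives its own piece.
    my_numbers = comm.scatter(chunks, root=0)
    partial = sum(my_numbers)

    # gather: every rank sends its partial sum back to rank 0.
    partials = comm.gather(partial, root=0)
    if rank == 0:
        print("partial sums:", partials, "total:", sum(partials))


if __name__ == "__main__":
    try:
        main()
    except ImportError:
        print("mpi4py is not installed on this machine")
```

Saved under a hypothetical name such as `sum_mpi.py`, it would be started with something like `mpiexec -n 4 python3 sum_mpi.py`, which launches four copies of the script that talk to each other through MPI.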
This is the basic cluster design.
The master node and the compute nodes are connected to an Ethernet switch. All compute nodes are connected to the master node via a separate internal network, and the master node is connected to the external network.
We have seen some uses of this cluster, but for now we only plan to benchmark it to see how well it performs.
So here the Linpack benchmark is used to test the performance of the system. The result is measured in FLOPS.
This aims to approximate how fast a computer will perform when solving real problems.
So what is FLOPS, you might ask? FLOPS, or floating-point operations per second, is a measure of computer performance that is useful in fields of scientific computation.
As we saw, Titan runs at 17.59 petaflops, which is the speed of that supercomputer. A normal computer runs at around 80 gigaflops, my computer here runs at a maximum of 140 gigaflops, and a flagship phone runs at around 4 gigaflops.
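To put these numbers side by side, here is a small sketch that converts the figures quoted in this talk to plain FLOPS and compares them. The dictionary keys and the helper name are only for illustration.

```python
GIGA = 1e9   # 1 gigaflops = 10^9 floating-point operations per second
PETA = 1e15  # 1 petaflops = 10^15 floating-point operations per second

# Speeds as quoted in this talk, converted to plain FLOPS.
speeds_flops = {
    "Titan": 17.59 * PETA,
    "my computer": 140 * GIGA,
    "normal computer": 80 * GIGA,
    "flagship phone": 4 * GIGA,
}


def times_faster(a, b):
    """How many times faster machine `a` is than machine `b`."""
    return speeds_flops[a] / speeds_flops[b]


for name in ("my computer", "normal computer", "flagship phone"):
    print(f"Titan is about {times_faster('Titan', name):,.0f} times faster than a {name}")
```

Going by these figures, Titan is roughly 220,000 times faster than a normal computer.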
Linpack itself does not support testing in parallel, so for parallel computing HPLinpack, or Highly Parallel Linpack, is used.
Rmax: the performance in GFLOPS for the largest problem run on the machine.
Nmax: the size of the largest problem run on the machine.
N1/2: the problem size at which half of the Rmax execution rate is achieved.
Rpeak: the theoretical peak performance of the machine in GFLOPS.
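Here is a hedged sketch of how these numbers relate to each other. The node count, core count and clock speed come from the Raspberry Pi 3 Model B+ figures in this talk, but the value of 2 floating-point operations per cycle per core is only an assumption for illustration. Sizing the problem to about 80% of the total memory is a common rule of thumb for this benchmark, not a measured setting of our cluster.

```python
import math


def rpeak_gflops(nodes, cores_per_node, clock_ghz, flops_per_cycle):
    """Theoretical peak: every core completes flops_per_cycle on every clock tick."""
    return nodes * cores_per_node * clock_ghz * flops_per_cycle


def efficiency(rmax_gflops, rpeak):
    """Fraction of the theoretical peak that the benchmark actually reached."""
    return rmax_gflops / rpeak


def problem_size(total_memory_bytes, memory_fraction=0.8):
    """Largest N whose N x N matrix of 8-byte doubles fits in the given share of memory."""
    return int(math.sqrt(memory_fraction * total_memory_bytes / 8))


# 4 nodes, 4 cores each, 1.4 GHz; flops_per_cycle=2 is an ASSUMED value.
peak = rpeak_gflops(nodes=4, cores_per_node=4, clock_ghz=1.4, flops_per_cycle=2)
print(f"Rpeak under these assumptions: {peak:.1f} GFLOPS")

# 4 nodes with 1 GB each bound the matrix, and therefore Nmax.
n = problem_size(4 * 1024**3)
print(f"Largest problem size that fits in 80% of 4 GB: N = {n}")
```

Once the benchmark reports Rmax, `efficiency(rmax, peak)` gives the share of the theoretical peak that the cluster really delivered.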
1. Some algorithms can run more efficiently in a cluster environment.
2. Processing time decreases by taking advantage of parallel processing.
3. Bigger and more complex problems can be solved in a much easier and faster manner.
4. It allows for virtual testing, for example simulating a virus acting on human cells.
1. As we know, the Raspberry Pi has only 1 GB of onboard memory.
2. Some programs can only utilize 25% of the total cluster capability. They use one core per node, that is 4 of the 16 cores, which makes them inefficient.
3. A modern Core i7 can easily beat a 4-node Raspberry Pi cluster in single-threaded workloads.
4. Building a Raspberry Pi cluster on a bigger scale makes it more expensive.
5. It is only good for some types of parallel workloads.
1. Increase the efficiency of the cluster.
2. Create a more convenient power distribution system for each Raspberry Pi.