Integrating LVM with Hadoop to provide elasticity to DataNode storage

Hello guys, I am back with a new post on integrating LVM with Hadoop. In this article I am going to discuss what LVM is, what Hadoop is, and how integrating LVM with Hadoop helps.

What is LVM?

LVM stands for Logical Volume Management. It is a tool for managing logical volumes, which includes allocating disks, striping, mirroring and resizing logical volumes.

Physical volumes are combined into volume groups, and logical volumes are allocated from those groups. The usual exception is the /boot partition, which has traditionally been kept outside LVM because older boot loaders cannot read it. If the root (/) partition is on a logical volume, create a separate /boot partition that is not part of a volume group.

What is Hadoop?

Hadoop is an open source tool from Apache. The Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage.

Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures.

Why integrate LVM with Hadoop?

Integrating LVM with Hadoop gives the data node elastic storage: we can increase or decrease its storage dynamically, at any point in time.

Steps for creating the LVM:

Before starting this practical, we need a Hadoop cluster with a data node.

Step 1:

Go to the data node.

Add an external storage device or hard disk to the virtual machine instance or bare-metal machine.

We can check the mounted disks using the command

df -hT

We can check the disks attached to our system using the command

fdisk -l

Here we can see that "/dev/xvdb" is the attached disk.

Creating the LVM takes 3 steps:

  1. Create PV
  2. Create VG
  3. Create LV
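The three steps above can be sketched as one short script. This is only a sketch under assumptions: /dev/xvdb, MyVg and MyLv are the names used in this article, while the 10G size is a hypothetical value picked for illustration. The function only prints the commands, so nothing is changed on the disk until the plan has been reviewed.

```shell
# Sketch of the PV -> VG -> LV sequence.
DISK="/dev/xvdb"   # the disk attached in Step 1
VG_NAME="MyVg"     # volume group name used later in this article
LV_NAME="MyLv"     # logical volume name used later in this article
LV_SIZE="10G"      # hypothetical size, adjust to your disk

# Print the commands instead of running them, so the plan can be reviewed first.
lvm_plan() {
    echo "pvcreate $DISK"
    echo "vgcreate $VG_NAME $DISK"
    echo "lvcreate --size $LV_SIZE --name $LV_NAME $VG_NAME"
}

lvm_plan
```

Once the printed commands look right, they can be run one by one as root on the data node.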

Create the PV (Physical Volume) using the command

pvcreate /dev/xvdb

Here /dev/xvdb is the disk path.

We can see that the PV is created, and we can view its information.


Next, we need to create the VG (Volume Group) from the PVs. We can even add more than one PV to the volume group.

vgcreate <vg_name> <PV1> <PV2>
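As a small illustration of passing more than one PV, here is a sketch of a helper that builds the vgcreate command line from a VG name and any number of PVs. The helper name vgcreate_cmd and the second disk /dev/xvdc are hypothetical; only /dev/xvdb and MyVg come from this article. It prints the command rather than running it.

```shell
# Build a vgcreate command line from a VG name followed by one or more PVs.
vgcreate_cmd() {
    vg_name="$1"
    shift                        # the remaining arguments are the PVs
    echo "vgcreate $vg_name $*"
}

# /dev/xvdc is a hypothetical second disk, already initialised with pvcreate.
vgcreate_cmd MyVg /dev/xvdb /dev/xvdc
```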

Next, we need to create the LV (Logical Volume). The last argument is the volume group that the LV is allocated from.

lvcreate --size <size> --name <lv_name> <vg_name>

Finally, we format the logical volume and mount it on the data node folder.

For formatting, the command is "mkfs.ext4 <lv_path>"

Here we need to provide the complete path of the LV: /dev/<vg>/<lv>

For mounting, the command is "mount <lv_path> <folder>"

Here, the command is "mount /dev/MyVg/MyLv /dn1"

where /dn1 is the folder of the data node of the Hadoop cluster.
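The format and mount step can be sketched the same way. The paths /dev/MyVg/MyLv and /dn1 are the ones used in this article; the mkdir -p line is an added assumption, in case the folder does not exist yet. Since mkfs.ext4 erases whatever is on the target, the function only prints the commands so the path can be checked first.

```shell
# Sketch: format the LV and mount it on the data node folder.
LV_PATH="/dev/MyVg/MyLv"
MOUNT_DIR="/dn1"

# Print the commands rather than running them; mkfs.ext4 is destructive.
mount_plan() {
    echo "mkfs.ext4 $LV_PATH"
    echo "mkdir -p $MOUNT_DIR"   # assumption: create the folder if it is missing
    echo "mount $LV_PATH $MOUNT_DIR"
}

mount_plan
```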

Running df -hT again shows that the LV is mounted on /dn1.

That's it, we did it!

Thank you guys for giving your valuable time to read my article.
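With the LV mounted, the elasticity mentioned earlier could look like the sketch below, which grows the LV and then grows the ext4 filesystem to match. The 5G amount is a hypothetical value, and the sketch assumes the volume group still has at least that much free space. As before, the function only prints the commands.

```shell
# Sketch: grow the mounted LV and its ext4 filesystem.
LV_PATH="/dev/MyVg/MyLv"
GROW_BY="5G"   # hypothetical amount; the VG must have this much free space

extend_plan() {
    echo "lvextend --size +$GROW_BY $LV_PATH"   # enlarge the logical volume
    echo "resize2fs $LV_PATH"                   # grow ext4 to fill the new space
}

extend_plan
```

ext4 can be grown while it is mounted, so the data node does not have to be stopped for this. Shrinking is different: ext4 has to be unmounted before it can be reduced. After extending, hdfs dfsadmin -report should show the larger capacity for the data node.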

Hope it helped you!



Sriramadasu Prasanth Kumar

MLOps | Hybrid Cloud | DevOps | Hadoop | Kubernetes | Data Science | AWS | GCP