“Kernel Panic — Not Syncing: VFS: Unable to mount root fs on unknown-block(0,0)”

After upgrading the kernel and restarting your EC2 server, the instance may fail to come up and the instance status check may show only 1/2 checks passed.

Sahil Sardana
4 min read · Jan 18, 2021

To start troubleshooting, you need to check the instance's System Log by following the steps below:
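If you prefer the command line, you can also pull the same system log with the AWS CLI (the instance ID below is a placeholder for your failing instance):

aws ec2 get-console-output --instance-id i-0123456789abcdef0 --output text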

You might see this error: “Kernel Panic — Not Syncing: VFS: Unable to mount root fs on unknown-block(0,0)”.

This can mean:

  1. Either the initrd declaration is missing from the new kernel’s title statement in /boot/grub/grub.conf (see the example stanza below),
  2. or the initrd/initramfs file itself is missing from the /boot directory. (For more info about the boot process you can refer to my article: https://sahil-sardana2020.medium.com/linux-boot-process-c9ae2f930e99 )
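For illustration, a healthy title stanza in the legacy GRUB config /boot/grub/grub.conf looks roughly like the below (the kernel version and root device are placeholders, not values from the affected server). If the initrd line is missing, or the initramfs file it points to does not exist under /boot, the kernel cannot mount the root filesystem and panics exactly as above:

title Linux (4.14.77-70.59.amzn1.x86_64)
    root (hd0,0)
    kernel /boot/vmlinuz-4.14.77-70.59.amzn1.x86_64 ro root=/dev/xvda1 console=ttyS0
    initrd /boot/initramfs-4.14.77-70.59.amzn1.x86_64.img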

To recover this EC2 machine, we will have to roll back the default kernel pointer to the previous kernel version that was running before the kernel upgrade/server patching. We will have to follow the steps below:

  1. Login to your AWS Console and go to the EC2 Instances section.
  2. Select the server which is failing to start up, go to the Description section of the server, and make a note of the Root Device Volume information, like:

3. Click on /dev/sda1 and you will get info about the volume, like below:

4. Click on the Volume and it will open the Volumes section. In this section, set the Name of the volume to the server name along with the block device name (like /dev/sda1), so that you can easily identify the root volume you need to work on. Also make a note of the Volume ID.
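If you want to double-check from the command line, the root volume ID can also be read from the instance's block device mappings with the AWS CLI (the instance ID is a placeholder):

aws ec2 describe-instances --instance-ids i-0123456789abcdef0 --query "Reservations[].Instances[].BlockDeviceMappings[]" --output table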

5. Now you need another EC2 machine on which you have root access; it should be in the same Availability Zone as the EC2 machine you are recovering, since an EBS volume can only be attached to an instance in the same Availability Zone.

6. Now you need to detach the volume from the non-working EC2 instance as below. Make sure the non-working server is in the stopped state first:
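The same two actions can be done with the AWS CLI if you prefer (the instance and volume IDs are placeholders):

aws ec2 stop-instances --instance-ids i-0123456789abcdef0
aws ec2 detach-volume --volume-id vol-0123456789abcdef0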

7. Once the volume is detached from the non-working server, attach it to the recovery server, which is in the same Availability Zone and on which you have root access. If you do not have such a server, you can quickly set up a new EC2 server with minimal configuration.

8. First identify the Instance ID of the recovery server. For the recovery volume, make sure the status of the volume is showing as available, like below:

9. Now click on Actions >> Attach Volume

You can see that it asks for the Instance ID or Name of the recovery server, which must be in the same Availability Zone.

In Device, you can use the default value (i.e., /dev/xvdf). Please take note of this value.
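Equivalently, with the AWS CLI (the volume and instance IDs are placeholders; /dev/xvdf matches the default device shown above):

aws ec2 attach-volume --volume-id vol-0123456789abcdef0 --instance-id i-0fedcba9876543210 --device /dev/xvdf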

10. Mount the volume.

- Log into your recovery instance.

- Look up the device name of the secondary volume (lsblk).

- Mount your volume (example: sudo mount /dev/xvdf2 /mnt).

- While mounting, you might get the below error:

To troubleshoot this, you can check the kernel's dmesg logs, where you may see the below error:

“XFS Filesystem has duplicate UUID”: this means you cannot mount your XFS partition as-is.

To handle this error, you can mount your XFS filesystem at runtime using the nouuid option with the mount command:

sudo mount -o rw,nouuid /dev/xvdf2 /mnt

11. Confirm that the new volume has been mounted successfully.

-$ df -h
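For reference, the output should now show the recovery volume mounted on /mnt; the device names and sizes below are only an illustration:

Filesystem      Size  Used Avail Use% Mounted on
/dev/xvda1       10G  2.1G  8.0G  21% /
/dev/xvdf2       10G  3.4G  6.7G  34% /mnt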

12. Set up the chroot environment

-$ sudo mount --bind /dev /mnt/dev

-$ sudo mount --bind /proc /mnt/proc

-$ sudo mount --bind /sys /mnt/sys

-$ sudo mount --bind /run /mnt/run

-$ sudo chroot /mnt

13. Ensure the filesystems mounted at /boot and /tmp have free space

- df -h /boot /tmp

14. Check the kernel versions configured in GRUB.

  • awk -F\' '$1=="menuentry " {print i++ " : " $2}' /etc/grub2.cfg
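The output looks something like the below, with one numbered entry per installed kernel (the kernel versions shown here are placeholders, not the ones on your server):

0 : CentOS Linux (3.10.0-1160.49.1.el7.x86_64) 7 (Core)
1 : CentOS Linux (3.10.0-1160.45.1.el7.x86_64) 7 (Core)
2 : CentOS Linux (0-rescue-0123456789abcdef0123456789abcdef) 7 (Core)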

15. Select an older version:

grub2-set-default 1

(where “1” is one of the options provided by the previous command)
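You can verify that the default was recorded by listing the GRUB environment block; it should now show saved_entry pointing at the index (or title) you selected:

grub2-editenv list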

16. Write the Grub configuration:

grub2-mkconfig -o /boot/grub2/grub.cfg

17. Exit the chroot environment:

- Press CTRL+D to leave the chroot environment

- Return to the recovery instance’s root directory:

  • cd /

18. Unmount the volume:

  • umount -fl /mnt

19. Detach the volume from the recovery instance and attach it back to the non-working EC2 instance as /dev/sda1.
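These two actions can again be done from the AWS CLI (the volume and instance IDs are placeholders); note that the device must be /dev/sda1 this time so the instance boots from it:

aws ec2 detach-volume --volume-id vol-0123456789abcdef0
aws ec2 attach-volume --volume-id vol-0123456789abcdef0 --instance-id i-0123456789abcdef0 --device /dev/sda1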

20. Start the non-working server; it should start successfully now.

21. After this, you can try upgrading the kernel once again.
