Troubleshooting common KVM problems


Updated: November 26, 2012

Some time back, I gave you two Xen troubleshooting articles, which discussed some of the more common problems you might encounter when trying to provision your environment using Xen technology. Today, we will do the same thing with KVM.

Personally, I think KVM is far more elegant when it comes to Python vomit verbosity and overall complexity, but you can still sometimes come across ugly errors that will make your life more difficult. This tutorial will teach you how to work around them, how to identify possible configuration and setup conflicts, and a variety of other neat tricks. After me.

Tip 1: Bridge networking interface does not show up

Let's say you want to use a custom bridge device you created, called br0. However, when you try to set up your network and make your guests use the specific interface, it does not show in the dropdown menu in the Virtual Machine Manager. The image below shows the exact opposite, but imagine your desired device is not in the list of available adapters. This could happen, and I have seen this happen, so trust me on this one, citizens of the Internet.

Bridged networking setup

The solution is very simple - manually edit the configuration file for your domain. By default, KVM stores its domain definitions in one of two locations, /etc/kvm/vm or /etc/libvirt/qemu, so you're most likely to find your XML files there. Open the relevant one and manually change the bridged adapter details under <source bridge>. Close the file, start the virtual machine and enjoy.

<interface type='bridge'>
    <source bridge='br0'/>
    <mac address='52:54:00:0d:e6:4a'/>
</interface>
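If you would rather script the change than edit by hand, here is a minimal sketch. The function name set_bridge and the path in the usage note are my own inventions, and the sketch assumes the XML uses single-quoted attributes, as in the snippet above, and that every bridged interface in the file should point at the same bridge.

```shell
#!/bin/sh
# Hypothetical helper: point every <source bridge='...'/> element in a
# domain XML file at the given bridge. Assumes single-quoted attributes.
set_bridge() {
    xml_file=$1
    bridge=$2
    tmp_file=$(mktemp) || return 1
    # Edit into a temporary copy, then write it back over the original
    # so the file keeps its ownership and permissions.
    sed "s|<source bridge='[^']*'/>|<source bridge='${bridge}'/>|" \
        "$xml_file" > "$tmp_file" && cat "$tmp_file" > "$xml_file"
    status=$?
    rm -f "$tmp_file"
    return $status
}
```

You would call it along the lines of set_bridge /etc/libvirt/qemu/machine.xml br0, as root and with the guest shut down. libvirt also ships virsh edit, which opens the same XML in your editor, if you prefer not to touch the files directly.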

Tip 2: Biosdevname & no network

Biosdevname is a utility that tries to assign BIOS-given names to devices, preserving commonality and simplifying the logic of hardware administration, especially with network devices. If you happen to have many of those, with unique capabilities or functions, you will find it harder to identify them by generic ethX names, but if they were given unique strings, you would easily tell apart 1Gbps and 10Gbps adapters and suchlike. Naturally, this is mostly useful for enterprise systems, as people at home hardly ever have this dilemma.

However, the side effect of using biosdevname is that if it's used inside guest operating systems in your KVM-created virtual machines, you might end up without network, as virtualized devices will get physical names that do not quite match.

There are several ways you can work around the problem. One, some versions of the utility are capable of detecting they are being invoked inside a virtual environment and will exit without making any changes. Another one is to pass a kernel argument in the GRUB menu; biosdevname=0 will prevent the utility from running.
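To confirm the kernel argument actually made it to the running guest, you can inspect the kernel command line. A small sketch, assuming a Linux guest where /proc/cmdline is readable; the function name has_biosdevname_off is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a kernel command line string contains
# biosdevname=0 as a separate word.
has_biosdevname_off() {
    case " $1 " in
        *" biosdevname=0 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example use inside the guest (left commented out in this sketch):
# has_biosdevname_off "$(cat /proc/cmdline)" && echo "biosdevname is disabled"
```

The padding with spaces is there so that a longer argument that merely starts with the same text does not count as a match.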

A third option is to hack the udev rules used to assign names to network cards. Here's a very rudimentary example; this one is designed for a machine with a single network adapter, hence you will need more dynamic logic for multiple cards. Or, as the commented text says, you will need to create a separate line for each rule.

Biosdevname rules

You can make a manual change to get the classic assignment:

vi /etc/udev/rules.d/70-persistent-net.rules

Change the NAME string from "eth_biosname" to "ethX" or anything you need. The first rule below is the original, the second one is the edited version:

# PCI device ...
SUBSYSTEM=="net", ACTION=="add", ATTR{type}=="1", KERNEL=="eth*", NAME="eth_biosname"

# PCI device ...
SUBSYSTEM=="net", ACTION=="add", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"
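The same rename can be scripted, which helps if you have many guests to fix. This is only a sketch: the function name rename_nic_rule is made up, and it assumes the name appears in the rules file exactly as NAME="...", with double quotes.

```shell
#!/bin/sh
# Hypothetical helper: replace NAME="<old>" with NAME="<new>" in a udev
# rules file. Assumes the name is written with double quotes.
rename_nic_rule() {
    rules_file=$1
    old_name=$2
    new_name=$3
    tmp_file=$(mktemp) || return 1
    # Edit into a temporary copy, then write it back over the original.
    sed "s|NAME=\"${old_name}\"|NAME=\"${new_name}\"|" \
        "$rules_file" > "$tmp_file" && cat "$tmp_file" > "$rules_file"
    status=$?
    rm -f "$tmp_file"
    return $status
}
```

For instance, rename_nic_rule /etc/udev/rules.d/70-persistent-net.rules eth_biosname eth0, run as root inside the guest, followed by a reboot.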

You will get different rules depending on the virtual hardware you use in your virtual machines. For example, you could opt for Realtek, e1000 or virtio virtual hardware, resulting in other strings. Pay attention and make sure you match the solution to your specific environment.

Tip 3: Domain already exists

What happens if you try to define a new domain and it already exists, except you are having trouble finding its configuration file or declaration? virsh list does not show it, and virt-manager does not show it anywhere either. So what do you do?

virsh define machine.xml
error: Failed to define domain from machine.xml
error: operation failed: domain 'machine' already exists
with uuid 883ab02f-1a67-7430-ef9a-2b59af52210e7

You will have to find the configuration file and delete it, and then restart the libvirtd service.

updatedb
locate machine.xml
rm <full path to machine.xml>
/etc/init.d/libvirtd restart
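If the file is not named after the domain, locate will not help much, so searching by content is an alternative. A rough sketch, with a hypothetical function name, that looks for the domain's <name> element in whichever directories you give it:

```shell
#!/bin/sh
# Hypothetical helper: print every file under the given directories that
# contains <name>DOMAIN</name>. Directories that do not exist are skipped.
find_domain_xml() {
    domain=$1
    shift
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        grep -rl "<name>${domain}</name>" "$dir" 2>/dev/null
    done
    return 0
}

# Example use (left commented out in this sketch):
# find_domain_xml machine /etc/libvirt/qemu /etc/kvm/vm
```

It may also be worth running virsh list --all first, since plain virsh list only shows running domains and the culprit could simply be a shut-off one.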

Tip 4: Internal error Unable to find cgroup

This problem may manifest after restarting libvirtd, like in the example above, or for completely unrelated reasons. A full error message may look like this:

virsh create machine.xml
error: Failed to create domain from machine.xml
error: internal error Unable to find cgroup for machine

The reason for this bug could be related to systemd, as specified in the Bugzilla report, but it could also happen on your pristine machines not using systemd. In most cases, it's a very indelicate race between cgroups and libvirtd, caused by the libvirtd service coming up before cgroups, or by one of the cgroups being deleted or not existing in the first place.

You can resolve the problem by editing the libvirt QEMU driver configuration, /etc/libvirt/qemu.conf. Inside this file, you need to edit the cgroup_controllers directive so it does not list any cgroups, in which case libvirtd will be able to run without them.

cgroup_controllers = [ ]

After this, you will have to restart libvirtd again. Alternatively, you will have to manually create the necessary cgroups and assign the libvirtd process into the relevant subsystem.
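For automated setups, the edit can be sketched as below. The function name is hypothetical, and the sketch assumes the directive is spelled cgroup_controllers, as in libvirt's stock qemu.conf, and sits on a single line, commented out or not. If no such line exists, one is appended.

```shell
#!/bin/sh
# Hypothetical helper: make a qemu.conf-style file list no cgroup
# controllers. Assumes the directive occupies a single line.
disable_cgroup_controllers() {
    conf_file=$1
    pattern='^#\{0,1\}[[:space:]]*cgroup_controllers[[:space:]]*='
    tmp_file=$(mktemp) || return 1
    if grep -q "$pattern" "$conf_file"; then
        # Rewrite the existing (possibly commented-out) directive.
        sed "s/${pattern}.*/cgroup_controllers = [ ]/" "$conf_file" > "$tmp_file"
    else
        # No directive present: append one at the end of the file.
        { cat "$conf_file"; echo 'cgroup_controllers = [ ]'; } > "$tmp_file"
    fi
    cat "$tmp_file" > "$conf_file"
    status=$?
    rm -f "$tmp_file"
    return $status
}
```

Keep a backup copy of the file before you run something like this against the real /etc/libvirt/qemu.conf.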

Tip 5: Virtual machine vanishes from VMM on halt/reboot

This could be a very simple issue, in fact. After you halt or reboot the virtual machine, the virtual machine console closes. Your configurations are in place, but this is a major inconvenience, as you have to intervene in the machine management cycle.

You need to look for the on_reboot clause in the relevant XML file for your virtual machine and make sure the action is set to restart rather than destroy. There you go, that's all there is to it.

<on_poweroff>destroy</on_poweroff>
<on_reboot>restart</on_reboot>
<on_crash>destroy</on_crash>
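If you need to apply this to several domain files, a small sketch of the edit follows. The function name is my own, and it assumes the <on_reboot> element is already present in the file and sits on one line.

```shell
#!/bin/sh
# Hypothetical helper: set the on_reboot action in a domain XML file to
# restart. Does nothing if the element is not present.
set_on_reboot_restart() {
    xml_file=$1
    tmp_file=$(mktemp) || return 1
    # Edit into a temporary copy, then write it back over the original.
    sed 's|<on_reboot>[^<]*</on_reboot>|<on_reboot>restart</on_reboot>|' \
        "$xml_file" > "$tmp_file" && cat "$tmp_file" > "$xml_file"
    status=$?
    rm -f "$tmp_file"
    return $status
}
```

The other two lifecycle elements, on_poweroff and on_crash, are left exactly as they were.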

Tip 6: No reboot action; function is not supported

Again, this is closely related to the previous tip. If you specify reboot instead of restart, you will learn this is an invalid action that KVM cannot execute. However, you will most likely see a rather verbose error:

libvirtError: this function is not supported by the hypervisor: virDomainReboot

If you use restart, all will be well.
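A quick sanity check can catch the typo before the guest ever starts. The sketch below uses a hypothetical function name and compares the on_reboot value against the actions that, to my knowledge, libvirt documents for lifecycle events: destroy, restart, preserve and rename-restart.

```shell
#!/bin/sh
# Hypothetical helper: report whether the on_reboot action in a domain
# XML file is one of the values libvirt documents for lifecycle events.
check_on_reboot() {
    xml_file=$1
    # Extract the text of the first <on_reboot> element, if any.
    action=$(sed -n 's|.*<on_reboot>\([^<]*\)</on_reboot>.*|\1|p' "$xml_file" | head -n 1)
    case "$action" in
        destroy|restart|preserve|rename-restart)
            echo "ok: $action" ;;
        "")
            echo "missing: no on_reboot element found"
            return 1 ;;
        *)
            echo "invalid: $action"
            return 1 ;;
    esac
}
```

Run against a file containing <on_reboot>reboot</on_reboot>, it prints invalid: reboot and returns a non-zero status.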

Tip 7: Bootloader error after installation

Sometimes, you may see a GRUB error, most likely number 15, on first reboot following an installation from an external media source. This can happen if you leave the CD/DVD image attached to the virtual machine and selected as the first boot device in your XML file. The problem is similar to this VirtualBox bug, affecting some of the Linux distributions out there. For example:

root (hd0,1)
Filesystem type is ext2fs, partition type 0x83
kernel /boot/vmlinuz

Error 15: File not found

Press any key to continue...

You can resolve this by detaching the ISO image and rebooting the guest. This time, it should work well. Please note the issue will not manifest if you set your hard disk as the first bootable device, because KVM will automatically skip to the second available source, most likely PXE or CD/DVD, if it cannot find a valid partition table on the disk, which should be the case if you're only installing now.
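To see which device a guest will try first, you can read the boot entries from its XML. A minimal sketch with a hypothetical function name, assuming the boot order is declared with single-quoted <boot dev='...'/> elements, one per line:

```shell
#!/bin/sh
# Hypothetical helper: print the boot devices of a domain XML file in the
# order they are declared, one per line.
boot_order() {
    sed -n "s|.*<boot dev='\([^']*\)'/>.*|\1|p" "$1"
}

# Example use (left commented out in this sketch):
# boot_order /etc/libvirt/qemu/machine.xml
```

If cdrom comes out on the first line and the installation is already done, that is the setup described above.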

And I guess we're done here for today.

More reading

You might be interested in several more KVM articles:

KVM storage and network guides, plus bridged networking

KVM + Virtualbox side-by-side usage howto

KVM cloning guide

Conclusion

Seven tips for the first troubleshooting guide, not bad I think. This tutorial focuses on bridged networking setup, working around biosdevname, cgroups + libvirtd problems, unregistering duplicate domains auto-created by virt-manager, and resolving guest operating system restart woes.

Most of the work is done by editing configuration files and restarting services. We do not rely on the GUI, as any action that can be translated into a command line can then be scripted for easier and fully automated management. Anyhow, I hope you liked it; if you have ideas for the sequel, ping me.

Cheers.
