Troubleshooting

Is your issue not listed here?

If the troubleshooting page is missing an error you encountered, please report it on GitHub by opening an issue. By doing so, you will help improve the project and help others find the solution to the same problem faster.

General errors

Virtualenv not found

Error

Output: /bin/sh: 1: virtualenv: not found

/bin/sh: 2: ansible-playbook: not found

Explanation

The error indicates that virtualenv is not installed.

Solution

There are many ways to install virtualenv. For all installation options, refer to the official documentation (Virtualenv installation).

For example, virtualenv can be installed using pip.

First install pip.

sudo apt install python3-pip

Then install virtualenv using pip3.

pip3 install virtualenv
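
After installing, it can help to confirm that the tools named in the error output are on the PATH before rerunning the failed command. The following is a minimal sketch; `require_command` is a hypothetical helper name introduced here for illustration and is not part of the project.

```shell
#!/bin/sh
# Hypothetical helper: report whether a command can be found on the PATH.
require_command() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: not found"
        return 1
    fi
}

# Usage (commented out so the snippet has no side effects):
# require_command virtualenv
# require_command ansible-playbook
```

If virtualenv is still reported as not found right after installing it with pip3, the directory where pip places user-level scripts (commonly ~/.local/bin) may be missing from the PATH.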

KVM/Libvirt errors

Failed to connect socket (No such file or directory)

Error

Error: virError(Code=38, Domain=7, Message='Failed to connect socket to '/var/run/libvirt/libvirt-sock': No such file or directory')

Explanation

This error can occur when the libvirt service (libvirtd) is not running.

Solution

Make sure that the libvirt service is running:

sudo systemctl status libvirtd

If the libvirt service is not running, start it:

sudo systemctl start libvirtd

Optional: Start the libvirt service automatically at boot time:

sudo systemctl enable libvirtd
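
The check and the start can be combined so that the service is started only when it is not already active. This is a sketch under the assumption that the host uses systemd; `ensure_service_running` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: start a systemd service only if it is not already active.
ensure_service_running() {
    service="$1"
    if systemctl is-active --quiet "$service"; then
        echo "$service is already running"
    else
        echo "$service is not running, starting it"
        sudo systemctl start "$service"
    fi
}

# Usage:
# ensure_service_running libvirtd
```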

Failed to connect socket (Permission denied)

Error

Error: virError(Code=38, Domain=7, Message='Failed to connect socket to '/var/run/libvirt/libvirt-sock': Permission denied')

Explanation

The error indicates that either the libvirtd service is not running or the current user is not in the libvirt (or kvm) group.

Solution

If the libvirtd service is not running, start it:

sudo systemctl start libvirtd

Add the current user to the libvirt and kvm groups if needed:

# Add current user to groups
sudo adduser $USER libvirt
sudo adduser $USER kvm

# Verify groups are added
id -nG

# Reload the user session (log out and log back in) so the new groups take effect
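
To check the membership for both groups in one go, a small helper along these lines may be useful. It is only a sketch; `user_in_group` is a hypothetical name introduced for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given user belongs to the given group.
user_in_group() {
    user="$1"
    group="$2"
    # id -nG prints the group names of the user, separated by spaces.
    id -nG "$user" | tr ' ' '\n' | grep -qx "$group"
}

# Usage:
# for g in libvirt kvm; do
#     user_in_group "$USER" "$g" || echo "$USER is not in group $g"
# done
```

Passing the user name makes id look the user up in the group database, so its answer can differ from the groups of the current session until the session is reloaded.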

Error creating libvirt domain

Error

Error: Error creating libvirt domain: … Could not open '/tmp/terraform_libvirt_provider_images/image.qcow2': Permission denied')

Explanation

The error indicates that the file cannot be created in the specified location due to missing permissions.

  • Make sure the directory exists.
  • Make sure the directory containing the denied file has appropriate user permissions.
  • Optionally, the QEMU security driver can be disabled.

Solution

Make sure the security_driver in /etc/libvirt/qemu.conf is set to none instead of selinux. This line is commented out by default, so you should uncomment it if needed:

# /etc/libvirt/qemu.conf

...
security_driver = "none"
...

Do not forget to restart the libvirt service after making the changes:

sudo systemctl restart libvirtd
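
To see which security_driver setting is currently in effect, the uncommented lines of the configuration file can be listed. The sketch below assumes the default file location mentioned above; `active_security_driver` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the uncommented security_driver lines of a
# libvirt QEMU configuration file. Returns non-zero if there are none.
active_security_driver() {
    conf="${1:-/etc/libvirt/qemu.conf}"
    grep -E '^[[:space:]]*security_driver[[:space:]]*=' "$conf"
}

# Usage:
# active_security_driver || echo "security_driver is not set explicitly"
```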

Libvirt domain already exists

Error

Error: Error defining libvirt domain: virError(Code=9, Domain=20, Message='operation failed: domain 'your-domain' already exists with uuid '...')

Explanation

The error indicates that the libvirt domain (virtual machine) already exists.

Solution

The domain you are trying to create already exists. Destroy and undefine the existing domain:

virsh destroy your-domain
virsh undefine your-domain

You can verify that the domain was successfully removed:

virsh dominfo --domain your-domain

If the domain was successfully removed, the output should look something like this:

error: failed to get domain 'your-domain'
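
The removal and the verification can be wrapped in one function that is safe to run repeatedly. This is a sketch that uses only the virsh subcommands shown above; `remove_domain` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: remove a libvirt domain if it exists.
remove_domain() {
    domain="$1"
    if ! virsh dominfo --domain "$domain" >/dev/null 2>&1; then
        echo "domain '$domain' does not exist, nothing to remove"
        return 0
    fi
    # destroy fails when the domain is not running, so its result is ignored.
    virsh destroy "$domain" >/dev/null 2>&1 || true
    virsh undefine "$domain"
}

# Usage:
# remove_domain your-domain
```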

Libvirt volume already exists

Error

Error: Error creating libvirt volume: virError(Code=90, Domain=18, Message='storage volume 'your-volume.qcow2' exists already')

and / or

Error: Error creating libvirt volume for cloudinit device cloud-init.iso: virError(Code=90, Domain=18, Message='storage volume 'cloud-init.iso' exists already')

Explanation

The error indicates that the specified volume already exists.

Solution

Volumes created by Libvirt are still attached to the images, which prevents a new volume from being created with the same name. Therefore, these volumes must be removed:

virsh vol-delete cloud-init.iso --pool your_resource_pool

and / or

virsh vol-delete your-volume.qcow2 --pool your_resource_pool

Libvirt storage pool already exists

Error

Error: Error storage pool 'your-pool' already exists

Explanation

The error indicates that the libvirt storage pool already exists.

Solution

Remove the existing libvirt storage pool.

virsh pool-destroy your-pool && virsh pool-undefine your-pool

Failed to apply firewall rules

Error

Error: internal error: Failed to apply firewall rules /sbin/iptables -w --table filter --insert LIBVIRT_INP --in-interface virbr2 --protocol tcp --destination-port 67 --jump ACCEPT: iptables: No chain/target/match by that name.

Explanation

Libvirt was already running when the firewall (usually FirewallD) was started or installed. Therefore, the libvirtd service must be restarted to detect the changes.

Solution

Restart the libvirtd service:

sudo systemctl restart libvirtd

Failed to remove storage pool

Error

Error: error deleting storage pool: failed to remove pool '/var/lib/libvirt/images/local-k8s-cluster-main-resource-pool': Directory not empty

Explanation

The pool cannot be deleted because there are still some volumes in the pool. Therefore, the volumes should be removed before the pool can be deleted.

Solution

  1. Make sure the pool is running.

    virsh pool-start --pool local-k8s-cluster-main-resource-pool
    

  2. List volumes in the pool.

    virsh vol-list --pool local-k8s-cluster-main-resource-pool
    
    #  Name         Path
    # -------------------------------------------------------------------------------------
    #  base_volume  /var/lib/libvirt/images/local-k8s-cluster-main-resource-pool/base_volume
    

  3. Delete listed volumes from the pool.

    virsh vol-delete --pool local-k8s-cluster-main-resource-pool --vol base_volume
    

  4. Destroy and undefine the pool.

    virsh pool-destroy --pool local-k8s-cluster-main-resource-pool
    virsh pool-undefine --pool local-k8s-cluster-main-resource-pool
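
The four steps above can be sketched as a single function. It assumes that volume names contain no whitespace and that the vol-list output has the two header lines shown in step 2; `remove_pool_with_volumes` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: delete every volume in a pool, then remove the pool.
remove_pool_with_volumes() {
    pool="$1"
    # Starting an already active pool fails, so the result is ignored.
    virsh pool-start --pool "$pool" >/dev/null 2>&1 || true
    # Skip the two header lines of the table and take the Name column.
    # Assumption: volume names contain no whitespace.
    virsh vol-list --pool "$pool" | awk 'NR > 2 && NF { print $1 }' |
    while read -r vol; do
        virsh vol-delete --pool "$pool" --vol "$vol" </dev/null
    done
    virsh pool-destroy --pool "$pool"
    virsh pool-undefine --pool "$pool"
}

# Usage:
# remove_pool_with_volumes local-k8s-cluster-main-resource-pool
```

Because this deletes every volume in the pool, review the output of vol-list first and run the function only on pools whose contents are no longer needed.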
    

HAProxy load balancer errors

Random HAProxy (503) service unavailable

Error

HAProxy randomly returns an HTTP 503 (Service Unavailable) error.

Explanation

More than one HAProxy process is listening on the same port.

Solution 1

For example, if an error is thrown when accessing port 80, check which processes are listening on port 80 on the load balancer VM:

netstat -lnput | grep 80

# Proto Recv-Q Send-Q Local Address           Foreign Address   State       PID/Program name
# tcp        0      0 192.168.113.200:80      0.0.0.0:*         LISTEN      1976/haproxy
# tcp        0      0 192.168.113.200:80      0.0.0.0:*         LISTEN      1897/haproxy

If you see more than one process, kill the unnecessary process:

kill 1976

Note: You can kill all HAProxy processes and only one will be automatically recreated.
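
Filtering with a plain grep 80 also matches other ports and PIDs that contain 80. A stricter filter could look like the sketch below, which reads the netstat output on standard input; `haproxy_listeners` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read `netstat -lnput` output on stdin and print the
# PID/Program entries of HAProxy processes listening on the given port.
haproxy_listeners() {
    port="$1"
    awk -v port="$port" '
        # $4 is the local address; the last field is PID/Program name.
        $4 ~ (":" port "$") && $NF ~ /\/haproxy$/ { print $NF }
    '
}

# Usage (on the load balancer VM):
# netstat -lnput | haproxy_listeners 80
```

More than one line of output for the same port corresponds to the situation described above.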

Solution 2

Check that the HAProxy configuration file (config/haproxy/haproxy.cfg) does not contain two frontends bound to the same port.
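
For larger configuration files, repeated ports can be listed with a short script. This is a rough sketch that assumes each bind line has the simple form "bind <address>:<port> [options]"; it compares ports only, so it may also report binds that use the same port on different addresses. `duplicate_bind_ports` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print ports that appear in more than one bind line.
duplicate_bind_ports() {
    cfg="$1"
    # Assumption: bind lines look like "bind <address>:<port> [options]".
    awk '$1 == "bind" { n = split($2, parts, ":"); print parts[n] }' "$cfg" |
        sort | uniq -d
}

# Usage:
# duplicate_bind_ports config/haproxy/haproxy.cfg
```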