Troubleshooting
Is your issue not listed here?
If the troubleshooting page is missing an error you encountered, please report it on GitHub by opening an issue. By doing so, you will help improve the project and help others find the solution to the same problem faster.
General errors
Virtualenv not found
Error
Output: /bin/sh: 1: virtualenv: not found
/bin/sh: 2: ansible-playbook: not found
Explanation
The error indicates that virtualenv is not installed.
Solution
There are many ways to install virtualenv. For all installation options, refer to the official documentation (Virtualenv installation).
For example, virtualenv can be installed using pip.
First install pip.
sudo apt install python3-pip
Then install virtualenv using pip3.
pip3 install virtualenv
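To confirm the fix, it can help to check that both commands named in the error output are now on the PATH. The following is a minimal sketch; the helper name missing_tools is hypothetical and not part of the project.

```shell
# Hypothetical helper: print each given command that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Check the two commands named in the error output; no output means both were found.
missing_tools virtualenv ansible-playbook
```

If virtualenv is still reported after installation, one possible cause is that pip placed it in ~/.local/bin, which is not always on the PATH.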
KVM/Libvirt errors
Failed to connect socket (No such file or directory)
Error
Error: virError(Code=38, Domain=7, Message='Failed to connect socket to '/var/run/libvirt/libvirt-sock': No such file or directory')
Explanation
This error can occur when the libvirt service is not running.
Solution
Make sure that the libvirt service is running:
sudo systemctl status libvirtd
If the libvirt service is not running, start it:
sudo systemctl start libvirtd
Optional: Start the libvirt service automatically at boot time:
sudo systemctl enable libvirtd
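After starting the service, the socket named in the error message should exist. A minimal sketch of that check follows; the helper name socket_status is hypothetical, and the socket path is the one from the error above.

```shell
# Hypothetical helper: report whether a path is an existing Unix socket.
socket_status() {
  if [ -S "$1" ]; then
    echo "present"
  else
    echo "missing"
  fi
}

# Socket path taken from the error message above.
socket_status /var/run/libvirt/libvirt-sock
```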
Failed to connect socket (Permission denied)
Error
Error: virError(Code=38, Domain=7, Message='Failed to connect socket to '/var/run/libvirt/libvirt-sock': Permission denied')
Explanation
The error indicates that either the libvirtd service is not running or the current user is not in the libvirt (or kvm) group.
Solution
If the libvirtd service is not running, start it:
sudo systemctl start libvirtd
Add the current user to the libvirt and kvm groups if needed:
# Add current user to groups
sudo adduser $USER libvirt
sudo adduser $USER kvm
# Verify groups are added
id -nG
# Reload user session
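Group changes only take effect in a new login session, so the current shell may still lack the new groups. The sketch below reports what the current session sees; the helper name in_group is hypothetical.

```shell
# Hypothetical helper: succeed if group $1 appears in the space-separated list $2.
in_group() {
  case " $2 " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Report membership for the groups mentioned above, as seen by the current session.
for group in libvirt kvm; do
  if in_group "$group" "$(id -nG)"; then
    echo "$group: ok"
  else
    echo "$group: missing in this session"
  fi
done
```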
Error creating libvirt domain
Error
Error: Error creating libvirt domain: … Could not open '/tmp/terraform_libvirt_provider_images/image.qcow2': Permission denied')
Explanation
The error indicates that the file cannot be created in the specified location due to missing permissions.
- Make sure the directory exists.
- Make sure the directory of the file that is being denied has appropriate user permissions.
- Optionally, the QEMU security driver can be disabled.
Solution
Make sure the security_driver in /etc/libvirt/qemu.conf is set to none instead of selinux. This line is commented out by default, so you should uncomment it if needed:
# /etc/libvirt/qemu.conf
...
security_driver = "none"
...
Do not forget to restart the libvirt service after making the changes:
sudo systemctl restart libvirtd
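Because the setting is commented out by default, it is easy to edit the file and leave the line inactive. The following sketch prints the value that is actually in effect. The helper name active_security_driver is hypothetical, and it assumes the simple quoted form security_driver = "..." shown above, not a list of drivers.

```shell
# Hypothetical helper: print the value of the last uncommented
# security_driver = "..." line in the given file (empty if there is none).
active_security_driver() {
  sed -n 's/^[[:space:]]*security_driver[[:space:]]*=[[:space:]]*"\([^"]*\)".*$/\1/p' "$1" | tail -n 1
}

# Usage (path from the section above):
#   active_security_driver /etc/libvirt/qemu.conf
```

An empty result suggests the line is still commented out.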
Libvirt domain already exists
Error
Error: Error defining libvirt domain: virError(Code=9, Domain=20, Message='operation failed: domain 'your-domain' already exists with uuid '...')
Explanation
The error indicates that the libvirt domain (virtual machine) already exists.
Solution
Remove the existing domain before creating it again. virsh destroy stops the domain if it is running, and virsh undefine removes its definition:
virsh destroy your-domain
virsh undefine your-domain
You can verify that the domain was successfully removed:
virsh dominfo --domain your-domain
If the domain was successfully removed, the output should look something like this:
error: failed to get domain 'your-domain'
Libvirt volume already exists
Error
Error: Error creating libvirt volume: virError(Code=90, Domain=18, Message='storage volume 'your-volume.qcow2' exists already')
and / or
Error: Error creating libvirt volume for cloudinit device cloud-init.iso: virError(Code=90, Domain=18, Message='storage volume 'cloud-init.iso' exists already')
Explanation
The error indicates that the specified volume already exists.
Solution
Volumes created by Libvirt are still attached to the images, which prevents a new volume from being created with the same name. Therefore, these volumes must be removed:
virsh vol-delete cloud-init.iso --pool your_resource_pool
and / or
virsh vol-delete your-volume.qcow2 --pool your_resource_pool
Libvirt storage pool already exists
Error
Error: Error storage pool 'your-pool' already exists
Explanation
The error indicates that the libvirt storage pool already exists.
Solution
Remove the existing libvirt storage pool.
virsh pool-destroy your-pool && virsh pool-undefine your-pool
Failed to apply firewall rules
Error
Error: internal error: Failed to apply firewall rules /sbin/iptables -w --table filter --insert LIBVIRT_INP --in-interface virbr2 --protocol tcp --destination-port 67 --jump ACCEPT: iptables: No chain/target/match by that name.
Explanation
Libvirt was already running when the firewall (usually FirewallD) was started or installed. Therefore, the libvirtd service must be restarted to detect the changes.
Solution
Restart the libvirtd service:
sudo systemctl restart libvirtd
Failed to remove storage pool
Error
Error: error deleting storage pool: failed to remove pool '/var/lib/libvirt/images/local-k8s-cluster-main-resource-pool': Directory not empty
Explanation
The pool cannot be deleted because there are still some volumes in the pool. Therefore, the volumes should be removed before the pool can be deleted.
Solution
- Make sure the pool is running.
virsh pool-start --pool local-k8s-cluster-main-resource-pool
- List volumes in the pool.
virsh vol-list --pool local-k8s-cluster-main-resource-pool
#  Name         Path
# -------------------------------------------------------------------------------------
#  base_volume  /var/lib/libvirt/images/local-k8s-cluster-main-resource-pool/base_volume
- Delete listed volumes from the pool.
virsh vol-delete --pool local-k8s-cluster-main-resource-pool --vol base_volume
- Destroy and undefine the pool.
virsh pool-destroy --pool local-k8s-cluster-main-resource-pool
virsh pool-undefine --pool local-k8s-cluster-main-resource-pool
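When a pool holds many volumes, the list and delete steps above can be combined into a loop. This is a minimal sketch: the helper names volume_names and delete_all_volumes are hypothetical, and the parsing assumes the two-line header shown in the vol-list output above and volume names without spaces.

```shell
# Hypothetical helper: read `virsh vol-list` output on stdin and print volume names.
# Assumes a two-line header (column titles and a dashed rule) before the entries.
volume_names() {
  awk 'NR > 2 && NF { print $1 }'
}

# Hypothetical helper: delete every volume in the given pool, one by one.
delete_all_volumes() {
  virsh vol-list --pool "$1" | volume_names | while read -r vol; do
    virsh vol-delete --pool "$1" --vol "$vol"
  done
}

# Usage (destructive, so it is left commented out):
#   delete_all_volumes local-k8s-cluster-main-resource-pool
```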
HAProxy load balancer errors
Random HAProxy (503) bad gateway
Error
HAProxy intermittently returns an HTTP 503 (Service Unavailable) error.
Explanation
More than one HAProxy process is listening on the same port.
Solution 1
For example, if an error is thrown when accessing port 80, check which processes are listening on port 80 on the load balancer VM:
netstat -lnput | grep 80
# Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
# tcp 0 0 192.168.113.200:80 0.0.0.0:* LISTEN 1976/haproxy
# tcp 0 0 192.168.113.200:80 0.0.0.0:* LISTEN 1897/haproxy
If you see more than one process, kill the unnecessary process:
kill 1976
Note: You can kill all HAProxy processes and only one will be automatically recreated.
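Note that grep 80 also matches other ports and PIDs that contain "80", such as 8080. A stricter filter can be sketched as follows; the helper name haproxy_pids_on_port is hypothetical, and it assumes the netstat column layout shown above.

```shell
# Hypothetical helper: read `netstat -lnput` output on stdin and print the PID of
# every haproxy process whose local address ends in the given port.
haproxy_pids_on_port() {
  awk -v port="$1" '
    $4 ~ (":" port "$") && $NF ~ /\/haproxy$/ {
      split($NF, parts, "/")
      print parts[1]
    }'
}

# Usage on the load balancer VM:
#   netstat -lnput | haproxy_pids_on_port 80
```

More than one PID in the output indicates the situation described above.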
Solution 2
Check the HAProxy configuration file (config/haproxy/haproxy.cfg) and make sure it does not contain two frontends bound to the same port.
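The check can be approximated with a short script. This is a rough sketch: the helper name duplicate_bind_ports is hypothetical, it assumes one address per bind line, and it also flags the same port bound on different addresses, which may be intentional.

```shell
# Hypothetical helper: print every port that appears in more than one
# `bind` line of the given HAProxy configuration file.
duplicate_bind_ports() {
  awk '$1 == "bind" { port = $2; sub(/.*:/, "", port); print port }' "$1" | sort | uniq -d
}

# Usage (path from the section above):
#   duplicate_bind_ports config/haproxy/haproxy.cfg
```

No output means no port is used by more than one bind line.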