Solving the network, storage, and VM issues that cause the timeout error when backing up from Proxmox to CIFS

Question:

How to resolve the timeout error when backing up large VMs from Proxmox to CIFS storage?

I inherited a Proxmox PVE 5.4-13 environment that I need to migrate to Hyper-V as per my boss’s preference. I have successfully backed up and converted a small VM using qemu-img, but I encounter a timeout error when I try to do the same for two larger VMs. The error occurs at 99% of the backup process, when I use a CIFS storage on a file server as the backup destination. The error message is:

2024-01-31 21:56:36 ERROR: VM 107 qmp command 'query-backup' failed - got timeout

I have tried to consolidate multiple disks into one, and to use compressed backups, but neither of these solutions worked. I have limited Linux and Proxmox knowledge, so I am looking for some guidance on how to troubleshoot and fix this issue.

Answer:

If you are trying to migrate your virtual machines (VMs) from Proxmox to Hyper-V, you might encounter a timeout error when backing up large VMs to a CIFS storage. This error can be frustrating, as it usually happens at the end of the backup process, and prevents you from converting the backup files to VHD format using qemu-img. In this article, I will explain the possible causes of this error, and suggest some solutions that might work for you.

What causes the timeout error?

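The timeout error occurs when vzdump, the backup tool built into Proxmox Virtual Environment (PVE), does not get an answer from the VM's QEMU process over the QEMU Machine Protocol (QMP). QMP is a JSON-based protocol that PVE uses to control and monitor its VMs. On a PVE host it runs over a local Unix socket rather than over the network. The QMP command 'query-backup' is used to check the status of the backup job, and if no response arrives within a certain time limit, vzdump reports a timeout error. In practice this typically means the QEMU process is busy or blocked, for example while it waits for the last writes to a slow backup target to complete, which would also explain why the error tends to appear near the end of the backup.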
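To make the exchange concrete, here is a minimal Python sketch of a QMP-style request with a timeout. It assumes newline-delimited JSON over an already connected socket. The function name qmp_execute is made up for illustration, and this is not the code that vzdump itself runs.

```python
import json
import socket


def qmp_execute(sock, command, timeout=5.0):
    """Send one QMP command and return the parsed reply.

    Raises TimeoutError if no data arrives within `timeout` seconds.
    The timeout applies to each socket operation, not to the whole call.
    """
    sock.settimeout(timeout)
    sock.sendall(json.dumps({"execute": command}).encode() + b"\n")
    buf = b""
    try:
        while True:
            # Collect bytes until at least one complete line is buffered.
            while b"\n" not in buf:
                chunk = sock.recv(4096)
                if not chunk:
                    raise ConnectionError("QMP socket closed")
                buf += chunk
            line, _, buf = buf.partition(b"\n")
            if not line.strip():
                continue
            msg = json.loads(line)
            # Replies carry "return" or "error"; other messages
            # (greeting, asynchronous events) are skipped.
            if "return" in msg or "error" in msg:
                return msg
    except socket.timeout:
        raise TimeoutError(
            f"qmp command '{command}' failed - got timeout"
        ) from None
```

On a PVE host the per-VM QMP sockets are usually found under /var/run/qemu-server/ (an assumption worth verifying on your version). A real session also starts with a greeting from QEMU and a qmp_capabilities handshake before other commands are accepted.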

Several factors can keep QEMU from answering in time, such as:

  • Network issues: If the network connection between the PVE host and the CIFS file server is slow, unstable, or interrupted, writes to the backup target can stall, and the QEMU process stalls with them. This can happen if a firewall, a VPN, or a VLAN boundary sits between the PVE host and the file server, or if your network hardware is overloaded or faulty.
  • Storage issues: If the CIFS storage that you are using as the backup destination is slow, unreliable, or incompatible, the backup process might take longer than expected, or fail to write the backup files correctly. This can happen if the share is mounted with an old SMB protocol version, or if your file server is configured with insufficient permissions, restrictive quotas, or security settings that slow down or block large writes.
  • VM issues: If the VM that you are backing up is running a heavy workload, has multiple disks, or has a large amount of RAM, the backup process might consume more resources and time than usual, or encounter errors while the snapshot is created or the guest file system is frozen. This is more likely with applications that write to disk constantly, such as databases, web servers, or mail servers.
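A first step in telling these causes apart is to time a few TCP connections from the PVE host to the file server. The sketch below is only an illustration: it assumes the share is served over SMB on TCP port 445, and the function name tcp_connect_times is hypothetical.

```python
import socket
import time


def tcp_connect_times(host, port=445, attempts=5, timeout=3.0):
    """Return connect times in seconds; None marks a failed attempt."""
    results = []
    for _ in range(attempts):
        start = time.monotonic()
        try:
            # Open and immediately close a TCP connection to the server.
            with socket.create_connection((host, port), timeout=timeout):
                results.append(time.monotonic() - start)
        except OSError:
            # Refused, unreachable, or timed out.
            results.append(None)
    return results
```

Calling tcp_connect_times("fileserver.example.lan") with your own server name (the name here is a placeholder) gives a rough picture: failed attempts or widely varying times point toward the network, while consistently fast connects suggest looking at storage or the VM instead.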
How to fix the timeout error?

Depending on the cause of the timeout error, you might need to try different solutions. Here are some suggestions that might help:

  • Network solutions: Check the network connection between the PVE host and the CIFS file server, and make sure that it is fast and stable. You can use tools such as ping, traceroute, or iperf to test latency and throughput. Also check any firewall, VPN, or VLAN settings on the path, and make sure that they allow SMB traffic to the file server (normally TCP port 445). QMP itself runs over a local socket on the PVE host, so it does not need a firewall rule. You might need to adjust the firewall rules, the VPN configuration, or the VLAN routing to improve connectivity.
  • Storage solutions: Check the CIFS storage that you are using as the backup destination, and make sure that it is reliable, compatible, and accessible. You can use tools such as smbclient, mount, or df to test the CIFS connection and the available space. Also check the SMB protocol version used for the mount, the permissions, the quotas, and the security settings, and make sure that they support writing large files. You might need to mount the share with a newer SMB version, change the permissions, increase the quotas, or review features on the file server (for example, real-time antivirus scanning) that might interfere with the backup process.
  • VM solutions: Check the VM that you are backing up, and try to run the backup when it is not under heavy load. You can use tools such as top, htop, or free to monitor performance and resource usage. Also check which applications are running on the VM, and whether they write to disk constantly. You might need to stop or reduce the workload before backing up. Since the VM is being migrated anyway, shutting it down and backing it up while it is stopped is also worth trying.
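For the storage check, a rough write test against the mounted share can show whether the destination keeps up. The following Python sketch is an assumption-laden illustration, not a benchmark: the function name write_throughput is hypothetical, and it measures a single sequential write followed by a flush to the server.

```python
import os
import time


def write_throughput(directory, size_mb=256, block_mb=4):
    """Write test data into `directory`, fsync it, and return MB/s."""
    path = os.path.join(directory, "throughput-test.tmp")
    # Random data, so compression on the way cannot inflate the result.
    block = os.urandom(block_mb * 1024 * 1024)
    blocks = max(1, size_mb // block_mb)
    start = time.monotonic()
    try:
        with open(path, "wb") as f:
            for _ in range(blocks):
                f.write(block)
            f.flush()
            # Force the data out of the local cache before stopping the clock.
            os.fsync(f.fileno())
        elapsed = time.monotonic() - start
    finally:
        # Remove the test file even if the write failed part-way.
        if os.path.exists(path):
            os.remove(path)
    return (blocks * block_mb) / max(elapsed, 1e-9)
```

Run it on the PVE host against the directory where the CIFS storage is mounted (PVE normally mounts it under /mnt/pve/, but verify the path on your system). If the figure is very low, or the call hangs for a long time at the fsync step, the storage or the network path to it is the likely culprit.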
Conclusion

The timeout error when backing up large VMs from Proxmox to CIFS storage is a common problem that can be caused by various factors, such as network, storage, or VM issues. To fix this error, you need to identify the root cause, and apply the appropriate solution. By following the suggestions in this article, you might be able to resolve the timeout error, and successfully migrate your VMs from Proxmox to Hyper-V.
