Regenerate SSH host keys on boot on Ubuntu

As part of preparing your Linux VM to become a template VM, you will delete all SSH host keys inside the /etc/ssh/ folder:

rm -f /etc/ssh/*key*

This is not a problem with CentOS, which checks on every boot whether the host keys exist and recreates them as necessary.

But Ubuntu, being a smartass, does not regenerate keys on boot, so after you delete all the existing SSH host keys, be sure to add the following to your /etc/rc.local file:

# Generate new host keys if the old ones were deleted
test -f /etc/ssh/ssh_host_dsa_key || dpkg-reconfigure openssh-server

Be sure to add these lines before the “exit 0” line 😉
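
For reference, a minimal sketch of how /etc/rc.local could end up looking (assuming an otherwise default, empty rc.local):

#!/bin/sh -e
#
# rc.local - executed at the end of each multiuser runlevel

# Generate new host keys if the old ones were deleted
test -f /etc/ssh/ssh_host_dsa_key || dpkg-reconfigure openssh-server

exit 0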

 

Shrinking the size of qcow2 qemu images


You have provisioned a thin-provisioned virtual disk/image for your guest, with a size of 50 GB.

The image on the host is /var/lib/libvirt/images/aa0d036a-f814-4cd8-991f-d0a0ad21a7d4 = 0 MB currently, since no data has been written from inside the VM.

You install the OS in your VM, write a lot of data to the disk (benchmark the disk via dd?), and then delete that data.

Your current amount of used disk space as seen from inside the VM is 870 MB (CentOS minimal installed).

But the size of the image /var/lib/libvirt/images/aa0d036a-f814-4cd8-991f-d0a0ad21a7d4 on the host is, say, 40 GB – how is that possible?

You want to shrink your image on the host, since you really have only about 1 GB of data inside the VM.
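
You can see the mismatch yourself by comparing the virtual size reported by qemu-img info with the space the file actually occupies on the host (using the example image path from above):

root@HOST# qemu-img info /var/lib/libvirt/images/aa0d036a-f814-4cd8-991f-d0a0ad21a7d4
root@HOST# du -hs /var/lib/libvirt/images/aa0d036a-f814-4cd8-991f-d0a0ad21a7d4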

So:

From inside the VM, fill the disk with zeros – later we will use qemu-img to convert the image, and qemu-img does a nice job of squeezing the zeros out of qcow2 files:

root@vm.local# dd if=/dev/zero of=/root/zeros bs=4M

and let it run until it completely fills the free disk space inside the VM.

Then remove the file and shut down the guest VM:

root@vm.local# rm -f /root/zeros
root@vm.local# shutdown -h now

Then, on the host, we convert the existing image to a new one, and qemu-img will drop all those zeros from our image:

root@HOST# mv /var/lib/libvirt/images/aa0d036a-f814-4cd8-991f-d0a0ad21a7d4 /var/lib/libvirt/images/oldimage
root@HOST# qemu-img convert -O qcow2 /var/lib/libvirt/images/oldimage /var/lib/libvirt/images/aa0d036a-f814-4cd8-991f-d0a0ad21a7d4

This will take a while. When it finishes, remove the old image and check the new size:

root@HOST# rm -f /var/lib/libvirt/images/oldimage
root@HOST#  du -hs aa0d036a-f814-4cd8-991f-d0a0ad21a7d4
871M aa0d036a-f814-4cd8-991f-d0a0ad21a7d4

There you go.
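
As a side note, if you want the image even smaller, qemu-img convert also accepts the -c flag, which compresses the data itself (at the cost of some CPU whenever the guest later reads those blocks) – the same conversion with compression would look like this:

root@HOST# qemu-img convert -c -O qcow2 /var/lib/libvirt/images/oldimage /var/lib/libvirt/images/aa0d036a-f814-4cd8-991f-d0a0ad21a7d4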

CloudStack, Libvirt and KVM issues after hard reboot

CentOS 6.4, CloudStack 4.2

I just had to force a server reboot over IPMI (with a few KVM VMs running), and after the server booted, I could not start some of those VMs from the CloudStack UI.

After looking into the cloudstack-agent log (/var/log/cloudstack/agent/agent.log), I found these suspicious errors:

2013-10-23 01:55:22,127 WARN [kvm.resource.LibvirtComputingResource] (agentRequest-Handler-3:null) LibvirtException
org.libvirt.LibvirtException: operation failed: domain 'i-6-76-VM' already exists with uuid 56225e11-d3e1-33b8-a18f-3763d3481192
 at org.libvirt.ErrorHandler.processError(Unknown Source)
 at org.libvirt.Connect.processError(Unknown Source)
 at org.libvirt.Connect.domainCreateXML(Unknown Source)

It seems like qemu did not do its cleanup, so you need to manually delete all the .xml files in the /etc/libvirt/qemu/ folder:

rm /etc/libvirt/qemu/*.xml

Restart libvirtd, and after that you will be able to start your VMs again.
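
For completeness, on CentOS 6 restarting libvirtd is just:

service libvirtd restart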

CentOS KVM and CEPH – client side setup

Once you have deployed the almighty CEPH storage, you will want to be able to actually use it (RBD).

Before we begin, some notes:

Current CEPH version: 0.67 (“dumpling”).
OS: CentOS 6.4 x86_64 (running some VMs on KVM, basic CentOS qemu packages, nothing custom)

Since the CEPH RBD kernel module was first introduced with kernel 2.6.34 (and the current RHEL/CentOS kernel is 2.6.32), that means we need a newer kernel.

So, one of the options for a new kernel is a 3.x kernel from elrepo.org:

rpm --import http://elrepo.org/RPM-GPG-KEY-elrepo.org
rpm -Uvh http://elrepo.org/elrepo-release-6-5.el6.elrepo.noarch.rpm
yum --enablerepo=elrepo-kernel install kernel-ml # will install 3.11.latest, stable, mainline
     # or
yum --enablerepo=elrepo-kernel install kernel-lt # will install 3.0.latest, long term supported

If you want that new kernel to boot by default, edit /etc/grub.conf, change default=1 to default=0, and reboot.
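
If you prefer to do that non-interactively, something like the following should work (assuming the file really contains default=1, i.e. the freshly installed kernel is the first entry), and after the reboot you can check which kernel you are running:

sed -i 's/^default=1/default=0/' /etc/grub.conf
reboot
uname -r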

That’s for the Kernel side.

If you want to use qemu with RBD, you will also need newer qemu packages – CEPH does provide some “untested”, “use at your own risk” packages, which are the RHEL/CentOS versions just with RBD support added – well, I am running them on a production cloud, and all seems fine so far…

wget http://ceph.com/packages/ceph-extras/rpm/rhel6/x86_64/qemu-kvm-0.12.1.2-2.355.el6.2.cuttlefish.x86_64.rpm
wget http://ceph.com/packages/ceph-extras/rpm/rhel6/x86_64/qemu-img-0.12.1.2-2.355.el6.2.cuttlefish.x86_64.rpm
wget http://ceph.com/packages/ceph-extras/rpm/rhel6/x86_64/qemu-kvm-tools-0.12.1.2-2.355.el6.2.cuttlefish.x86_64.rpm

Once that is done, in order to actually install those CEPH-provided binaries, we first need to download and install CEPH itself, that is, the required libraries librados and librbd:

rpm --import 'https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc'
rpm -Uvh http://ceph.com/rpm-dumpling/el6/noarch/ceph-release-1-0.el6.noarch.rpm
yum install ceph
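
As a quick sanity check that the client libraries and tools landed (the package names below are the ones the ceph.com el6 repository uses, so treat them as an assumption):

rpm -q librados2 librbd1
ceph -v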

If all is fine (I will not talk here about creating the /etc/ceph.conf file and keyring for the client – maybe another post), then we are ready to install the CEPH version of the qemu packages.

First, we need to remove the original CentOS 6 packages, then install the new ones:

rpm -e --nodeps qemu-img
rpm -e --nodeps qemu-kvm
rpm -e --nodeps qemu-kvm-tools
rpm -Uvh qemu-*

And… that’s it. If you run qemu-img | grep "Supported formats", you will see rbd as being one of those.
You should shut down and then start all your existing VMs again, in order for them to load the new qemu binaries.
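
If you want a quick end-to-end test that qemu-img can really talk to the cluster, here is a minimal sketch (it assumes a pool called rbd and a working /etc/ceph/ceph.conf plus client keyring on this host):

qemu-img create -f rbd rbd:rbd/qemu-test-image 1G
qemu-img info rbd:rbd/qemu-test-image
rbd rm rbd/qemu-test-image    # clean up the test image afterwards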

System VMs do not start after upgrade to CloudStack 4.2

Folks from CloudStack thought it was fun to release CloudStack 4.2 with incomplete/incorrect documentation, so after you upgrade CloudStack to 4.2, none of your system VMs are going to work once you restart them…

The bug is described here: https://issues.apache.org/jira/browse/CLOUDSTACK-4826 with some proposed solutions.

A full solution is very well explained here: http://cloud.kelceydamage.com/cloudfire/blog/2013/10/08/conquering-the-cloudstack-4-2-dragon-kvm/

In case the URL goes down, here is pretty much a copy/paste from the original author; all credit goes to him:

 

Step 1): Mount your secondary storage to your management server with the following:

mount -t nfs {ip_of_storage_server}:{path_to_secondary_storage} /mnt

Step 2): Download the latest version of the templates:

/usr/share/cloudstack-common/scripts/storage/secondary/cloud-install-sys-tmplt -m /mnt -u http://download.cloud.com/templates/4.2/systemvmtemplate-2013-06-12-master-kvm.qcow2.bz2 -h kvm -F

Step 3): Find the name of the old template in the database:
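
These and the later SQL statements are run from the MySQL client on the management server; the exact user and password depend on your install, so the login line below is only an assumption:

mysql -u cloud -p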

USE cloud;
SELECT install_path FROM template_host_ref WHERE template_id=3;

Sample output:

template/tmpl/1/3//44d12c68-82f6-3d46-8073-9db0e835b94c.qcow2

Step 4): Write down the name of the .qcow2 file that you are given in the previous step.

Step 5): From the management server, locate the new template on the mounted secondary storage:

cd /mnt/template/tmpl/1/3/

Step 6): Rename the .qcow2 file in that folder to the name we copied from the database (see the sketch after the next step).

Step 7): Edit the template.properties file in the same folder and change both instances of the old name to the new one, so that it points at the file name from the database.
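
A sketch of steps 6 and 7 – here 44d12c68-82f6-3d46-8073-9db0e835b94c is the sample name from Step 3, and {name_of_downloaded_template} is a placeholder for whatever freshly downloaded .qcow2 file you actually find in that folder:

mv {name_of_downloaded_template}.qcow2 44d12c68-82f6-3d46-8073-9db0e835b94c.qcow2
sed -i 's/{name_of_downloaded_template}/44d12c68-82f6-3d46-8073-9db0e835b94c/g' template.properties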

Step 8): We need to reset the cached template in the database:

UPDATE template_spool_ref SET download_pct='0',download_state='NOT_DOWNLOADED',state='NULL',local_path='NULL',install_path='NULL',template_size='0' WHERE template_id='3';

Step 9): Unmount your secondary storage from the management server:

umount /mnt

Step 10): Disable the zone from the management UI.

Step 11): Update the database records for your system VMs to be ‘Stopped’. You will need to do this for both the Secondary Storage VM and the Console Proxy. The ID of a system VM is the number in its name; for example, s-34-VM would have an ID of '34'.

UPDATE vm_instance SET state='Stopped' where id='{id_of_system_vm}';
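
So for the example s-34-VM mentioned above, that would be:

UPDATE vm_instance SET state='Stopped' where id='34';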

Step 12): From the management UI, destroy both the system VMs.

Step 13): Once both system VMs have been destroyed, re-enable the zone.

Step 14): Tail the management log and watch for the VMs to start.

tail -f /var/log/cloudstack/management/management-server.log