libvirt-design

Overview

Libvirt is a collection of software that provides a convenient way to manage virtual machines and other virtualization functionality, such as storage and network interface management. The software includes an API library, a daemon (libvirtd), and a command-line utility (virsh).

A primary goal of libvirt is to provide a single way to manage multiple virtualization providers and hypervisors.

The libvirt project:

  • is a toolkit to manage virtualization platforms
  • is accessible from C, Python, Perl, Go and more
  • is licensed under open source licenses
  • supports KVM, QEMU, Xen, Virtuozzo, VMware ESX, LXC, and more
  • targets Linux, FreeBSD, Windows and macOS

libvirt

File layout

.
|-- daemon(daemon that accepts RPC calls and dispatches them; remote dispatcher)
|-- gnulib
| |-- lib(copy of gnulib from GNU, kept for stability)
| `-- tests(test cases for gnulib)
|-- include
| `-- libvirt(header files of libvirt)
|-- po(internationalization: translations such as zh_CN)
|-- src(generic virtualization-level API, which calls the driver API)
| |-- access(---driver for access control, write, read etc)
| |-- admin(--admin program)
| |-- bhyve(BSD hypervisor)
| |-- conf(---Parse xml cpu, memory, device etc)
| |-- cpu(---specific vcpu setting arm, x86 etc)
| |-- esx(VMware ESX hypervisor)
| |-- hyperv(MS hypervisor)
| |-- interface(---driver for iface)
| |-- libxl(Xen hypervisor with libxenlight tool)
| |-- locking(--lockd daemon: because locks are held by a standalone daemon, the main libvirtd daemon can be restarted without risk of losing the locks that enforce VM disk mutual exclusion)
| |-- logging(--logd daemon: logging runs as a separate daemon (it handles logs from VM consoles, not libvirtd daemon logs), so the main libvirtd daemon can be restarted without risk of losing logs)
| |-- lxc(kind of linux container system)
| |-- network(---driver for network)
| |-- node_device(---driver for device)
| |-- nwfilter(---driver for network filter, iptables etc)
| |-- openvz(kind of Linux container system)
| |-- phyp(IBM Power Hypervisor)
| |-- qemu(qemu fully emulated, or qemu-kvm hypervisor)
| |-- remote(---client-side remote driver, which makes RPC calls to libvirtd)
| |-- rpc(---rpc framework)
| |-- secret(---driver for secrets for storing and retrieving secret information)
| |-- security
| |-- storage(---driver for storage)
| |-- test(---driver for testing)
| |-- uml(user mode linux hypervisor)
| |-- util
| |-- vbox(VirtualBox hypervisor)
| |-- vmware(VMware Workstation hypervisor)
| |-- vmx(VMware VMX file parser)
| |-- vz(Parallels Cloud Server Virtualization Solution)
| |-- xen(xen hypervisor)
| |-- xenapi(xen api)
| `-- xenconfig(xen config)
`-- tools(virsh tool)
    `-- nss(connect with nss service)

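NSS (Name Service Switch) is a mechanism for configuring name services. Its configuration file lists, for each database, the sources used to obtain that information.

Example NSS configuration file (/etc/nsswitch.conf):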
passwd: files ldap
shadow: files
group: files ldap

hosts: dns nis files

ethers: files nis
netmasks: files nis
networks: files nis
protocols: files nis
rpc: files nis
services: files nis

automount: files
aliases: files
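
To make the "database: sources" layout concrete, here is a minimal Python sketch that reads such a file into a mapping from each database to its ordered list of sources. The function name `parse_nsswitch` is hypothetical and only for illustration. The sketch does not interpret action items such as `[NOTFOUND=return]`; they would simply be kept as tokens.

```python
def parse_nsswitch(text):
    """Map each database name to its ordered list of sources."""
    databases = {}
    for line in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        name, sources = line.split(":", 1)
        # Sources are whitespace-separated and tried in the order listed.
        databases[name.strip()] = sources.split()
    return databases


if __name__ == "__main__":
    sample = "passwd: files ldap\nhosts: dns nis files\n"
    for database, sources in parse_nsswitch(sample).items():
        print(database, "->", sources)
```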


The virtlockd daemon is a single-purpose binary that focuses exclusively on acquiring and holding locks on behalf of running virtual machines. It is designed to offer a low-overhead, portable locking scheme that can be used out of the box on virtualization hosts with minimal configuration. It uses the POSIX fcntl advisory locking capability to hold locks, which is supported by the majority of commonly used filesystems.

virtlockd is a lock manager implementation for libvirt. It is designed to prevent you from starting two virtual machines (e.g. on different nodes in your cluster) that are backed by the same writable disk image, which can cause disk corruption. It uses plain fcntl-based file locking, so it is well suited to setups where disk images are shared over NFS.
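
The fcntl advisory locking that virtlockd relies on can be tried out with a few lines of Python. This is only a sketch of the underlying POSIX mechanism, not virtlockd's own code. The helper names `try_lock` and `child_can_lock` are hypothetical, and the example assumes a Unix-like host. Because POSIX locks belong to a process, the conflict is shown with a forked child.

```python
import fcntl
import os
import tempfile


def try_lock(path):
    """Try to take an exclusive advisory lock on the whole file.

    Returns the open file object on success (keep it open to hold the
    lock), or None if another process holds a conflicting lock.
    """
    f = open(path, "a+b")
    try:
        # LOCK_NB makes the call fail immediately instead of blocking.
        fcntl.lockf(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        f.close()
        return None
    return f


def child_can_lock(path):
    """Fork a child process and report whether it could lock the file."""
    pid = os.fork()
    if pid == 0:
        # Child: exit status 0 if the lock was acquired, 1 otherwise.
        os._exit(0 if try_lock(path) is not None else 1)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) == 0


if __name__ == "__main__":
    fd, image = tempfile.mkstemp()  # stands in for a disk image
    os.close(fd)
    held = try_lock(image)
    print("second process can lock while held:", child_can_lock(image))
    held.close()  # closing the file releases the lock
    print("second process can lock after release:", child_can_lock(image))
    os.unlink(image)
```

The locks are advisory: they only protect against other processes that also ask for the lock, which is why a single lock manager taking them on behalf of every virtual machine is useful.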

RPC

libvirt uses a simple, variable-length, packet-based RPC protocol. All structured data within packets is encoded using the XDR standard. A program defines a set of procedures that it supports. The procedures can support call+reply method invocation, asynchronous events, and generic data streams. For details, refer to the libvirt RPC documentation.

RPC Frame = Len + Data

|~~~   Packet 1   ~~~|~~~   Packet 2   ~~~|~~~  Packet 3    ~~~|~~~

+-------+------------+-------+------------+-------+------------+...
| n=U32 | (n-4) * U8 | n=U32 | (n-4) * U8 | n=U32 | (n-4) * U8 |
+-------+------------+-------+------------+-------+------------+...

|~ Len ~|~ Data ~|~ Len ~|~ Data ~|~ Len ~|~ Data ~|~
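
The framing in the diagram can be sketched in Python as follows. The length word is a 4-byte unsigned integer that counts itself, so each packet carries n-4 bytes of data. XDR integers are big-endian, hence the `>I` format. The function names `frame` and `split_frames` are hypothetical and chosen only for this illustration.

```python
import struct


def frame(data):
    """Prefix data with a 4-byte big-endian length that counts itself."""
    return struct.pack(">I", len(data) + 4) + data


def split_frames(buf):
    """Split a byte stream into the data parts of its complete packets.

    Returns (packets, leftover), where leftover holds the bytes of a
    trailing packet that has not fully arrived yet.
    """
    packets = []
    offset = 0
    while len(buf) - offset >= 4:
        (n,) = struct.unpack_from(">I", buf, offset)
        if n < 4:
            raise ValueError("length word must include its own 4 bytes")
        if len(buf) - offset < n:
            break  # incomplete packet; wait for more bytes
        packets.append(buf[offset + 4:offset + n])
        offset += n
    return packets, buf[offset:]


if __name__ == "__main__":
    stream = frame(b"first") + frame(b"second") + frame(b"third")[:6]
    done, pending = split_frames(stream)
    print(done, pending)
```

Because the receiver knows the size of every packet up front, it can read from a stream socket without needing any delimiter inside the data.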

Data = Header + Payload

+-------+-------------+---------------....---+
| n=U32 | 6*U32 | (n-(7*4))*U8 |
+-------+-------------+---------------....---+

|~ Len ~|~ Header ~|~ Payload .... ~|

Header
The header contains 6 fields, encoded as signed/unsigned 32-bit integers.

+---------------+
| program=U32 |
+---------------+
| version=U32 |
+---------------+
| procedure=S32 |
+---------------+
| type=S32 |
+---------------+
| serial=U32 |
+---------------+
| status=S32 |
+---------------+
  • program
    This is an arbitrarily chosen number that uniquely identifies the “service” running over the stream (for example, the ‘remote’ service of the RPC server).

  • version
    This is the version number of the program, by convention starting from ‘1’. When an incompatible change is made to a program, the version number is incremented. Ideally both versions will then be supported on the wire in parallel for backwards compatibility.

  • procedure
    This is an arbitrarily chosen number that uniquely identifies the method call (a function provided by the server) or event associated with the packet. By convention, procedure numbers start from 1 and are assigned monotonically thereafter.

  • type
    This can be one of the following enumeration values

    • call: invocation of a method call
    • reply: completion of a method call
    • event: an asynchronous event
    • stream: control info or data from a stream
  • serial
    This is a number that starts from 1 and increases each time a method call packet is sent. A reply or stream packet will have a serial number matching the original method call packet serial. Events always have the serial number set to 0.

  • status
    This can be one of the following enumeration values

    • ok: a normal packet. This is always set for method calls or events. For replies it indicates successful completion of the method. For streams it indicates confirmation of the end of file on the stream.
    • error: for replies this indicates that the method call failed and error information is being returned. For streams this indicates that not all data was sent and the stream has aborted.
    • continue: for streams this indicates that further data packets will follow.
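
Putting the length word, the six header fields and the payload together, a packet can be encoded and decoded with Python's `struct` module. This is a sketch under stated assumptions: the numeric values given to the type and status enumerations, the program number used in the demo, and all function names are illustrative and not taken from the text above. That a reply repeats the call's program, version and procedure is likewise an assumption; the text only states that the serial matches.

```python
import struct
from collections import namedtuple

# program=U32, version=U32, procedure=S32, type=S32, serial=U32, status=S32,
# all big-endian as in XDR: 6 * 4 = 24 bytes.
HEADER = struct.Struct(">IIiiIi")

Header = namedtuple("Header", "program version procedure type serial status")

# Assumed numeric values for the enumerations (illustrative only).
TYPE_CALL, TYPE_REPLY, TYPE_EVENT, TYPE_STREAM = 0, 1, 2, 3
STATUS_OK, STATUS_ERROR, STATUS_CONTINUE = 0, 1, 2


def encode_packet(header, payload=b""):
    """Build length word + header + payload; the length counts all three."""
    n = 4 + HEADER.size + len(payload)
    return struct.pack(">I", n) + HEADER.pack(*header) + payload


def decode_packet(packet):
    """Split one complete packet into (Header, payload bytes)."""
    (n,) = struct.unpack_from(">I", packet, 0)
    if n != len(packet) or n < 4 + HEADER.size:
        raise ValueError("bad packet length")
    header = Header._make(HEADER.unpack_from(packet, 4))
    return header, packet[4 + HEADER.size:]


def make_reply(call, status=STATUS_OK):
    """Reply header that keeps the call's serial so the caller can match it."""
    return call._replace(type=TYPE_REPLY, status=status)


if __name__ == "__main__":
    # 0x1234 is a made-up program number used only for this demo.
    call = Header(0x1234, 1, 1, TYPE_CALL, 1, STATUS_OK)
    header, payload = decode_packet(encode_packet(call, b"\x00\x00\x00\x2a"))
    print(header, payload, make_reply(header))
```

With a 4-byte payload the length word is 4 + 24 + 4 = 32, which matches the (n-(7*4)) payload size shown in the diagram.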