qemu-kvm
Overview
In this article, we cover qemu-kvm without libvirt: how to start a VM by running qemu-kvm directly, and related topics.
Simulated device
qemu-kvm can emulate serial, block and network devices inside the VM based on the virtio driver; the emulated virtio serial ports appear under /sys/class/virtio-ports/ in the guest. For each emulated device, two sides need to be set up from the command line: the back end on the host, and the virtio front end in the VM.
```shell
$ qemu-kvm \
  -chardev socket,id=charch0,path=/var/run/xagent/vm-HZVsuboAJh-test/xagent.sock \
  -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charch0,id=channel0,name=agent.channel.0 \
  -serial unix:/var/run/agent/vm-HZVsuboAJh-test/console.sock,server,nowait \
  -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x2
```
Device Front End
A device front end is how a device is presented to the guest. The type of device presented should match the hardware that the guest operating system is expecting to see.
```shell
# check supported front-end devices and the specific options for each type
```
A front end is often paired with a back end, which describes how the host’s resources are used in the emulation.
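The check mentioned above can be done with the binary's built-in help (assuming a typical qemu-kvm build):

```shell
# list every device front end this qemu binary can emulate
qemu-kvm -device help
# list the configurable properties of one device type
qemu-kvm -device virtio-serial-pci,help
```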
Device Back End
The back end describes how the data from the emulated device will be processed by QEMU. The configuration of the back end is usually specific to the class of device being emulated. For example, serial devices will be backed by a --chardev, which can redirect the data to a file, a socket, or some other system. Storage devices are handled by --blockdev, which specifies how blocks are handled, for example being stored in a qcow2 file or accessing a raw host disk partition. Back ends can sometimes be stacked to implement features like snapshots.
While the choice of back end is generally transparent to the guest, there are cases where features will not be reported to the guest if the back end is unable to support them.
Device Buses
Most devices will exist on a BUS of some sort. Depending on the machine model you choose (-M foo), a number of buses will have been automatically created. In most cases the BUS a device is attached to can be inferred; for example, PCI devices are generally allocated automatically to the next free address of the first PCI bus found. However, in complicated configurations you can explicitly specify what bus (bus=ID) a device is attached to, along with its address (addr=N).
Some devices, for example a PCI SCSI host controller, will add additional buses to the system that other devices can be attached to. A hypothetical chain of devices might look like:
-device foo,bus=pci.0,addr=0,id=foo -device bar,bus=foo.0,addr=1,id=baz
which would be a bar device (with the ID of baz) attached to the first foo bus (foo.0) at address 1. The foo device which provides that bus is itself attached to the first PCI bus (pci.0).
serial device backend (chardev)
QEMU character devices use the format below for the back end; for more options refer to the qemu chardev options documentation.
-chardev backend,id=id[,mux=on|off][,options]
The backend is one of: null, socket, udp, file, pipe, console, serial, pty, stdio, tty, parallel, and more. The chosen back end determines the applicable options; each type has its own set of options.
All devices must have an id, which can be any string up to 127 characters long. It is used to uniquely identify this device in other command line directives.
A character device may be used in multiplexing mode by multiple front ends. Specify mux=on to enable this mode; it is disabled by default. A multiplexer is a "1:N" device: the "1" end is the specified chardev back end, and the "N" end is the various parts of QEMU that can talk to a chardev.
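A classic mux use, taken from the QEMU documentation, shares a single stdio chardev between the serial port and the monitor:

```shell
-chardev stdio,mux=on,id=char0 \
-mon chardev=char0,mode=readline \
-serial chardev:char0
```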
Example
```shell
-chardev file,id=mydev0,path=/tmp/test.s \
```
To use virtio serial in the guest, you first add a virtio serial PCI device (a hub) to the PCI bus; under this virtio serial device you can then create serial and console ports.
virtio-serial-pci == virtio-serial
```shell
-device virtio-serial-pci,id=virtio_serial_pci0 \
```
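Putting the pieces together, a hub plus one port backed by a host-side unix socket might look like this (IDs, names and paths are illustrative):

```shell
-chardev socket,id=ch0,path=/tmp/port0.sock,server,nowait \
-device virtio-serial-pci,id=virtio_serial_pci0 \
-device virtserialport,bus=virtio_serial_pci0.0,chardev=ch0,name=my.port.0
```

Inside the guest, the port then shows up under /sys/class/virtio-ports/ with the given name.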
Socket option
-chardev socket,id=id[,TCP options or unix options][,server=on|off][,wait=on|off][,telnet=on|off][,websocket=on|off][,reconnect=seconds][,tls-creds=id][,tls-authz=id]
- Create a two-way stream socket, which can be either a TCP or a unix socket. A unix socket will be created if path is specified. Behaviour is undefined if TCP options are specified for a unix socket.
- server=on|off specifies that the socket shall be a listening socket.
- wait=on|off specifies that QEMU should not block waiting for a client to connect to a listening socket.
- telnet=on|off specifies that traffic on the socket should interpret telnet escape sequences.
- websocket=on|off specifies that the socket uses WebSocket protocol for communication.
- reconnect sets the timeout for reconnecting on non-server sockets when the remote end goes away. qemu will delay this many seconds and then attempt to reconnect. Zero disables reconnecting, and is the default.
- tls-creds requests enablement of the TLS protocol for encryption, and specifies the id of the TLS credentials to use for the handshake. The credentials must be previously created with the -object tls-creds argument.
- tls-authz provides the ID of the QAuthZ authorization object against which the client's x509 distinguished name will be validated. This object is only resolved at time of use, so it can be deleted and recreated on the fly while the chardev server is active. If missing, it will default to denying access.
TCP and unix socket options are given below:
TCP options: port=port[,host=host][,to=to][,ipv4=on|off][,ipv6=on|off][,nodelay=on|off]
- host for a listening socket specifies the local address to be bound. For a connecting socket it specifies the remote host to connect to. host is optional for listening sockets; if not specified it defaults to 0.0.0.0.
- port for a listening socket specifies the local port to be bound. For a connecting socket it specifies the port on the remote host to connect to. port can be given as either a port number or a service name. port is required.
- to is only relevant to listening sockets. If it is specified, and port cannot be bound, QEMU will attempt to bind to subsequent ports up to and including to until it succeeds. to must be specified as a port number.
- ipv4=on|off and ipv6=on|off specify that either IPv4 or IPv6 must be used. If neither is specified the socket may use either protocol.
- nodelay=on|off disables the Nagle algorithm.
unix options: path=path[,abstract=on|off][,tight=on|off]
- path specifies the local path of the unix socket. path is required.
- abstract=on|off specifies the use of the abstract socket namespace, rather than the filesystem. Optional, defaults to false.
- tight=on|off sets the socket length of abstract sockets to their minimum, rather than the full sun_path length. Optional, defaults to true.
Pty
-chardev pty,id=id
- Create a new pseudo-terminal on the host and connect to it. pty does not take any options.
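When started with a pty chardev, QEMU reports the allocated slave path on startup; you can then attach to it from the host with any terminal tool (the pts number varies per run):

```shell
# QEMU prints something like: char device redirected to /dev/pts/3 (label charserial0)
# then attach from the host, e.g.:
# screen /dev/pts/3
```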
Block device (backend)
USB
QEMU can emulate a PCI UHCI, OHCI, EHCI or XHCI USB controller. You can plug in virtual USB devices; QEMU will automatically create and connect virtual USB hubs as necessary to connect multiple USB devices.
XHCI controller
QEMU has XHCI host adapter support. The XHCI hardware design is much more virtualization-friendly than EHCI and UHCI, so XHCI emulation uses fewer resources (especially CPU). If your guest supports XHCI (which should be the case for any operating system released around 2010 or later), we recommend using it: qemu -device qemu-xhci (or qemu -device nec-usb-xhci).
XHCI supports USB 1.1, USB 2.0 and USB 3.0 devices, so this is the only controller you need. With only a single USB controller (and therefore only a single USB bus) present in the system, there is no need to use the bus= parameter when adding USB devices.
EHCI controller
The QEMU EHCI Adapter supports USB 2.0 devices. It can be used either standalone or with companion controllers (UHCI, OHCI) for USB 1.1 devices
. The companion controller setup is more convenient to use because it provides a single USB bus supporting both USB 2.0 and USB 1.1 devices
EHCI standalone
When running EHCI in standalone mode you can add UHCI or OHCI controllers for USB 1.1 devices too. Each controller creates its own bus, though, so there are two completely separate USB buses: one USB 1.1 bus driven by the UHCI controller and one USB 2.0 bus driven by the EHCI controller. Devices must be attached to the correct controller manually.
EHCI companion
The UHCI and OHCI controllers can attach to a USB bus created by EHCI as companion controllers. This is done by specifying the masterbus and firstport properties. masterbus specifies the bus name the controller should attach to; firstport specifies the first port the controller should attach to, which is needed since usually one EHCI controller with six ports has three UHCI companion controllers with two ports each.
A companion controller setup means multiple USB host controllers (EHCI/OHCI/UHCI) wired to the same physical connector, such that a device, depending on a characteristic like its speed, is handled by a different controller even when plugged into the same connector. A port on the EHCI bus can therefore handle both USB 2.0 and USB 1.1 devices, but from the system's point of view you still see several controllers, each taking one PCI address.
Bus selection for USB devices
XHCI only (supports USB 1.1, USB 2.0 and USB 3.0), a single USB bus (recommended):

```shell
# no need to set bus=xx as there is only one usb controller
-device qemu-xhci \
-device usb-tablet
```

EHCI bus (USB 2.0) + UHCI/OHCI (USB 1.1): two separate buses, so USB devices must set bus=x manually:

```shell
# the USB 1.1 (uhci/ohci controller) bus will carry the name usb-bus.0
# the usb2.0 (ehci controller) bus will carry the name ehci.0
# the usb3.0 (xhci controller) bus will carry the name usb1.0
# bus= must be set for each usb device as there are two separate buses.
# The '-usb' switch makes qemu create the UHCI controller as part of the PIIX3 chipset; its USB 1.1 bus carries the name "usb-bus.0".
-usb \
-device usb-ehci,id=ehci \
-device usb-tablet,bus=usb-bus.0 \
-device usb-storage,bus=ehci.0,drive=usbstick
```

Companion controllers, a single USB bus:

```shell
# usb2.0 controller ehci with six ports
# usb1.1 uhci controllers (two ports each) attached to ehci
-device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x3.0x7 \
-device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x3 \
-device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x3.0x1 \
-device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x3.0x2 \
-device usb-tablet
```

```shell
# inside guest
$ lspci
00:03.0 USB controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #1 (rev 03)
00:03.1 USB controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #2 (rev 03)
00:03.2 USB controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #3 (rev 03)
00:03.7 USB controller: Intel Corporation 82801I (ICH9 Family) USB2 EHCI Controller #1 (rev 03)
```

USB devices can be connected with the -device usb-... command line option or the device_add monitor command. Available devices include:
usb-mouse
usb-tablet
usb-kbd (standard USB keyboard)
usb-audio
…
USB Hub
```shell
# Plugging a hub into UHCI port 2 works like this:
```
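The comment above comes from the QEMU USB documentation, where the full example looks like this; the dotted port path for devices behind the hub is the documented addressing scheme:

```shell
-device usb-hub,bus=usb-bus.0,port=2
# a device plugged into the hub then uses a dotted port path:
-device usb-tablet,bus=usb-bus.0,port=2.1
```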
network dev
In order to use a network device, you need to set up the two sides, front end and back end, but different back-end types are set up in different ways; vhost-user and tap, say, use different mechanisms. Taking the tap interface as an example, you have to:
- Create a bridge (ovs bridge or linux bridge).
- Create a tap interface yourself, or let qemu create it automatically.
- Bring the tap interface up and add it to the bridge, either manually from the command line or by providing /etc/qemu-ifup, which is called by qemu automatically.
- Bring the tap interface down and remove it from the bridge, either manually from the command line or by providing /etc/qemu-ifdown, which is called by qemu automatically.
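A minimal helper script might look like this sketch; qemu invokes it with the tap interface name as $1, and the bridge name br0 is an assumption:

```shell
# sketch of /etc/qemu-ifup (bridge name br0 is an assumption)
ip link set "$1" up
ip link set "$1" master br0
```

/etc/qemu-ifdown mirrors it with `ip link set "$1" nomaster` followed by `ip link set "$1" down`.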
There are several ways to set up network devices. In short, -net is the legacy option; -netdev was introduced to solve issues with -net; and the newest option, -nic, available since QEMU 2.12, is the easiest way to set up an interface.
- The -net option can create either a front end or a back end (but has disadvantages compared to -nic).
- -netdev can only create a back end; use -device for the front end.
- A single occurrence of -nic will create both a front end and a back end.
NOTE
If you use libvirt, all these operations are done by libvirt itself.
linux bridge
```shell
# create a linux bridge, or an ovs bridge if ovs is installed, or use the docker bridge (docker0), which is a linux bridge
```
Tap port with linux bridge example
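As a sketch (bridge, tap and netdev names are illustrative):

```shell
# create a bridge and a tap interface, then attach the tap to the bridge
ip link add br0 type bridge && ip link set br0 up
ip tuntap add dev tap0 mode tap
ip link set tap0 up
ip link set tap0 master br0
# hand the pre-created tap to qemu (script=no stops qemu calling /etc/qemu-ifup)
qemu-kvm ... \
  -netdev tap,id=n1,ifname=tap0,script=no,downscript=no \
  -device virtio-net-pci,netdev=n1
```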
ovs bridge
```shell
# show ovs bridge
```
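With Open vSwitch the equivalent inspection and attachment commands might be (bridge and port names illustrative):

```shell
ovs-vsctl show                  # show ovs bridges and their ports
ovs-vsctl add-br ovsbr0         # create a bridge
ovs-vsctl add-port ovsbr0 tap0  # attach an existing tap interface
```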
Kernel and rootfs
Change vm setting runtime
The QEMU Monitor Protocol (QMP) is a JSON-based protocol which allows applications to communicate with a QEMU instance. There are several ways to talk to a QEMU instance:
- the virsh/libvirt way, using 'qemu-monitor-command' provided by the virsh command
- if the qemu instance was not started by libvirt, or virsh is not installed, talk to the qmp socket directly:
  - use telnet or nc over a tcp qmp socket, with JSON format
  - use nc or socat over a unix qmp socket, with JSON format; qemu-shell offers a human-friendly way, since it is a python tool that does the JSON encoding for you
The same applies to HMP (human monitor protocol), which creates a monitor socket for talking.
Qemu parameters
- -monitor dev redirects the monitor to char device 'dev'; this creates a monitor socket.
- -qmp dev is like -monitor but opened in 'control' mode; this creates a qmp socket. It is equivalent to -monitor chardev=mon0,mode=control (paired with a matching -chardev with id=mon0).
```shell
# ================================== QMP socket ==================================
```
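For example, a guest can expose both a QMP control socket and a human monitor socket (paths are illustrative):

```shell
-qmp unix:/var/run/qmp.sock,server,nowait \
-monitor unix:/var/run/monitor.sock,server,nowait
```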
NOTE
- Both sockets (if the monitor is in control mode) support QMP (JSON based) and HMP (human monitor protocol); they provide different commands for similar purposes.
- The outputs differ: QMP gives more detail, with JSON output; HMP output is formatted for humans.
- A monitor socket in readline mode only supports human commands, and its output is formatted.
Suggestion
With virsh/libvirt
- Use virsh qemu-monitor-command
Without virsh/libvirt
- For CLI troubleshooting only, run the monitor socket in readline mode, which provides many subcommands.
- For programmatic use, run the monitor socket in control mode, or use a qmp socket; you still have tools like qemu-shell which provide a CLI with many subcommands. qemu-shell is similar to qemu-monitor-command.
```shell
# without qmp-shell, when the -nographic option is in use, you can switch to the monitor console
# by pressing Ctrl-a then c, and back to the guest console by pressing Ctrl-a then c again.
$ ./qmp-shell /var/run/qmp.sock
Welcome to the QMP low-level shell!
Connected to QEMU 2.12.0
(QEMU) query-block
{"return": [{"locked": false, "tray_open": false, "io-status": "ok", "qdev": "/machine/unattached/device[22]", "removable": true, "device": "ide1-cd0", "type": "unknown"}, {"device": "floppy0", "type": "unknown", "qdev": "/machine/unattached/device[15]", "locked": false, "removable": true}, {"device": "sd0", "type": "unknown", "locked": false, "removable": true}]}

# HMP format
$ ./qmp-shell -H /var/run/qmp.sock
Welcome to the HMP shell!
Connected to QEMU 2.12.0
(QEMU) info block
ide1-cd0: [not inserted]
    Attached to: /machine/unattached/device[22]
    Removable device: not locked, tray closed
floppy0: [not inserted]
    Attached to: /machine/unattached/device[15]
    Removable device: not locked, tray closed
sd0: [not inserted]
    Removable device: not locked, tray closed
(QEMU)

# pretty-printed JSON output
$ ./qmp-shell -p /var/run/qmp.sock
Welcome to the QMP low-level shell!
Connected to QEMU 2.12.0
(QEMU) query-block
{
    "return": [
        {
            "locked": false,
            "tray_open": false,
            "io-status": "ok",
            "qdev": "/machine/unattached/device[22]",
            "removable": true,
            "device": "ide1-cd0",
            "type": "unknown"
        },
        {
            "device": "floppy0",
            "type": "unknown",
            "qdev": "/machine/unattached/device[15]",
            "locked": false,
            "removable": true
        },
        {
            "device": "sd0",
            "type": "unknown",
            "locked": false,
            "removable": true
        }
    ]
}
(QEMU)
```

Most QMP commands (command output of query-commands):
```shell
"blockdev-add"
"chardev-remove"
"chardev-add"
"query-cpu-definitions"
"query-machines"
"device-list-properties"
"change-vnc-password"
"nbd-server-stop"
"nbd-server-add"
"nbd-server-start"
"query-block-jobs"
"query-balloon"
"query-migrate-capabilities"
"migrate-set-capabilities"
"query-migrate"
"query-command-line-options"
"query-uuid"
"query-name"
"query-spice"
"query-vnc"
"query-mice"
"query-status"
"query-kvm"
"query-pci"
"query-cpus"
"query-blockstats"
"query-block"
"query-chardev"
"query-events"
"query-commands"
"query-version"
"human-monitor-command"
"qmp_capabilities"
"expire_password"
"set_password"
"block_set_io_throttle"
"block_passwd"
"query-fdsets"
"remove-fd"
"add-fd"
"closefd"
"getfd"
"set_link"
"balloon"
"block_resize"
"netdev_del"
"netdev_add"
"client_migrate_info"
"migrate_set_downtime"
"migrate_set_speed"
"query-migrate-cache-size"
"migrate-set-cache-size"
"migrate_cancel"
"migrate"
"cpu-add"
"cpu"
"device_del"
"device_add"
"system_powerdown"
"system_reset"
"system_wakeup"
```
QMP by virsh qemu-monitor-command
```shell
# 6095 is the domain id
```
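Typical invocations, using the domain id from the comment above, might look like this:

```shell
# HMP command via libvirt
virsh qemu-monitor-command 6095 --hmp "info block"
# raw QMP, pretty-printed
virsh qemu-monitor-command 6095 '{"execute":"query-block"}' --pretty
```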
QMP over tcp socket
```shell
# run your qemu instance: -qmp tcp:127.0.0.1:12345,server,nowait
```
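A hand-driven session over the TCP socket might look like the sketch below; QMP requires the qmp_capabilities handshake before any other command is accepted:

```shell
$ nc 127.0.0.1 12345
{"QMP": {"version": {...}, "capabilities": []}}
{"execute": "qmp_capabilities"}
{"return": {}}
{"execute": "query-status"}
{"return": {"status": "running", ...}}
```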
QMP over unix socket
```shell
$ /usr/libexec/qemu-kvm -kernel ./kernel -initrd ./initramfs.img -nographic -append "console=ttyS0" \
  -qmp unix:/var/run/qmp.sock,server,nowait -serial mon:stdio
```
disk image
Different hypervisor software uses different image formats; here is a list of them.
- VDI is the native format of VirtualBox. Other virtualization software generally doesn't support VDI.
- VMDK is developed by and for VMware, but VirtualBox also supports it. This format might be the best choice if you want wide compatibility with other virtualization software; it has a smaller disk size than qcow2.
- VHD is the native format of Microsoft Virtual PC. Windows Server 2012 introduced VHDX as the successor to VHD, but VirtualBox does not support VHDX.
- QCOW is the old original version of the qcow format. It has been superseded by qcow2, which VirtualBox does not support.
- QED was an abandoned enhancement of qcow2. QEMU advises against using it.
qemu-img is a tool which supports converting one image format to another, useful if you want to move a VM between hypervisors.
Operation
qemu-img allows you to create, convert and modify images offline. It can handle all image formats supported by QEMU.
```shell
$ qemu-img create -f qcow2 /var/lib/libvirt/images/disk1.img 10G
```
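Beyond create, the common offline operations are sketched below (file names are illustrative):

```shell
qemu-img info disk1.img                            # format, virtual/actual size
qemu-img convert -f vmdk -O qcow2 a.vmdk a.qcow2   # convert between formats
qemu-img resize disk1.img +5G                      # grow the virtual size
```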
NOTE: make sure to make a copy of the disk before any operation.
backing file (qcow2)
In essence, QCOW2 (QEMU Copy-On-Write) gives you the ability to create a base image (also called a backing file) and several 'disposable' copy-on-write overlay disk images on top of it. Backing files and overlays are extremely useful to rapidly instantiate thin-provisioned virtual machines (more on this below), and especially useful in development & test environments, where one can quickly revert to a known state and discard the overlay. They can also be used to start 100 virtual machines from a common backing image, thus saving space.
use case
```shell
# base <- sn1 <- sn2 <- sn3
```
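The chain in the comment can be built with qemu-img; -b sets the backing file and -F its format (file names are illustrative, and -F assumes a reasonably recent qemu-img):

```shell
qemu-img create -f qcow2 -b base.qcow2 -F qcow2 sn1.qcow2
qemu-img create -f qcow2 -b sn1.qcow2  -F qcow2 sn2.qcow2
qemu-img create -f qcow2 -b sn2.qcow2  -F qcow2 sn3.qcow2
qemu-img info --backing-chain sn3.qcow2   # inspect the whole chain
```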
FAQ
Boot with disk/diskless
```shell
# root filesystem is in an ext2 "hard disk"
```
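As a sketch, both boot styles look like this (kernel, initramfs and disk paths are illustrative):

```shell
# root filesystem on an ext2 "hard disk"
qemu-kvm -kernel ./kernel -initrd ./initramfs.img \
  -hda rootfs.ext2 -append "root=/dev/sda console=ttyS0" -nographic

# diskless: the initramfs itself is the root filesystem
qemu-kvm -kernel ./kernel -initrd ./initramfs.img \
  -append "console=ttyS0" -nographic
```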
how to terminate qemu process from console
```shell
# must have -serial mon:stdio, otherwise Ctrl+a then x will not work
```

Press Ctrl+a, then x.
```shell
$ virt-install --boot kernel=./kernel,initrd=./initramfs.img,kernel_args="console=ttyS0,115200n8" \
  --name $VM_NAME --memory 500 --vcpus 1 --disk none --os-type linux --graphics none --network default
```

Press Ctrl+] to quit virt-install.
how to start qemu with vnc
Add the vnc parameter with a display ID; the qemu process will listen on the corresponding TCP port. Display ID 0 maps to port 5900, so ID 88 means the qemu process will listen on 5988.
```shell
$ id=88
```
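The display-to-port mapping is just an offset of 5900, which you can sanity-check in the shell:

```shell
id=88
port=$((5900 + id))
echo "$port"   # the port qemu listens on for -vnc :$id
```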
how to mount raw disk
```shell
# create a raw disk by dd or qemu-img create
```
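A common approach is loop-mounting at the partition's byte offset; the start sector used here (2048) is an assumption you would read from fdisk -l:

```shell
start_sector=2048              # from: fdisk -l disk.img
offset=$((start_sector * 512)) # 512-byte sectors
echo "$offset"
# sudo mount -o loop,offset=$offset disk.img /mnt
```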
after entering a qmp command, no output
This probably means qmp.sock is connected to another client, as one qmp.sock can talk to only one client. If multiple clients need to connect to the qemu instance, create multiple qmp sockets when starting it, like this:
```shell
$ /usr/libexec/qemu-kvm -kernel ./kernel -initrd ./initramfs.img -nographic -append "console=ttyS0" \
  -qmp unix:/var/run/qmp1.sock,server,nowait -qmp unix:/var/run/qmp2.sock,server,nowait -serial mon:stdio
```
difference between -net, -netdev and -nic
In short, -net is the legacy option; -netdev was introduced to solve issues with -net; and the newest option, -nic, available since QEMU 2.12, is the easiest way to set up an interface.
- The -net option can create either a front end or a back end (but has disadvantages compared to -nic).
- -netdev can only create a back end; use -device for the front end.
- A single occurrence of -nic will create both a front end and a back end.
Suggestion
- The new -nic option gives you an easy and quick way to configure the networking of your guest.
- For more detailed configuration, e.g. when you need to tweak details (like queue size, buffers, etc.) of the emulated NIC hardware, use -device together with -netdev.
- The -net option should be avoided these days unless you really want to configure a set-up with a hub between the front ends and back ends.
-nic is like a wrapper around a -netdev and -device pair: easy to use, but less control.
-net (legacy option)
QEMU's initial way of configuring the network for the guest was the -net option. The emulated NIC and the host back end are not directly connected: they are both connected to an emulated hub, by default vlan 0 (called a "vlan" in older versions of QEMU). Therefore, if you start QEMU with -net nic,model=e1000 -net user -net nic,model=virtio -net tap for example, you get a setup where all the front ends and back ends are connected together via a hub (vlan).
The front end is -net nic,model=xyz (e.g. -net nic,model=virtio) and the back end is -net user or -net tap (-net user being the SLIRP back end).
That means the e1000 NIC also gets the network traffic from the virtio-net NIC and both host back ends. This can be solved by giving each NIC its own hub; for example, -net nic,model=e1000,vlan=0 -net user,vlan=0 -net nic,model=virtio,vlan=1 -net tap,vlan=1 moves the virtio-net NIC and the "tap" back end to a second hub (with ID #1). Please note that the "vlan" parameter was dropped in QEMU v3.0, since the term was rather confusing (it is not related to IEEE 802.1Q, for example) and caused a lot of misconfigurations in the past.
-netdev
To configure a network connection where the emulated NIC is directly connected to a host network back-end, without a hub in between
, the well-established solution is to use the -netdev option for the back-end, together with -device for the front-end.
-netdev user,id=n1 -device e1000,netdev=n1 -netdev tap,id=n2 -device virtio-net,netdev=n2
Now while -netdev together with -device provides a very flexible and extensive way to configure a network connection, there are still two drawbacks with this option pair which prevented the legacy -net option from being deprecated completely:
- The -device option can only be used for pluggable NICs. Boards (e.g. embedded boards) which feature an on-board NIC cannot be configured with -device yet, so -net nic,netdev= must be used here instead.
- In some cases the -net option is easier to use (less to type). For example, assuming you want to set up a "tap" network connection and your default scripts /etc/qemu-ifup and -ifdown are already in place, it's enough to type -net nic -net tap to start your guest. To do the same with -netdev, you always have to specify an ID, too, for example: -netdev tap,id=n1 -device e1000,netdev=n1.
-nic
Looking at the disadvantages listed above, users could benefit from a convenience option that:
- is easier to use (and shorter to type) than -netdev <type>,id=<id> -device <dev>,netdev=<id>
- can be used to configure on-board / non-pluggable NICs, too
- does not place a hub between the NIC and the host back-end.
This is where the new -nic option kicks in: it configures both the guest's NIC hardware and the host back end in one go. Instead of -netdev tap,id=n1 -device e1000,netdev=n1 you can simply type -nic tap,model=e1000. To get a list of available NIC models, you can simply run QEMU with -nic model=help. Besides being easier to use, the -nic option can also configure on-board NICs: for machines that have on-board NICs, the first -nic option configures the first on-board NIC, the second -nic option configures the second on-board NIC, and so forth.
qemu-kvm and qemu-kvm-ev
qemu-kvm and qemu-kvm-ev are usually built from the same src.rpm. Some newer or advanced virtualization features are implemented in qemu-kvm-ev and cannot be backported to qemu-kvm for compatibility reasons. Also, qemu-kvm-ev recently carries a newer qemu version than the one provided by qemu-kvm, since qemu-kvm-ev is rebuilt from Red Hat Enterprise Virtualization.
why does qemu need nvdimm?
Really fast writes, particularly interesting for:
- In-memory databases – get persistence for free*!
- Databases – transaction logs
- File & storage systems – frequently updated metadata
why is the PCI address (in guest) not the same as we set?
Most of the time each PCI device will show up in the guest OS with a PCI address that matches the one set by the user on the command line, but that is not guaranteed to happen, and will in fact not be the case in all but the simplest scenarios. Refer to Qemu pci address.
reduce qcow file size on host
The virt-sparsify utility is what we want to use when dealing with a qcow2 image, which by default uses thin provisioning, and we want to make space that was previously allocated in the disk image but is no longer used available again on the host.

```shell
$ virt-sparsify --in-place disk.qcow2
```
how to enable IOMMU for VM
Enable the IOMMU for the VM, so that we can start a nested VM with a passthrough device.

```shell
$ qemu-kvm -machine q35,accel=kvm -cpu host -device intel-iommu ...
```

This is like enabling the IOMMU in hardware and turning it on in the BIOS; we still need to add intel_iommu=on iommu=pt to the guest kernel boot command line.
Inside this VM, we can then start another VM with a device passed through from the host (the parent VM).