qemu vs qemu-kvm: some performance measurements

2023-02-20 15:27 问答作者：

I conducted the following benchmark in qemu and qemu-kvm, with the following configuration:

CPU: AMD 4400 process dual core with svm enabled, 2G RAM
Host OS: OpenSUSE 11.3 with latest Patch, running with kde4
Guest OS: FreeDos
Emulated Memory: 256M
Network: Nil
Language: Turbo C 2.0
Benchmark Program: Count from 0000000 to 9999999. Display the counter on the screen
     by direct accessing the screen memory (i.e. 0xb800:xxxx)

It only takes 6 sec when running in qemu.

But it takes 89 sec when running in qemu-kvm.开发者_运维百科

I ran the benchmark one by one, not in parallel.

I scratched my head the whole night, but still not idea why this happens. Would somebody give me some hints?

KVM uses qemu as his device simulator, any device operation is simulated by user space QEMU program. When you write to 0xB8000, the graphic display is operated which involves guest's doing a CPU `vmexit' from guest mode and returning to KVM module, who in turn sends device simulation requests to user space QEMU backend.

In contrast, QEMU w/o KVM does all the jobs in unified process except for usual system calls, there's fewer CPU context switches. Meanwhile, your benchmark code is a simple loop which only requires code block translation for just one time. That cost nothing, compared to vmexit and kernel-user communication of every iteration in KVM case.

This should be the most probable cause.

Your benchmark is an IO-intensive benchmark and all the io-devices are actually the same for qemu and qemu-kvm. In qemu's source code this can be found in hw/*.

This explains that the qemu-kvm must not be very fast compared to qemu. However, I have no particular answer for the slowdown. I have the following explanation for this and I think its correct to a large extent.

"The qemu-kvm module uses the kvm kernel module in linux kernel. This runs the guest in x86 guest mode which causes a trap on every privileged instruction. On the contrary, qemu uses a very efficient TCG which translates the instructions it sees at the first time. I think that the high-cost of trap is showing up in your benchmarks." This ain't true for all io-devices though. Apache benchmark would run better on qemu-kvm because the library does the buffering and uses least number of privileged instructions to do the IO.

The reason is too much VMEXIT take place.

继续阅读：benchmarking performance qemu virtual-machine

qemu vs qemu-kvm: some performance measurements

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？