Understanding the behavior of unshare CLONE_NEWNS
I wrote a small C program that simply does an unshare(CLONE_NEWNS) followed by system("bash").
The man page says that the process should have its own namespace. So, in the shell I tried to unmount /cgroup (cgroup is mounted on the original machine).
When I then run mount in another shell on the machine, /cgroup is unmounted there too. Am I missing something here? I thought that CLONE_NEWNS was meant to let me unmount a file system from the process without affecting the main system.
(As an aside, you didn't need to write a program - you could just use the unshare(1) utility).
It is unmounting the filesystem only in the new namespace, and leaving it mounted in the original - the problem is that mount uses /etc/mtab to produce the list of currently-mounted filesystems, and that's just an ordinary file that can be updated by the mount command in the new namespace. This means that /etc/mtab gets out of sync with what's really going on (since there's only one /etc/mtab, but two mount namespaces).
Check /proc/mounts instead, to see what's actually mounted in the current namespace.
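To make the difference between the two views concrete, here is a minimal sketch that checks whether a mount point is listed in /etc/mtab and in /proc/mounts. The function names is_mounted and compare_views are made up for this illustration, and the optional file arguments exist only so the check can be pointed at sample files. Note that on many newer systems /etc/mtab is a symlink into /proc, in which case both views always agree.

```shell
# is_mounted MOUNTPOINT [TABLE]
# Succeeds if MOUNTPOINT appears as the second field of TABLE.
# TABLE defaults to /proc/mounts, the kernel's view of the current namespace.
# (Hypothetical helper; mount points containing spaces are octal-escaped
# in these files and are not handled here.)
is_mounted() {
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# compare_views MOUNTPOINT [MTAB] [KERNEL]
# Prints what each table says about MOUNTPOINT.
compare_views() {
    if is_mounted "$1" "${2:-/etc/mtab}"; then mtab=yes; else mtab=no; fi
    if is_mounted "$1" "${3:-/proc/mounts}"; then kernel=yes; else kernel=no; fi
    echo "mtab=$mtab kernel=$kernel"
}
```

Running compare_views /cgroup in both the original shell and the unshared shell should show whether the two files disagree on a given system.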
Almost certainly, this behavior is because of shared subtrees, where the parent mount of /cgroup (i.e., /) is marked as a "shared" mount that propagates mount and unmount events to its peers (other instances of /) in other namespaces. You can verify this by looking at the state of the / mount in /proc/self/mountinfo. This behavior was most likely established by systemd, which overrides the kernel's default of "private" mounts and makes them "shared". To get "private" behavior, you'd need to make / private using
mount --make-private /
See also https://bbs.archlinux.org/viewtopic.php?id=194388 and also https://lwn.net/Articles/689856/
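As a rough sketch of the /proc/self/mountinfo check mentioned above: each line of that file carries zero or more optional fields (such as shared:N or master:N) between the mount options and a lone "-" separator, and a mount with no such fields is private. The helper name propagation_of is invented for this example, and the optional second argument is only there so it can be run against a sample file.

```shell
# propagation_of MOUNTPOINT [MOUNTINFO]
# Prints the propagation tags of MOUNTPOINT (e.g. "shared:1"), or "private"
# if the mountinfo line has no optional fields. Prints nothing if the mount
# point is not listed. MOUNTINFO defaults to /proc/self/mountinfo.
# (Hypothetical helper, not part of any standard tool.)
propagation_of() {
    awk -v mp="$1" '
        $5 == mp {
            # Optional fields start at field 7 and end at the "-" separator.
            state = ""
            for (i = 7; i <= NF && $i != "-"; i++)
                state = state (state == "" ? "" : " ") $i
            # Keep the last match: it is the mount stacked on top.
            result = (state == "" ? "private" : state)
        }
        END { if (result != "") print result }
    ' "${2:-/proc/self/mountinfo}"
}
```

Calling propagation_of / inside the unshared shell, before and after making / private, should show whether the shared tag is present.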
I did a test with unshare on Fedora 19 (kernel 3.10):
unshare --mount /bin/bash
df -h /boot/
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 485M 238M 222M 52% /boot
umount /boot/
In a second shell:
grep boot /proc/mounts
echo $?
1
Maybe I did something wrong, but the result is what I expected.
unshare works on Fedora but not on Ubuntu. Also, using CLONE_NEWNS on its own does not seem to work; unshare does not appear to be quite the same as calling clone directly:
clone(child_main, child_stack+STACK_SIZE, CLONE_NEWUTS | CLONE_NEWPID | CLONE_NEWNS | SIGCHLD, NULL);
With this call, operations in the new namespace can still be seen from another namespace.