mirror of git://github.com/lxc/lxc synced 2025-08-30 20:49:31 +00:00
Commit Graph

10509 Commits

Author SHA1 Message Date
Christian Brauner
292a6d4852 conf: rework console setup
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-08-02 14:45:16 +02:00
Christian Brauner
7de17c5d7d file_utils: add open_at_same()
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-08-02 14:45:15 +02:00
Christian Brauner
914c8117e6 conf: use mount_fd() during console mounting
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-08-02 14:45:14 +02:00
Christian Brauner
c1e81360dc conf: use mount_fd() in lxc_setup_dev_console()
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-08-02 14:45:13 +02:00
Christian Brauner
2ca395e000 conf: use mount_fd() helper when mounting ttys
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-08-02 14:45:11 +02:00
Christian Brauner
97cb264385 mount_utils: add mount_fd()
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-08-02 14:45:10 +02:00
Christian Brauner
d9fd5a83df conf: stash pty_nr in struct lxc_terminal
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-08-02 14:45:09 +02:00
Christian Brauner
425875136b conf: move lxc_create_ttys() before pivot root
This is the last setup step that occurred after pivot root.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-08-02 14:45:08 +02:00
Christian Brauner
b83fc7ff53 terminal: split out lxc_devpts_terminal() helper
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-08-02 14:45:07 +02:00
Christian Brauner
0f427a9f98 string_utils: cast __s64 to long long signed int
Link: https://launchpadlibrarian.net/550723147/buildlog_snap_ubuntu_focal_ppc64el_lxd-latest-edge_BUILDING.txt.gz
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-08-02 14:45:06 +02:00
Christian Brauner
e428dfdfc4 conf: merge devpts setup and move before pivot root
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-08-02 14:45:05 +02:00
Christian Brauner
d413c48628 terminal: don't use ttyname_r() for native terminal allocation
Since we can call that function from another mount namespace we need to
do this manually.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-08-02 14:45:04 +02:00
Christian Brauner
011d6eaaaa conf: add and use mount_beneath_fd()
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-08-02 14:45:03 +02:00
Christian Brauner
03fd5d968f conf: update comment
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-08-02 14:45:02 +02:00
Christian Brauner
e5cc3716b4 conf: use a relative path in symlinkat()
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-08-02 14:45:01 +02:00
Christian Brauner
b36c1c3936 conf: s/lxc_setup_devpts_parent/lxc_recv_devpts_from_child/g
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-08-02 14:44:59 +02:00
Christian Brauner
43e9379dc1 conf: attach devpts mount directly when new mount api can be used
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-08-02 14:44:58 +02:00
Christian Brauner
2d4cc531a4 conf: set source property for devpts
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-08-02 14:44:57 +02:00
Christian Brauner
ae8d0df554 conf: surface failures to setup console
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-08-02 14:44:56 +02:00
Stéphane Graber
afc9b615f3 Fix typos
This fixes all typos identified by lintian.

Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
2021-08-02 14:44:54 +02:00
Christian Brauner
06520b0915 conf: ensure devpts_fd is set to -EBADF
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-08-02 14:44:52 +02:00
Christian Brauner
1ff9846c1c terminal: ttyname_r() returns an error number on failure
In other words, how inconsistent can an API be?

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-08-02 14:44:51 +02:00
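
The inconsistency that commit 1ff9846c1c complains about is that ttyname_r() returns 0 on success and a positive error number on failure, rather than returning -1 and setting errno. A minimal sketch of normalizing that to a negative-errno return follows; the wrapper name `tty_name_of()` is hypothetical and is not taken from LXC.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper (not LXC's code): ttyname_r() reports failure by
 * returning the error number directly, so convert it to the common
 * "0 on success, negative errno on failure" convention.
 */
static int tty_name_of(int fd, char *buf, size_t buflen)
{
	int ret = ttyname_r(fd, buf, buflen);

	if (ret != 0)
		return -ret; /* ret already is the error number */

	return 0;
}
```

A caller that tested `ttyname_r(...) < 0` would never see a failure, which is the kind of bug this return convention invites.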
Christian Brauner
be606e16fd conf: use new mount api for devpts setup
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-08-02 14:44:45 +02:00
Petr Malat
72ddf4aa86 bpf: bpf_devices_cgroup_supported() should check if bpf() is available
bpf_devices_cgroup_supported() tries to load a simple BPF program to
test if BPF works. This is problematic because the function used to load
the program - bpf_program_load_kernel() - emits an error to the log if
BPF is not enabled in the kernel although device controller is not
requested in the configuration. Users could interpret that as a problem.

Make bpf_devices_cgroup_supported() check if the BPF syscall is available
before calling bpf_program_load_kernel(). We can do it by passing a NULL
pointer instead of the syscall argument, as the kernel returns either
ENOSYS when the syscall is not implemented or EFAULT when it is
implemented.

Signed-off-by: Petr Malat <oss@malat.biz>
2021-07-22 09:25:33 +02:00
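
The probe described in commit 72ddf4aa86 can be sketched roughly as follows. This is an illustration under assumptions, not the code from the commit: the function names are hypothetical, and treating every error other than ENOSYS (for example EPERM from a seccomp filter) as "the syscall exists" is a choice made for this sketch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/bpf.h>

/*
 * Hypothetical helper: call bpf() with a NULL attribute pointer and
 * report the resulting errno. Nothing is loaded; the call is only made
 * to see how the kernel rejects it.
 */
static int bpf_probe_errno(void)
{
	if (syscall(__NR_bpf, BPF_PROG_LOAD, NULL, sizeof(union bpf_attr)) < 0)
		return errno;

	return 0; /* not expected: a NULL attribute pointer cannot succeed */
}

/* ENOSYS means the kernel has no bpf() syscall at all. */
static bool bpf_syscall_available(void)
{
	return bpf_probe_errno() != ENOSYS;
}
```

Because the probe fails before any program is loaded, it does not go through the loader path that logs an error.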
Petr Malat
206128fc76 lxc_setup_ttys: Handle existing ttyN file without underlying device
If a device file is opened and there isn't the underlying device,
the open call fails with ENXIO, but the path can be opened with
O_PATH, which is enough for mounting over the device file.

Generalize this idea and use O_PATH for all cases when the file
is there. One still must check for both ENXIO and EEXIST as it's
unspecified what error is reported if multiple error conditions
occur at the same time.

Signed-off-by: Petr Malat <oss@malat.biz>
2021-07-22 09:25:30 +02:00
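
The O_PATH fallback from commit 206128fc76 might look something like the sketch below. The function name, the creation flags, and the file mode are assumptions made for illustration; only the idea of falling back to O_PATH on EEXIST or ENXIO comes from the commit message.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

#ifndef O_PATH
#define O_PATH 010000000 /* value on most Linux architectures */
#endif

/*
 * Hypothetical helper: return an fd usable as a mount target for path.
 * Try to create the file first. If something is already there, or it is
 * a device node without an underlying device, an O_PATH fd is enough
 * because mounting over a file does not require opening the device.
 */
static int open_mount_target(const char *path)
{
	int fd;

	fd = open(path, O_WRONLY | O_CREAT | O_EXCL | O_CLOEXEC, 0600);
	if (fd >= 0)
		return fd;

	/* Check both: which error wins when several apply is unspecified. */
	if (errno != EEXIST && errno != ENXIO)
		return -errno;

	fd = open(path, O_PATH | O_CLOEXEC);
	if (fd < 0)
		return -errno;

	return fd;
}
```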
Stoiko Ivanov
5189fc4820 cgroups: remove unneeded variables from cgroup_tree_create
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
2021-07-22 09:25:27 +02:00
Stoiko Ivanov
c62b32b0f2 cgroups: populate hierarchy for device cgroup
With the changes introduced in:
b7b1e3a34c
the hierarchy-struct did not have the path_lim set anymore, which is
needed by setup_limits_legacy (->cg_legacy_set_data->lxc_write_openat)
to actually access the cgroup directory.

The issue can be reproduced with a container config having
```
lxc.cgroup.devices.deny = a
```
(or any lxc.cgroup.devices entry) set on a system booted with
systemd.unified_cgroup_hierarchy=0.

This affects all privileged containers on PVE (due to the default
devices.deny entry).

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
2021-07-22 09:25:19 +02:00
Stéphane Graber
d867b94c22 Release LXC 4.0.10
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
lxc-4.0.10
2021-07-16 16:30:14 -04:00
Christian Brauner
cb6fd3e26d terminal: fix error handling
Fixes: f382bcc6d8 ("terminal: log TIOCGPTPEER failure less alarmingly")
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-07-15 22:14:39 +02:00
Christian Brauner
e3e69becce af_unix: report error when no fd is to be sent
Fixes: #3624
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-07-15 21:55:08 +02:00
Christian Brauner
0b9a29b541 terminal: log TIOCGPTPEER failure less alarmingly
This is not a fatal error and the fallback codepath is equally safe.
When we use TIOCGPTPEER we're using a stashed fd to the container's
devpts mount's ptmx device and allocating a new fd non-path based
through this ioctl. If this ioctl can't be used we're falling back to
allocating a pts device from the host's devpts mount's ptmx device which
is path-based but is not under control of the container and so that's
safe. The difference is just that the first method gets you a nice
native terminal with all the pleasantries of having tty and friends
working whereas the latter method does not.

Fixes: #3625
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-07-15 21:55:06 +02:00
Christian Brauner
401e36705c sync: fix log message
Fixes: #3875
Suggested-by: Hank.shi <shk242673@163.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-07-15 21:55:04 +02:00
Christian Brauner
c18430b001 start: fix logging message
Fixes: #3875
Suggested-by: Hank.shi <shk242673@163.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-07-15 21:55:00 +02:00
Christian Brauner
115c823151 initutils: include pthread.h
Otherwise we might end up with implicit function declaration warnings.

Link: https://jenkins.linuxcontainers.org/job/lxc-build-android/8915/console
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-07-15 18:13:37 +02:00
Serge Hallyn
7b784065a9 doc/common_options: add trace and alert loglevels
Signed-off-by: Serge Hallyn <serge@hallyn.com>
2021-07-15 18:13:32 +02:00
Christian Brauner
946e8385da file_utils: surface ENOENT when falling back to openat()
Link: https://discuss.linuxcontainers.org/t/error-failed-to-retrieve-pid-of-executing-child-process
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-07-12 17:17:53 +02:00
Christian Brauner
2d7e6a7f0b lxc_unshare: fix network device handling
We were passing the wrong PID. Fix this!

Link: https://discuss.linuxcontainers.org/t/problem-with-moving-interface-new-network-namespace-in-lxc-unshare
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-07-12 17:17:52 +02:00
Christian Brauner
37324c231c lxc_unshare: make mount table private
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-07-12 17:17:51 +02:00
Wolfgang Bumiller
37f188d9ac confile: allow including nonexisting directories
If an include directive ends with a trailing slash, we now
always assume it is a directory and do not treat the
non-existence as an error.

Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2021-07-12 17:17:50 +02:00
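
The trailing-slash rule from commit 37f188d9ac can be illustrated with a small classifier. This is a sketch under assumptions: the enum, the function names, and the return convention are hypothetical and do not mirror the config parser in LXC.

```c
#include <errno.h>
#include <stdbool.h>
#include <string.h>
#include <sys/stat.h>

enum include_kind { INCLUDE_SKIP, INCLUDE_DIR, INCLUDE_FILE };

static bool has_trailing_slash(const char *path)
{
	size_t len = strlen(path);

	return len > 0 && path[len - 1] == '/';
}

/*
 * Hypothetical helper: decide how to treat an include path.
 * A path ending in '/' is assumed to name a directory, and a missing
 * directory is skipped silently. A missing file is still an error.
 * Returns an include_kind value, or a negative errno on failure.
 */
static int classify_include(const char *path)
{
	struct stat st;

	if (stat(path, &st) < 0) {
		if (errno == ENOENT && has_trailing_slash(path))
			return INCLUDE_SKIP;
		return -errno;
	}

	return S_ISDIR(st.st_mode) ? INCLUDE_DIR : INCLUDE_FILE;
}
```

With this rule, a default config can reference a drop-in directory that has not been created yet without failing.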
Wolfgang Bumiller
76bdf15acd conf: userns.conf: include userns.conf.d
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2021-07-12 17:17:48 +02:00
KATOH Yasufumi
c0152679f1 doc: Fix typo in English lxc.container.conf(5)
Signed-off-by: KATOH Yasufumi <karma@jazz.email.ne.jp>
2021-07-12 17:17:45 +02:00
KATOH Yasufumi
0d2a619d1c doc: Add new idmap= option to Japanese lxc.container.conf(5)
Update for commit 1852be9048

Signed-off-by: KATOH Yasufumi <karma@jazz.email.ne.jp>
2021-07-12 17:17:43 +02:00
KATOH Yasufumi
d7d93fb104 doc: Append description of net type field
Update for commit 320061b34f

Signed-off-by: KATOH Yasufumi <karma@jazz.email.ne.jp>
2021-07-12 17:17:41 +02:00
KATOH Yasufumi
a14a6e9092 doc: Add eBPF-based device controller semantics to Japanese man page
Update for commit 5025f3a690

Signed-off-by: KATOH Yasufumi <karma@jazz.email.ne.jp>
2021-07-12 17:17:37 +02:00
Christian Brauner
01dd32bf95 cmd/lxc-checkconfig: list cgroup namespaces and rename confusing ns_cgroup entry
Link: https://discuss.linuxcontainers.org/t/cgroup-namespace-required-in-lxc-checkconfig-and-config-cgroup-ns
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-07-01 17:13:56 +02:00
Christian Brauner
49f1fbec16 terminal: ensure newlines are turned into newlines+carriage return for terminal output
Fixes: #3879
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-07-01 17:13:55 +02:00
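
With standard termios, translating newline to carriage return plus newline on output is controlled by the OPOST and ONLCR output flags. The sketch below shows that mechanism only; whether commit 49f1fbec16 sets exactly these flags is an assumption, and the helper name is hypothetical.

```c
#include <termios.h>

/*
 * Hypothetical helper: enable output post-processing (OPOST) and
 * NL -> CR-NL translation (ONLCR) in a termios structure. The caller
 * is expected to fetch the settings with tcgetattr() and apply them
 * with tcsetattr().
 */
static void enable_nl_to_crnl(struct termios *tios)
{
	tios->c_oflag |= OPOST | ONLCR;
}
```

Typical use would be `tcgetattr(fd, &t); enable_nl_to_crnl(&t); tcsetattr(fd, TCSAFLUSH, &t);` on a terminal file descriptor.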
Christian Brauner
ff4b545f5e cgroups: handle funky cgroup layouts
Old versions of Docker emulate a cgroup namespace by bind-mounting the
container's cgroup over the corresponding controller:

/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod7d4424e6_bb13_42f4_a47a_45a4828bf54d.slice/docker-d0b3604b67ac7930dd34ba3a796627e3e4717d12309e90a4afe3f38b6816ac98.scope /sys/fs/cgroup/systemd rw,nosuid,nodev,noexec,relatime master:11 - cgroup cgroup rw,xattr,name=systemd
/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod7d4424e6_bb13_42f4_a47a_45a4828bf54d.slice/docker-d0b3604b67ac7930dd34ba3a796627e3e4717d12309e90a4afe3f38b6816ac98.scope /sys/fs/cgroup/net_cls,net_prio rw,nosuid,nodev,noexec,relatime master:15 - cgroup cgroup rw,net_cls,net_prio
/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod7d4424e6_bb13_42f4_a47a_45a4828bf54d.slice/docker-d0b3604b67ac7930dd34ba3a796627e3e4717d12309e90a4afe3f38b6816ac98.scope /sys/fs/cgroup/cpu,cpuacct rw,nosuid,nodev,noexec,relatime master:16 - cgroup cgroup rw,cpu,cpuacct
/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod7d4424e6_bb13_42f4_a47a_45a4828bf54d.slice/docker-d0b3604b67ac7930dd34ba3a796627e3e4717d12309e90a4afe3f38b6816ac98.scope /sys/fs/cgroup/memory rw,nosuid,nodev,noexec,relatime master:17 - cgroup cgroup rw,memory
/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod7d4424e6_bb13_42f4_a47a_45a4828bf54d.slice/docker-d0b3604b67ac7930dd34ba3a796627e3e4717d12309e90a4afe3f38b6816ac98.scope /sys/fs/cgroup/devices rw,nosuid,nodev,noexec,relatime master:18 - cgroup cgroup rw,devices
/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod7d4424e6_bb13_42f4_a47a_45a4828bf54d.slice/docker-d0b3604b67ac7930dd34ba3a796627e3e4717d12309e90a4afe3f38b6816ac98.scope /sys/fs/cgroup/hugetlb rw,nosuid,nodev,noexec,relatime master:19 - cgroup cgroup rw,hugetlb
/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod7d4424e6_bb13_42f4_a47a_45a4828bf54d.slice/docker-d0b3604b67ac7930dd34ba3a796627e3e4717d12309e90a4afe3f38b6816ac98.scope /sys/fs/cgroup/perf_event rw,nosuid,nodev,noexec,relatime master:20 - cgroup cgroup rw,perf_event
/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod7d4424e6_bb13_42f4_a47a_45a4828bf54d.slice/docker-d0b3604b67ac7930dd34ba3a796627e3e4717d12309e90a4afe3f38b6816ac98.scope /sys/fs/cgroup/cpuset rw,nosuid,nodev,noexec,relatime master:21 - cgroup cgroup rw,cpuset
/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod7d4424e6_bb13_42f4_a47a_45a4828bf54d.slice/docker-d0b3604b67ac7930dd34ba3a796627e3e4717d12309e90a4afe3f38b6816ac98.scope /sys/fs/cgroup/blkio rw,nosuid,nodev,noexec,relatime master:22 - cgroup cgroup rw,blkio
/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod7d4424e6_bb13_42f4_a47a_45a4828bf54d.slice/docker-d0b3604b67ac7930dd34ba3a796627e3e4717d12309e90a4afe3f38b6816ac98.scope /sys/fs/cgroup/pids rw,nosuid,nodev,noexec,relatime master:23 - cgroup cgroup rw,pids
/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod7d4424e6_bb13_42f4_a47a_45a4828bf54d.slice/docker-d0b3604b67ac7930dd34ba3a796627e3e4717d12309e90a4afe3f38b6816ac98.scope /sys/fs/cgroup/freezer rw,nosuid,nodev,noexec,relatime master:24 - cgroup cgroup rw,freezer

New versions of LXC always stash a file descriptor for the root of the
cgroup mount at /sys/fs/cgroup and then resolve the current cgroup
parsed from /proc/{1,self}/cgroup relative to that file descriptor. This
doesn't work when the caller's cgroup is mounted over the controllers.
Older versions of LXC simply counted such layouts as having no cgroups
available for delegation at all and moved on provided no cgroup limits
were requested. But mainline LXC would fail such layouts. While I would
argue that failing such layouts is the semantically clean approach, we
shouldn't regress users, so make mainline LXC treat such cgroup layouts
as having no cgroups available for delegation.

Fixes: #3890
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-07-01 17:13:53 +02:00
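
The layout described in commit ff4b545f5e shows up in /proc/self/mountinfo as a cgroup mount whose root field is not "/". A minimal detection sketch follows, assuming full mountinfo lines (the entries quoted in the commit message omit the leading ID fields). The function name is hypothetical and this is not how LXC implements the check.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: return true if a mountinfo line describes a
 * cgroup or cgroup2 mount whose root is a sub-cgroup rather than "/",
 * meaning a cgroup has been bind-mounted over the controller.
 *
 * mountinfo fields: ID parent-ID major:minor root mountpoint options
 * [optional fields...] - fstype source super-options
 */
static bool cgroup_mount_is_shadowed(const char *line)
{
	char root[4096], fstype[64];
	const char *sep;

	/* The fourth field is the root of the mount within the filesystem. */
	if (sscanf(line, "%*d %*d %*u:%*u %4095s", root) != 1)
		return false;

	/* The filesystem type follows the " - " separator. */
	sep = strstr(line, " - ");
	if (!sep || sscanf(sep + 3, "%63s", fstype) != 1)
		return false;

	if (strcmp(fstype, "cgroup") != 0 && strcmp(fstype, "cgroup2") != 0)
		return false;

	return strcmp(root, "/") != 0;
}
```

A caller could scan every mountinfo line and, on the first match, treat the system as having no cgroups available for delegation.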
Christian Brauner
d50378b422 tests: add tests for read-only /sys with read-write /sys/devices/virtual/net
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-07-01 17:13:51 +02:00
Christian Brauner
e250f278bb conf: improve read-only /sys with read-write /sys/devices/virtual/net
Some tools require /sys/devices/virtual/net to be read-write. At the
same time we want all other parts of /sys to be read-only. To do this we
created a layout where we had a read-only instance of sysfs mounted on
top of a read-write instance of sysfs:

`-/sys                                  sysfs                                                        sysfs      rw,nosuid,nodev,noexec,relatime
  `-/sys                                sysfs                                                        sysfs      ro,nosuid,nodev,noexec,relatime
    |-/sys/devices/virtual/net          sysfs                                                        sysfs      rw,relatime
    | `-/sys/devices/virtual/net        sysfs[/devices/virtual/net]                                  sysfs      rw,nosuid,nodev,noexec,relatime

This causes issues for systemd services that create a separate mount
namespace as they get confused about which mount options need to be
respected.

Simplify our mounting logic so we end up with a single read-only mount
of sysfs on /sys and a read-write bind-mount of /sys/devices/virtual/net:

├─/sys                                sysfs                                                                                  sysfs         ro,nosuid,nodev,noexec,relatime
│ ├─/sys/devices/virtual/net          sysfs[/devices/virtual/net]                                                            sysfs         rw,nosuid,nodev,noexec,relatime

Link: systemd/systemd#20032
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-07-01 17:13:49 +02:00
Simon Deziel
c73a232555 initutils: close dirfd in error path
Signed-off-by: Simon Deziel <simon.deziel@canonical.com>
2021-07-01 17:13:48 +02:00