Docker very slow in Fedora

Hi

I saw similar posts, but I did not find any working solution there.

Since I started using Fedora 36 with docker (not podman, nor moby), I have noticed that many container operations are really slow, much slower than on Ubuntu and Arch. All three distros are installed on the same machine (16 GB RAM, i7, NVMe SSD, so a fast machine).

Much slower means: 10 times slower (an order of magnitude).

I notice this slowness mainly when:

  • running Ansible molecule docker tests
  • starting a MySQL container

The problem seems to be related to I/O operations (e.g., with Ansible Molecule, it shows up with a task that installs a flatpak or an AUR package). This is an example of an Ansible task that, when tested with Molecule Docker on Fedora, is 10 times slower than on Ubuntu and Arch:

- name: Install Gnome Extension Manager
  become: yes
  community.general.flatpak:
    name: com.mattjakeman.ExtensionManager
    state: present

Concerning the MySQL container, this is a Maven POM that starts a MySQL Docker container and waits for it to be ready (by watching for a string in the log):

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>com.example</groupId>
  <artifactId>example-docker-mysql</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <packaging>pom</packaging>

  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  </properties>

  <build>
    <pluginManagement>
      <plugins>
        <plugin>
          <groupId>io.fabric8</groupId>
          <artifactId>docker-maven-plugin</artifactId>
          <version>0.40.0</version>
          <extensions>true</extensions>
          <configuration>
            <showLogs>true</showLogs>
            <images>
              <image>
                <alias>database</alias>
                <name>mysql:5.7</name>
                <run>
                  <wait>
                    <log>MySQL init process done. Ready for start up.</log>
                    <time>200000</time>
                  </wait>
                  <env>
                    <MYSQL_ROOT_PASSWORD>apasswd</MYSQL_ROOT_PASSWORD>
                    <MYSQL_DATABASE>adatabase</MYSQL_DATABASE>
                    <MYSQL_USER>auser</MYSQL_USER>
                    <MYSQL_PASSWORD>anotherpasswd</MYSQL_PASSWORD>
                  </env>
                  <ports>
                    <port>${mysql.port}:3306</port>
                  </ports>
                </run>
              </image>
            </images>
          </configuration>
        </plugin>
      </plugins>
    </pluginManagement>
  </build>

</project>

This is the time on Fedora:

[INFO] DOCKER> [mysql:5.7] "database":
Waited on log out 'MySQL init process done. Ready for start up.'
69823 ms

And this is the result in Ubuntu and Arch (the time is basically the same in the two distributions):

[INFO] DOCKER> [mysql:5.7] "database":
Waited on log out 'MySQL init process done. Ready for start up.' 
6032 ms

So, more than 10 times slower (69823 ms versus 6032 ms)!

Moreover, when running these tests on Fedora (both the Ansible task and the Maven build), disk activity is very high, and the RAM fills up immediately!

This is the output of docker info on Fedora:

Client:
 Context:    default
 Debug Mode: false
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Docker Buildx (Docker Inc., v0.8.2-docker)

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 0
 Server Version: 20.10.17
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 10c12954828e7c7c9b6e0ea9b0c02b01407d3ae1
 runc version: v1.1.2-0-ga916309
 init version: de40ad0
 Security Options:
  seccomp
   Profile: default
  cgroupns
 Kernel Version: 5.17.12-300.fc36.x86_64
 Operating System: Fedora Linux 36 (Workstation Edition)
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 15.32GiB
 Name: lg-fedora
 ID: USS4:ISHJ:ID4Y:EPY3:QKR2:O5GQ:5OTP:MFHC:3OZK:MP27:L4DY:DYOO
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

Fedora is installed on EXT4, just like the other distributions…

Any idea of the culprit?

It would be nice if you also posted the docker info output from at least one of them (Arch/Ubuntu).

Have a look here too:
Releases/36/ChangeSet - Fedora Project Wiki

Right, here is the docker info output from Ubuntu:

Client:
 Context:    default
 Debug Mode: false

Server:
 Containers: 2
  Running: 0
  Paused: 0
  Stopped: 2
 Images: 95
 Server Version: 20.10.12
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc io.containerd.runc.v2 io.containerd.runtime.v1.linux
 Default Runtime: runc
 Init Binary: docker-init
 containerd version:
 runc version:
 init version:
 Security Options:
  apparmor
  seccomp
   Profile: default
  cgroupns
 Kernel Version: 5.15.0-35-generic
 Operating System: Ubuntu 22.04 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 15.32GiB
 Name: kubuntu-lg
 ID: Z3CI:QQJN:5RFF:APWC:NXVT:YKLA:4LQP:7EKR:AMCA:6MN6:KUTD:QBWL
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

And this is the one from Arch:

Client:
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc., v0.8.2-docker)
  compose: Docker Compose (Docker Inc., 2.5.1)

Server:
 Containers: 2
  Running: 1
  Paused: 0
  Stopped: 1
 Images: 95
 Server Version: 20.10.16
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: false
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 212e8b6fa2f44b9c21b2798135fc6fb7c53efc16.m
 runc version:
 init version: de40ad0
 Security Options:
  seccomp
   Profile: default
  cgroupns
 Kernel Version: 5.15.43-1-lts
 Operating System: EndeavourOS
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 15.32GiB
 Name: lg-eos
 ID: 6LEV:OBCB:BFH3:GDG5:66LT:HP52:5BGA:E2JA:4OFZ:YNND:24CO:6CAJ
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

Some additional experiment results. I tried all of the following tweaks without success (I tested each one individually, not all together):

  • disabled SELinux
  • set barrier=0 in /etc/fstab
  • set dump to 0 (instead of 1) in /etc/fstab (as in Ubuntu and Arch, where it defaults to 0)
  • increased ulimit -Hn from Fedora's default of 524288 to 1048576 (as in Ubuntu and Arch)
  • disabled systemd-oomd
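
A note on the ulimit tweak in the list above: unless overridden, containers inherit their limits from the Docker daemon rather than from the login shell, so the value a container actually sees can differ between distributions even when ulimit -Hn on the host matches. A minimal sketch, assuming the container is started with Docker Compose: the `ulimits` key pins the open-files limit for one service, which removes that variable from the comparison. The service name `db`, the password, and the value 1048576 (the Ubuntu/Arch figure quoted above) are illustrative choices, and nothing in this thread establishes that this limit is the cause of the slowdown.

```yaml
# Sketch only: pin the open-files limit for one service so that every
# distribution runs the container with the same value.
services:
  db:                                  # hypothetical service name
    image: mysql:5.7
    environment:
      - MYSQL_ROOT_PASSWORD=example    # placeholder so the container can start
    ulimits:
      nofile:
        soft: 1048576                  # illustrative: the Ubuntu/Arch figure from the list
        hard: 1048576
```

Running the same file on each distribution and comparing startup time and memory use would show whether this limit matters at all.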

Some additional findings: I have no problem with mysql image mysql:8.0.29, so probably it’s something that shows up only with the older version of MySQL, but, as I said, this happens only in Fedora. It might be something wrong in the old version of the image, but it should work with Fedora anyway.

I tried with a VM (VirtualBox) of Fedora 36 where I used the default BTRFS and it works (today, I’ll try with a bare-metal installation of Fedora 36 with BTRFS and see what happens).

I’m inclined to think that the problem is really in Fedora's EXT4 somehow (an older post, "Fedora 33 slow IO/kernel comparing to others", described similar issues, and the user solved them by switching from EXT4 to XFS).

Unfortunately NOT.
I installed Fedora 36 (on an external SSD, not VM) with BTRFS, and I get exactly the same problems described in my original post.

Besides running inside a VM, the only difference I can think of is that the VM was allocated only 8 GB of RAM, while on the real machine I have 16 GB…

Since the main part of the problem is that Docker seems to immediately “eat” all the RAM, do you know if one can limit the amount of memory dedicated to Docker?

In any case, I’d really need some further directions because now I cannot think of further experiments…

Can you reproduce with a docker-compose.yml file? Then we can also test ourselves, and compare between docker and podman. Hard to say what’s going on there exactly.

Sure, this is a docker-compose.yaml to reproduce the problem

services:
  db:
    image: mysql:5.7
    environment:
      - MYSQL_ROOT_PASSWORD=somewordpress
      - MYSQL_DATABASE=wordpress
      - MYSQL_USER=wordpress
      - MYSQL_PASSWORD=wordpress
    expose:
      - 3306
      - 33060

As soon as the image has been downloaded and the container starts, memory usage jumps within a second to 16 GB (including swap) and disk activity increases a lot. This can be seen in gkrellm.

If I don’t get an Out of memory error, after some time mysql is ready.
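
On the earlier question of capping memory: Docker limits memory per container rather than for the engine as a whole. A minimal sketch, assuming a Compose implementation that follows the Compose Specification, where `mem_limit` and `memswap_limit` are service-level keys; the 2g value is an arbitrary illustration, not a recommendation for MySQL. A cap like this only contains the symptom (the container is killed or fails to start instead of exhausting the host); it does not explain the difference between distributions.

```yaml
# Sketch only: the reproduction file from this thread plus a memory cap.
services:
  db:
    image: mysql:5.7
    environment:
      - MYSQL_ROOT_PASSWORD=somewordpress
      - MYSQL_DATABASE=wordpress
      - MYSQL_USER=wordpress
      - MYSQL_PASSWORD=wordpress
    expose:
      - 3306
      - 33060
    mem_limit: 2g        # hard cap on container memory (illustrative value)
    memswap_limit: 2g    # memory plus swap; equal to mem_limit means no swap
```

For a single container started without Compose, the corresponding option is the --memory flag of docker run.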

As I said, this happens with both EXT4 and BTRFS on bare metal (with 16 GB).

Changing the image to mysql:8.0.27 instead works.

Besides the docker compose I provided in my previous answer, please let me stress that I’m talking about docker-ce, not moby.

And, with the exact same docker-ce version, on Ubuntu it consumes less memory than Fedora?

If that’s the case, you should file a bug on bugzilla.redhat.com if they ship docker-ce; if they don’t, you should probably file it upstream. If you can’t reproduce with the same docker-ce version, it might be an upstream bug and/or regression.

I haven’t tested your example locally because I would need to do it in a virtual machine. First, I would like you to make sure there aren’t any uncontrolled variables in your comparison.

As you can see from the docker info in this thread, the server version is not exactly the same in the 3 distributions (it’s always 20.10.x anyway).

Just to summarize, taking the docker-compose file you asked as a test:

  • everything works fine in Ubuntu and Arch
  • everything works fine in Fedora in a VM with only 8 GB of RAM
  • in Fedora on bare metal with 16 GB (on 3 different machines), with both BTRFS and EXT4, I experience the problem

In particular, in Ubuntu and Arch it works meaning that it only requires some memory and just a few seconds to start everything. With Fedora, in the configuration where I have the problem, all memory is immediately exhausted and then it takes several seconds for the container to be ready (unless I get an out of memory error).

Of course, in the Fedora VM and on bare metal I’m using exactly the same docker-ce version. By docker-ce I mean the version taken from the Docker repositories (see "Install Docker Engine on Fedora" in the Docker Documentation), because Fedora itself no longer provides docker-ce in its repositories, unless I missed something; Fedora only provides docker-compose and moby-engine.

Here are some positive updates: if I remove the docker-ce packages taken from the Docker repositories (sudo dnf remove docker-ce docker-ce-cli containerd.io), install moby-engine from the Fedora repositories, and restart, EVERYTHING WORKS! The container starts fast and needs only a little memory.

I can only guess there’s some strange configuration in docker-ce on Fedora (with more than 8 GB of RAM) that causes the problem…

To sum up, since everything works with moby-engine while docker-ce has this serious problem, is it worthwhile to report the bug upstream? Or is there anything else Fedora can do about it?

I would report it both upstream and downstream (on Red Hat’s Bugzilla) to maximise the chance of someone actually looking at this bug.