AMD GPU performs worse than Intel iGPU

I just installed Fedora 34 on my Dell XPS 15 (9575) which I explicitly bought because it has a dedicated AMD GPU. But now it turns out that on Fedora the AMD GPU performs worse than the integrated Intel GPU.

From what I understand the AMD GPU driver for Linux is open source and preinstalled on Fedora. So what could be causing this?


I ran this benchmark multiple times to make sure that this is not just a heat issue or something like that.
Of course I only ran one instance of glmark2 at a time.

Here is my dmesg output, filtered with grep -iP "radeon|amd|gpu|graphics|video":

$ dmesg | grep -iP "radeon|amd|gpu|graphics|video"
[    0.007633] RAMDISK: [mem 0x1cb6d000-0x1f41afff]
[    0.007714] ACPI: SSDT 0x000000003EC66558 000F80 (v01 AmdRef AmdTabl  00001000 INTL 20160422)
[    0.048892] Reserving Intel graphics memory at [mem 0x4a800000-0x4e7fffff]
[    0.219004] ACPI: Added _OSI(Linux-Dell-Video)
[    0.219005] ACPI: Added _OSI(Linux-HPI-Hybrid-Graphics)
[    0.368319] pci 0000:00:02.0: Video device with shadowed ROM at [mem 0x000c0000-0x000dffff]
[    0.685201] efifb: showing boot graphics
[    1.336599] AMD-Vi: AMD IOMMUv2 driver by Joerg Roedel <>
[    1.336600] AMD-Vi: AMD IOMMUv2 functionality not available on this system
[    1.614120] [drm] amdgpu kernel modesetting enabled.
[    1.614232] ATPX Hybrid Graphics
[    1.614278] amdgpu: Topology: Add CPU node
[    1.614356] amdgpu 0000:01:00.0: enabling device (0006 -> 0007)
[    1.614399] amdgpu 0000:01:00.0: amdgpu: Trusted Memory Zone (TMZ) feature not supported
[    1.639010] amdgpu 0000:01:00.0: amdgpu: Fetched VBIOS from ATRM
[    1.639012] amdgpu: ATOM BIOS: 401815-171128-QS1
[    1.639062] [drm] GPU posting now...
[    1.651186] amdgpu 0000:01:00.0: BAR 2: releasing [mem 0xb0000000-0xb01fffff 64bit pref]
[    1.651188] amdgpu 0000:01:00.0: BAR 0: releasing [mem 0xa0000000-0xafffffff 64bit pref]
[    1.651196] amdgpu 0000:01:00.0: BAR 0: assigned [mem 0xa0000000-0xafffffff 64bit pref]
[    1.651201] amdgpu 0000:01:00.0: BAR 2: assigned [mem 0xb0000000-0xb01fffff 64bit pref]
[    1.651210] amdgpu 0000:01:00.0: amdgpu: VRAM: 4096M 0x000000F400000000 - 0x000000F4FFFFFFFF (4096M used)
[    1.651212] amdgpu 0000:01:00.0: amdgpu: GART: 256M 0x000000FF00000000 - 0x000000FF0FFFFFFF
[    1.651356] [TTM] Zone  kernel: Available graphics memory: 8049666 KiB
[    1.651357] [TTM] Zone   dma32: Available graphics memory: 2097152 KiB
[    1.651375] [drm] amdgpu: 4096M of VRAM memory ready
[    1.651377] [drm] amdgpu: 4096M of GTT memory ready.
[    1.651378] [drm] GART: num cpu pages 65536, num gpu pages 65536
[    1.662040] amdgpu: hwmgr_sw_init smu backed is vegam_smu
[    2.152260] Virtual CRAT table created for GPU
[    2.152311] amdgpu: Topology: Add dGPU node [0x694e:0x1002]
[    2.152314] amdgpu 0000:01:00.0: amdgpu: SE 4, SH per SE 1, CU per SH 6, active_cu_number 20
[    2.331635] amdgpu 0000:01:00.0: amdgpu: Using ATPX for runtime pm
[    2.332175] [drm] Initialized amdgpu 3.40.0 20150101 for 0000:01:00.0 on minor 1
[    2.488355] ACPI: Video Device [GFX0] (multi-head: yes  rom: no  post: no)
[    2.489672] input: Video Bus as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/LNXVIDEO:00/input/input19
[   17.423654] RAPL PMU: hw unit of domain pp1-gpu 2^-14 Joules
[14393.784296] amdgpu 0000:01:00.0: amdgpu: PCI CONFIG reset
[14393.784299] amdgpu 0000:01:00.0: amdgpu: GPU pci config reset

A little googling suggests that it might be due to “Hybrid Graphics”. Apparently the problem can be worked around by setting DRI_PRIME=1 before starting the game/application (PRIME - ArchWiki).

Never mind, I just saw in your screenshot that you’ve already done that. In that case I don’t know.

I used DRI_PRIME=1 in my screenshot on the right.

On the left I ran glmark2 (causing the iGPU to be used) and on the right I ran DRI_PRIME=1 glmark2 (causing the dGPU to be used).
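To double-check that PRIME offload really selects the dGPU, the renderer string can be queried directly (a sketch; glxinfo comes from the glx-utils package on Fedora, and the fallback messages are only there so the script degrades gracefully):

```shell
# Record which OpenGL renderer Mesa picks, without and with PRIME offload.
report=/tmp/prime-check.txt
: > "$report"
for prime in "" 1; do
  if command -v glxinfo >/dev/null 2>&1; then
    DRI_PRIME=$prime glxinfo -B 2>/dev/null | grep "OpenGL renderer" >> "$report" \
      || echo "could not query renderer (no display?)" >> "$report"
  else
    echo "glxinfo not installed (sudo dnf install glx-utils)" >> "$report"
  fi
done
cat "$report"
```

On a hybrid-graphics machine the first line should name the Intel iGPU and the second the AMD dGPU; if both show the same renderer, the offload isn’t happening at all.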

There appear to be a lot of tunables for the amdgpu driver:

$ modinfo amdgpu | grep "^parm:"

Researching those might be a place to start if you are interested in performance tuning the AMD driver.
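For reference, this is roughly how one of those parameters would be set persistently (a sketch: the parameter and value shown, dpm=1, are purely illustrative, not a known fix, and the file is written to /tmp so nothing on the system is touched):

```shell
# Stage a modprobe.d fragment that would set an amdgpu module parameter at boot.
conf=/tmp/amdgpu-tuning.conf
echo "options amdgpu dpm=1" > "$conf"
cat "$conf"
# To apply for real (as root), copy it into place and reboot:
#   cp /tmp/amdgpu-tuning.conf /etc/modprobe.d/amdgpu-tuning.conf
# The currently active value can be read back at runtime:
#   cat /sys/module/amdgpu/parameters/dpm
```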

I’ve noticed this too. (I’m still on Fedora 33)

$ glmark2
glmark2 Score: 2994

$ DRI_PRIME=1 glmark2
glmark2 Score: 1846

My laptop is a Dell Precision 7740 Mobile Workstation with AMD Radeon Pro WX 3200 W/4GB GDDR5.

@robin217 @tavk

Please see if it applies to your case.

I suppose it’s the same issue, but there is no solution in that thread.

modinfo amdgpu | grep "^parm:" returns the following for me:

modinfo amdgpu | grep "^parm:"
parm:           vramlimit:Restrict VRAM for testing, in megabytes (int)
parm:           vis_vramlimit:Restrict visible VRAM for testing, in megabytes (int)
parm:           gartsize:Size of GART to setup in megabytes (32, 64, etc., -1=auto) (uint)
parm:           gttsize:Size of the GTT domain in megabytes (-1 = auto) (int)
parm:           moverate:Maximum buffer migration rate in MB/s. (32, 64, etc., -1=auto, 0=1=disabled) (int)
parm:           benchmark:Run benchmark (int)
parm:           test:Run tests (int)
parm:           audio:Audio enable (-1 = auto, 0 = disable, 1 = enable) (int)
parm:           disp_priority:Display Priority (0 = auto, 1 = normal, 2 = high) (int)
parm:           hw_i2c:hw i2c engine enable (0 = disable) (int)
parm:           pcie_gen2:PCIE Gen2 mode (-1 = auto, 0 = disable, 1 = enable) (int)
parm:           msi:MSI support (1 = enable, 0 = disable, -1 = auto) (int)
parm:           lockup_timeout:GPU lockup timeout in ms (default: for bare metal 10000 for non-compute jobs and infinity timeout for compute jobs; for passthrough or sriov, 10000 for all jobs. 0: keep default value. negative: infinity timeout), format: for bare metal [Non-Compute] or [GFX,Compute,SDMA,Video]; for passthrough or sriov [all jobs] or [GFX,Compute,SDMA,Video]. (string)
parm:           dpm:DPM support (1 = enable, 0 = disable, -1 = auto) (int)
parm:           fw_load_type:firmware loading type (0 = direct, 1 = SMU, 2 = PSP, -1 = auto) (int)
parm:           aspm:ASPM support (1 = enable, 0 = disable, -1 = auto) (int)
parm:           runpm:PX runtime pm (2 = force enable with BAMACO, 1 = force enable with BACO, 0 = disable, -1 = PX only default) (int)
parm:           ip_block_mask:IP Block Mask (all blocks enabled (default)) (uint)
parm:           bapm:BAPM support (1 = enable, 0 = disable, -1 = auto) (int)
parm:           deep_color:Deep Color support (1 = enable, 0 = disable (default)) (int)
parm:           vm_size:VM address space size in gigabytes (default 64GB) (int)
parm:           vm_fragment_size:VM fragment size in bits (4, 5, etc. 4 = 64K (default), Max 9 = 2M) (int)
parm:           vm_block_size:VM page table size in bits (default depending on vm_size) (int)
parm:           vm_fault_stop:Stop on VM fault (0 = never (default), 1 = print first, 2 = always) (int)
parm:           vm_debug:Debug VM handling (0 = disabled (default), 1 = enabled) (int)
parm:           vm_update_mode:VM update using CPU (0 = never (default except for large BAR(LB)), 1 = Graphics only, 2 = Compute only (default for LB), 3 = Both (int)
parm:           exp_hw_support:experimental hw support (1 = enable, 0 = disable (default)) (int)
parm:           dc:Display Core driver (1 = enable, 0 = disable, -1 = auto (default)) (int)
parm:           sched_jobs:the max number of jobs supported in the sw queue (default 32) (int)
parm:           sched_hw_submission:the max number of HW submissions (default 2) (int)
parm:           ppfeaturemask:all power features enabled (default)) (hexint)
parm:           forcelongtraining:force memory long training (uint)
parm:           pcie_gen_cap:PCIE Gen Caps (0: autodetect (default)) (uint)
parm:           pcie_lane_cap:PCIE Lane Caps (0: autodetect (default)) (uint)
parm:           cg_mask:Clockgating flags mask (0 = disable clock gating) (uint)
parm:           pg_mask:Powergating flags mask (0 = disable power gating) (uint)
parm:           sdma_phase_quantum:SDMA context switch phase quantum (x 1K GPU clock cycles, 0 = no change (default 32)) (uint)
parm:           disable_cu:Disable CUs (,...) (charp)
parm:           virtual_display:Enable virtual display feature (the virtual_display will be set like xxxx:xx:xx.x,x;xxxx:xx:xx.x,x) (charp)
parm:           job_hang_limit:how much time allow a job hang and not drop it (default 0) (int)
parm:           lbpw:Load Balancing Per Watt (LBPW) support (1 = enable, 0 = disable, -1 = auto) (int)
parm:           compute_multipipe:Force compute queues to be spread across pipes (1 = enable, 0 = disable, -1 = auto) (int)
parm:           gpu_recovery:Enable GPU recovery mechanism, (1 = enable, 0 = disable, -1 = auto) (int)
parm:           emu_mode:Emulation mode, (1 = enable, 0 = disable) (int)
parm:           ras_enable:Enable RAS features on the GPU (0 = disable, 1 = enable, -1 = auto (default)) (int)
parm:           ras_mask:Mask of RAS features to enable (default 0xffffffff), only valid when ras_enable == 1 (uint)
parm:           si_support:SI support (1 = enabled, 0 = disabled (default)) (int)
parm:           cik_support:CIK support (1 = enabled, 0 = disabled (default)) (int)
parm:           smu_memory_pool_size:reserve gtt for smu debug usage, 0 = disable,0x1 = 256Mbyte, 0x2 = 512Mbyte, 0x4 = 1 Gbyte, 0x8 = 2GByte (uint)
parm:           async_gfx_ring:Asynchronous GFX rings that could be configured with either different priorities (HP3D ring and LP3D ring), or equal priorities (0 = disabled, 1 = enabled (default)) (int)
parm:           mcbp:Enable Mid-command buffer preemption (0 = disabled (default), 1 = enabled) (int)
parm:           discovery:Allow driver to discover hardware IPs from IP Discovery table at the top of VRAM (int)
parm:           mes:Enable Micro Engine Scheduler (0 = disabled (default), 1 = enabled) (int)
parm:           noretry:Disable retry faults (0 = retry enabled, 1 = retry disabled, -1 auto (default)) (int)
parm:           force_asic_type:A non negative value used to specify the asic type for all supported GPUs (int)
parm:           sched_policy:Scheduling policy (0 = HWS (Default), 1 = HWS without over-subscription, 2 = Non-HWS (Used for debugging only) (int)
parm:           hws_max_conc_proc:Max # processes HWS can execute concurrently when sched_policy=0 (0 = no concurrency, #VMIDs for KFD = Maximum(default)) (int)
parm:           cwsr_enable:CWSR enable (0 = Off, 1 = On (Default)) (int)
parm:           max_num_of_queues_per_device:Maximum number of supported queues per device (1 = Minimum, 4096 = default) (int)
parm:           send_sigterm:Send sigterm to HSA process on unhandled exception (0 = disable, 1 = enable) (int)
parm:           debug_largebar:Debug large-bar flag used to simulate large-bar capability on non-large bar machine (0 = disable, 1 = enable) (int)
parm:           ignore_crat:Ignore CRAT table during KFD initialization (0 = auto (default), 1 = ignore CRAT) (int)
parm:           halt_if_hws_hang:Halt if HWS hang is detected (0 = off (default), 1 = on) (int)
parm:           hws_gws_support:Assume MEC2 FW supports GWS barriers (false = rely on FW version check (Default), true = force supported) (bool)
parm:           queue_preemption_timeout_ms:queue preemption timeout in ms (1 = Minimum, 9000 = default) (int)
parm:           debug_evictions:enable eviction debug messages (false = default) (bool)
parm:           no_system_mem_limit:disable system memory limit (false = default) (bool)
parm:           dcfeaturemask:all stable DC features enabled (default)) (uint)
parm:           dcdebugmask:all debug options disabled (default)) (uint)
parm:           abmlevel:ABM level (0 = off (default), 1-4 = backlight reduction level)  (uint)
parm:           backlight:Backlight control (0 = pwm, 1 = aux, -1 auto (default)) (bint)
parm:           tmz:Enable TMZ feature (-1 = auto, 0 = off (default), 1 = on) (int)
parm:           reset_method:GPU reset method (-1 = auto (default), 0 = legacy, 1 = mode0, 2 = mode1, 3 = mode2, 4 = baco/bamaco) (int)
parm:           bad_page_threshold:Bad page threshold(-1 = auto(default value), 0 = disable bad page retirement) (int)
parm:           num_kcq:number of kernel compute queue user want to setup (8 if set to greater than 8 or less than 0, only affect gfx 8+) (int)

But I’m completely overwhelmed by that. Where do I even start?
Is it even guaranteed to be a driver config issue?
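One sanity check before diving into tunables: whether the dGPU is simply stuck in a low power state under load (a sketch; "card1" is an assumption, check /sys/class/drm/ for the right index on your machine):

```shell
# Dump the dGPU's current DPM forcing level and available shader clock states.
report=/tmp/dpm-check.txt
: > "$report"
for f in /sys/class/drm/card1/device/power_dpm_force_performance_level \
         /sys/class/drm/card1/device/pp_dpm_sclk; do
  if [ -r "$f" ]; then
    { echo "== $f =="; cat "$f"; } >> "$report"
  else
    echo "not present: $f" >> "$report"
  fi
done
cat "$report"
# Forcing the highest clocks as an experiment (as root, reverts to "auto" on reboot):
#   echo high > /sys/class/drm/card1/device/power_dpm_force_performance_level
```

If pp_dpm_sclk shows the lowest state marked active (`*`) while glmark2 is running, the problem is power management rather than the driver’s rendering path.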

Interestingly when I run a webgl benchmark in firefox, the AMD GPU actually performs a bit better than the iGPU.

I got a score of 566 for the AMD GPU and a score of 375 for the Intel iGPU.

But still, the difference doesn’t seem right. The dGPU should be quite a bit more powerful.
I wish I had a Windows install to compare against…

A synthetic benchmark might not reflect the real-life performance of a hardware setup.

It can also happen that overall stronger hardware scores lower in certain tests.

But in the first screenshot you can see that the iGPU outperformed the AMD GPU in 33 out of 33 categories.

According to the benchmarks listed on this site: Mobile Graphics Cards - Benchmark List - Tech
The AMD GPU in my system should completely stomp the iGPU into the ground.

I don’t know much about benchmarking, especially of Linux graphics stacks.

From the information you linked, I can see that some benchmarks don’t agree with glmark2.

I simply don’t know which one is more representative.

If possible, you could run more benchmarks on your setup and compare the results.

Comparing the results may be a little more complicated than just “which is faster”. The path they use to move their computed results around can be significant. The dedicated GPU may be competing with other devices in the system to get its data to/from the CPU on the PCI bus in ways that the integrated GPU is not. A simple analogy might be comparing a modern supercar with a Model T Ford. Measured in terms of the “miles per hour” that they are capable of traveling, the supercar might be quite a bit faster. But if the supercar is trying to get through downtown LA and the Model T is traveling on a straight, flat, empty highway in Nevada, the Model T just might “go faster”.
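One concrete way to probe the "bus path" part of this is to look at the PCIe link the dGPU actually negotiated (a sketch; 01:00.0 is the dGPU address from the dmesg above, and lspci may need root to show the link capability block):

```shell
# Compare the dGPU's maximum PCIe link (LnkCap) with what it is currently
# running at (LnkSta); a downtrained link throttles buffer transfers.
report=/tmp/pcie-link.txt
if command -v lspci >/dev/null 2>&1; then
  lspci -vv -s 01:00.0 2>/dev/null | grep -E "LnkCap:|LnkSta:" > "$report"
  [ -s "$report" ] || echo "link info not readable (try with sudo)" > "$report"
else
  echo "lspci not installed (pciutils package)" > "$report"
fi
cat "$report"
```

Note that some laptops deliberately drop the link to x1 or Gen1 speeds when the dGPU is idle, so this should be read while a benchmark is running.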

Here is a post someone made back in 2007 when ATI (a graphics card company) and AMD were talking about merging their products to create APUs.

Excerpted from What is the benefit of integrated graphics (a la Fusion):

Ok I’ve been reading up on this topic for a lot of time, so here are the pros and cons.


  1. GPUs gain early access to advanced manufacturing technologies; however, this is only true if AMD is going to fab the Fusions in-house, and for the foreseeable future it doesn’t look like they are.

  2. Even a low-end GPU, when integrated into the CPU, would significantly improve the GFLOPS count. If you are a gamer, you can use the GPU as a GPU until you get an AIB one, then use it as a ‘PPU’; if you are a supercomputer engineer, you can use these processors to improve your performance. Also add the possibility of playing a game while running F@H in the background.

  3. Lower power consumption, there is no need for information to travel on buses, this would significantly reduce power consumption.

  4. Cheaper cost of manufacture, although the die will be larger, it is still cheaper than two separate dies.

  5. Smaller computers

  6. Ray tracing will also become possible; imagine a ‘graphics-centric’ Fusion where you can dedicate more die area/transistors to the GPU, and have a small CPU next to that GPU.

  7. Better and cheaper gaming laptops, only if AMD will give us the ability to buy these ‘graphics-centric’ Fusions. I do agree that the gamer market is a very niche one, but keep in mind that some supercomputer engineers would want more FP performance with every processor they buy, so AMD would have more than one market to sell these Fusion processors.

  8. More CPU to GPU bandwidth.

  • The on-die GPU can run at the same clock speed as the CPU; even a low-end GPU clocked 4× higher than any other would be well able to compete. However, this isn’t 100% certain, although AMD said it might happen, so that is why I don’t really want to include it as a pro.


  1. Power consumption and heat given off by the GPU are a BIG headache when trying to make such a processor.

  2. Memory bandwidth. Most of us know that GDDR memory of all types is much faster than normal DDR memory. The GPU would suffer memory bottlenecks which might decrease its performance, however, this can be solved by using eDRAM.

The discrete GPU should perform at least 2x better than the APU; it’s really strange. Look at these results from my old Vega 64 on a 4K desktop:

    glmark2 2021.02
    OpenGL Information
    GL_VENDOR:     AMD
    GL_RENDERER:   Radeon RX Vega (VEGA10, DRM 3.40.0, 5.11.17-300.fc34.x86_64, LLVM 12.0.0)
    GL_VERSION:    4.6 (Compatibility Profile) Mesa 21.0.3
[build] use-vbo=false: FPS: 9912 FrameTime: 0.101 ms
[build] use-vbo=true: FPS: 14875 FrameTime: 0.067 ms
[texture] texture-filter=nearest: FPS: 13863 FrameTime: 0.072 ms
[texture] texture-filter=linear: FPS: 14892 FrameTime: 0.067 ms
[texture] texture-filter=mipmap: FPS: 14748 FrameTime: 0.068 ms
[shading] shading=gouraud: FPS: 12560 FrameTime: 0.080 ms
[shading] shading=blinn-phong-inf: FPS: 13739 FrameTime: 0.073 ms
[shading] shading=phong: FPS: 14522 FrameTime: 0.069 ms
[shading] shading=cel: FPS: 15493 FrameTime: 0.065 ms
[bump] bump-render=high-poly: FPS: 15136 FrameTime: 0.066 ms
[bump] bump-render=normals: FPS: 14676 FrameTime: 0.068 ms
[bump] bump-render=height: FPS: 14378 FrameTime: 0.070 ms
[effect2d] kernel=0,1,0;1,-4,1;0,1,0;: FPS: 14517 FrameTime: 0.069 ms
[effect2d] kernel=1,1,1,1,1;1,1,1,1,1;1,1,1,1,1;: FPS: 14035 FrameTime: 0.071 ms
[pulsar] light=false:quads=5:texture=false: FPS: 13725 FrameTime: 0.073 ms
[desktop] blur-radius=5:effect=blur:passes=1:separable=true:windows=4: FPS: 8767 FrameTime: 0.114 ms
[desktop] effect=shadow:windows=4: FPS: 8668 FrameTime: 0.115 ms
[buffer] columns=200:interleave=false:update-dispersion=0.9:update-fraction=0.5:update-method=map: FPS: 1203 FrameTime: 0.831 ms
[buffer] columns=200:interleave=false:update-dispersion=0.9:update-fraction=0.5:update-method=subdata: FPS: 1367 FrameTime: 0.732 ms
[buffer] columns=200:interleave=true:update-dispersion=0.9:update-fraction=0.5:update-method=map: FPS: 1378 FrameTime: 0.726 ms
[ideas] speed=duration: FPS: 4267 FrameTime: 0.234 ms
[jellyfish] <default>: FPS: 11820 FrameTime: 0.085 ms
[terrain] <default>: FPS: 2626 FrameTime: 0.381 ms
[shadow] <default>: FPS: 11613 FrameTime: 0.086 ms
[refract] <default>: FPS: 5864 FrameTime: 0.171 ms
[conditionals] fragment-steps=0:vertex-steps=0: FPS: 14304 FrameTime: 0.070 ms
[conditionals] fragment-steps=5:vertex-steps=0: FPS: 14685 FrameTime: 0.068 ms
[conditionals] fragment-steps=0:vertex-steps=5: FPS: 14605 FrameTime: 0.068 ms
[function] fragment-complexity=low:fragment-steps=5: FPS: 15519 FrameTime: 0.064 ms
[function] fragment-complexity=medium:fragment-steps=5: FPS: 14297 FrameTime: 0.070 ms
[loop] fragment-loop=false:fragment-steps=5:vertex-steps=5: FPS: 15010 FrameTime: 0.067 ms
[loop] fragment-steps=5:fragment-uniform=false:vertex-steps=5: FPS: 13859 FrameTime: 0.072 ms
[loop] fragment-steps=5:fragment-uniform=true:vertex-steps=5: FPS: 13892 FrameTime: 0.072 ms
                                  glmark2 Score: 11661 

I do a bit of video recording (previously in 720p) on Fedora with OBS on my Dell Inspiron 3180, which has the AMD A9 9420e APU. Since I upgraded to Fedora 34, I have been trying to quantify why performance has completely dropped off a cliff (so to speak) compared to Fedora 33. In the past I could record video and play video at the same time, with the output in 720p, and I had no stuttering or video lag or anything like that.

However, it’s now completely unusable with OBS, whether on X11 or Wayland, because it can’t even play video while recording, and I can no longer play video at 1080p (scaled) at all. So I don’t know the cause, but there is a huge performance degradation on this system between Fedora 33 and Fedora 34 with the AMD graphics.

Oh, and I have a desktop that in certain circumstances runs only on its iGPU, and it performs just fine with OBS on F34, so it’s definitely something to do with the AMD GPU and not the software itself.

Do you have some public 1080p videos links that I can try playing with my AMD A10-K7850K APU and see how fast it can play?

After your comment, I looked closer at the video I was testing with, and it turns out it plays at ‘1080p60’, so I looked around and found a few more channels that use the same resolution. It turns out any video playing at 30 fps or below doesn’t have this issue. As soon as I switch to either 1080p60 or 720p60 and try to use OBS at the same time, my system grinds to a halt, making it almost impossible to record any video, and at 1080p60 even watching becomes a choppy mess. So I guess the GPU itself just can’t keep up with encoding and decoding at 60 fps. I’m not sure if this is a codec issue or just that the hardware can’t do it. I’m fairly certain all the other videos I have recorded previously are between 25 and 30 fps at the source, because I haven’t really noticed many videos before indicating 1080p60 or 720p60.

The specific videos I’m talking about are from video game channels on YouTube. I was using them to test my OBS setup options because they had quite a few different resolutions to choose from, and that’s when I noticed the performance issue for the first time.

Today I found a video at 1080p 25 fps and tested my recording options on that, and found sane defaults that actually work pretty well for that resolution (although I have to use the FFmpeg VAAPI encoder rather than the x264 software option), and even better at 720p.
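Whether the VAAPI encoder can keep up depends on what the APU’s VA-API driver actually exposes; that can be checked outside OBS (a sketch; the file names in the commented encode test are placeholders):

```shell
# List the VA-API profiles/entrypoints the GPU driver advertises.
# VAEntrypointEncSlice entries indicate hardware encode support.
caps=/tmp/vaapi-caps.txt
if command -v vainfo >/dev/null 2>&1; then
  vainfo > "$caps" 2>/dev/null || echo "no VA-API device available" > "$caps"
else
  echo "vainfo not installed (sudo dnf install libva-utils)" > "$caps"
fi
cat "$caps"
# A minimal standalone hardware-encode test, along the lines of what OBS's
# FFmpeg VAAPI option does (input.mkv/output.mp4 are placeholders):
#   ffmpeg -vaapi_device /dev/dri/renderD128 -i input.mkv \
#          -vf 'format=nv12,hwupload' -c:v h264_vaapi output.mp4
```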


I tried using the Flathub version of Firefox with this Video

With 1080p60, I got about 90% dropped frames.
With 720p60, frames are dropped when resizing the video. After a while it stabilised, and dropped frames were almost zero at full screen for me.


My experience is much the same, although I think 720p60 gives me around 5% dropped frames at a rough guess (at around 1:50 I’m seeing 1912 of 4295 frames dropped), but 1080p60 is definitely unwatchable for me.

Here’s a specific example that is completely broken for me:

I noted that the other video I was using was encoded with the VP9 codec, whereas the one you linked above appears to use avc1, which seems to work a little better for me.


For your linked video, the Flathub Firefox hangs when playing at 1080p60.
With Fedora’s Firefox it plays nicely, with only about 1 dropped frame per 10 s on average.

I don’t have any extra codecs installed so far. So it seems to me that if the RPM Firefox plays a video fine, it should be used; only switch to the Flathub version when there are codec issues.

I was using the RPM Firefox, not the Flathub version. I had, however, enabled ‘gfx.webrender.all’ in about:config. It turns out that turning this off and running in basic mode (the default) is a lot more performant on my system when playing videos. I had turned it on because it helps with flickering when running Firefox natively on Wayland, but I’m currently running on X11 for my testing. That said, it still drops about 80% of frames at 1080p60 in basic mode, but 720p60 is now fine.
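For anyone wanting to check the same thing, whether gfx.webrender.all has been flipped can be seen from the shell without opening about:config (a sketch; the profile glob is the default Firefox location on Fedora, and the pref only appears in prefs.js once it has been changed from the default):

```shell
# Look for an explicit gfx.webrender.all setting in any local Firefox profile.
prefs=/tmp/webrender-pref.txt
grep -h 'gfx.webrender.all' ~/.mozilla/firefox/*/prefs.js 2>/dev/null > "$prefs" \
  || echo "gfx.webrender.all not set (Firefox default applies)" > "$prefs"
cat "$prefs"
```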
