DNS resolution broken

Hi,

For some time now I have had intermittent DNS issues with some sites. It always seems to affect the same domains and never happens to others.

One that is regularly affected is mirrors.fedoraproject.org. It is somehow related to systemd-resolved and I am not sure how to debug that.

Querying through systemd-resolved:

# dig @127.0.0.1 mirrors.fedoraproject.org

; <<>> DiG 9.16.11-RedHat-9.16.11-5.fc34 <<>> @127.0.0.1 mirrors.fedoraproject.org
; (1 server found)
;; global options: +cmd
;; connection timed out; no servers could be reached

Querying the upstream DNS directly:

# dig @192.168.1.2 mirrors.fedoraproject.org

; <<>> DiG 9.16.11-RedHat-9.16.11-5.fc34 <<>> mirrors.fedoraproject.org
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 37827
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 14, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;mirrors.fedoraproject.org.	IN	A

;; ANSWER SECTION:
mirrors.fedoraproject.org. 297	IN	CNAME	wildcard.fedoraproject.org.
wildcard.fedoraproject.org. 57	IN	A	185.141.165.254
wildcard.fedoraproject.org. 57	IN	A	18.133.140.134
wildcard.fedoraproject.org. 57	IN	A	209.132.190.2
wildcard.fedoraproject.org. 57	IN	A	18.159.254.57
wildcard.fedoraproject.org. 57	IN	A	38.145.60.20
wildcard.fedoraproject.org. 57	IN	A	38.145.60.21
wildcard.fedoraproject.org. 57	IN	A	67.219.144.68
wildcard.fedoraproject.org. 57	IN	A	18.185.136.17
wildcard.fedoraproject.org. 57	IN	A	140.211.169.206
wildcard.fedoraproject.org. 57	IN	A	152.19.134.142
wildcard.fedoraproject.org. 57	IN	A	85.236.55.6
wildcard.fedoraproject.org. 57	IN	A	152.19.134.198
wildcard.fedoraproject.org. 57	IN	A	8.43.85.67

;; Query time: 25 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Mi Jun 23 12:11:12 CEST 2021
;; MSG SIZE  rcvd: 285

config:

# resolvectl 
Global
       Protocols: LLMNR=resolve -mDNS -DNSOverTLS DNSSEC=no/unsupported
resolv.conf mode: stub

Link 2 (wlp2s0)
    Current Scopes: DNS LLMNR/IPv4 LLMNR/IPv6
         Protocols: +DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
Current DNS Server: 192.168.1.2
       DNS Servers: 192.168.1.2 2a02:908:1570:5b60:d63f:cbff:fe8d:4c20

Link 11 (enp62s0u1)
Current Scopes: none
     Protocols: -DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported

Workarounds:

  1. Enabling DNS over HTTPS in the Firefox settings, so that I can use a browser normally (otherwise I could not even access this page, as id.fedoraproject.org would not resolve for me most of the time either).
  2. Editing /etc/resolv.conf and changing the nameserver makes most software work, but interestingly not everything. E.g. curl seems to somehow still use systemd-resolved, as it keeps insisting that names are not resolvable. Ideas?

Any ideas how to debug this / dig deeper? I would like to understand what exactly the issue is, not just have workarounds :)

Thanks in advance!

Try to use a public DNS provider:

sudo nmcli connection show
sudo nmcli connection modify id CON_NAME \
    ipv4.ignore-auto-dns yes \
    ipv6.ignore-auto-dns yes \
    ipv4.dns 8.8.8.8,8.8.4.4
sudo nmcli connection up id CON_NAME

Enable DoT if the issue persists:

sudo mkdir -p /etc/systemd/resolved.conf.d
sudo tee /etc/systemd/resolved.conf.d/00-custom.conf << EOF
[Resolve]
DNSOverTLS=yes
EOF
sudo systemctl restart systemd-resolved.service

I believe this is normal behaviour.
dig queries a DNS server.
Your DNS server is 192.168.1.2, while 127.0.0.1 is the loopback address of your client.
If you point dig at the client itself, it can't get information that only a DNS server could provide, unless a DNS server is listening there.

In some distributions every client also runs a local server, most of the time dnsmasq, which usually listens on 127.0.0.1.

But that has changed with Fedora, which now uses systemd-resolved, as you said yourself.

And this one listens on 127.0.0.53.

However, there might also be a local DNS cache, like nscd, that caches DNS queries for a period of time.

As said, the local client now uses systemd-resolved.
It has its own config file in /etc/systemd/resolved.conf ...
So this local “DNS provider” listens on 127.0.0.53, not 127.0.0.1, and that is also the nameserver listed in /etc/resolv.conf, btw.

Long story short, if you want to query your “client's local DNS server”, query this:
dig @127.0.0.53 mirrors.fedoraproject.org

The reason behind this: the loopback interface is not a single IP, but a whole network, 127.0.0.0/8.
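
To illustrate the point about 127.0.0.0/8, here is a minimal sketch. The helper name is_loopback is made up for this example, and it only does a rough pattern check rather than full address validation:

```shell
# Hypothetical helper: succeed if the argument looks like an IPv4 address
# inside the loopback network 127.0.0.0/8.
is_loopback() {
    case "$1" in
        *[!0-9.]*) return 1 ;;   # anything other than digits and dots
        127.*.*.*) return 0 ;;   # first octet 127 -> loopback network
        *)         return 1 ;;
    esac
}

for addr in 127.0.0.1 127.0.0.53 192.168.1.2; do
    if is_loopback "$addr"; then
        echo "$addr is a loopback address"
    else
        echo "$addr is not a loopback address"
    fi
done

# To see which local addresses actually have something listening on port 53:
#   ss -tuln | grep ':53 '
```

Both 127.0.0.1 and 127.0.0.53 are loopback addresses, but a query only gets an answer from the one where a resolver is actually listening.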


Hi,

thanks for the pointer!

So I just had the issue again. And it is reproducible with dig:

$ dig mirrors.fedoraproject.org

; <<>> DiG 9.16.16-RH <<>> mirrors.fedoraproject.org
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 42685
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 65494
;; QUESTION SECTION:
;mirrors.fedoraproject.org.	IN	A

;; Query time: 180 msec
;; SERVER: 127.0.0.53#53(127.0.0.53)
;; WHEN: Do Jun 24 19:01:25 CEST 2021
;; MSG SIZE  rcvd: 54

The answer states:

;; SERVER: 127.0.0.53#53(127.0.0.53)

But asking systemd-resolved directly via @127.0.0.53 works:

$ dig @127.0.0.53 mirrors.fedoraproject.org

; <<>> DiG 9.16.16-RH <<>> @127.0.0.53 mirrors.fedoraproject.org
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 8918
;; flags: qr rd ra; QUERY: 1, ANSWER: 14, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 65494
;; QUESTION SECTION:
;mirrors.fedoraproject.org.	IN	A

;; ANSWER SECTION:
mirrors.fedoraproject.org. 300	IN	CNAME	wildcard.fedoraproject.org.
wildcard.fedoraproject.org. 21	IN	A	18.133.140.134
wildcard.fedoraproject.org. 21	IN	A	67.219.144.68
wildcard.fedoraproject.org. 21	IN	A	38.145.60.20
wildcard.fedoraproject.org. 21	IN	A	8.43.85.67
wildcard.fedoraproject.org. 21	IN	A	152.19.134.142
wildcard.fedoraproject.org. 21	IN	A	209.132.190.2
wildcard.fedoraproject.org. 21	IN	A	18.185.136.17
wildcard.fedoraproject.org. 21	IN	A	140.211.169.206
wildcard.fedoraproject.org. 21	IN	A	18.159.254.57
wildcard.fedoraproject.org. 21	IN	A	185.141.165.254
wildcard.fedoraproject.org. 21	IN	A	85.236.55.6
wildcard.fedoraproject.org. 21	IN	A	152.19.134.198
wildcard.fedoraproject.org. 21	IN	A	38.145.60.21

;; Query time: 71 msec
;; SERVER: 127.0.0.53#53(127.0.0.53)
;; WHEN: Do Jun 24 19:01:33 CEST 2021
;; MSG SIZE  rcvd: 285

The manual of dig states:

If no server argument is provided, dig consults /etc/resolv.conf

Since resolv.conf lists 127.0.0.53, how come the two commands give different results?

$ cat /etc/resolv.conf 
# This is /run/systemd/resolve/stub-resolv.conf managed by man:systemd-resolved(8).
# Do not edit.
#
# This file might be symlinked as /etc/resolv.conf. If you're looking at
# /etc/resolv.conf and seeing this text, you have followed the symlink.
#
# This is a dynamic resolv.conf file for connecting local clients to the
# internal DNS stub resolver of systemd-resolved. This file lists all
# configured search domains.
#
# Run "resolvectl status" to see details about the uplink DNS servers
# currently in use.
#
# Third party programs should typically not access this file directly, but only
# through the symlink at /etc/resolv.conf. To manage man:resolv.conf(5) in a
# different way, replace this symlink by a static file or a different symlink.
#
# See man:systemd-resolved.service(8) for details about the supported modes of
# operation for /etc/resolv.conf.

nameserver 127.0.0.53
options edns0 trust-ad
search local

Running dig mirrors.fedoraproject.org now works as well. I do not know whether this is down to the randomness of the issue or whether it was the result of running the command several times, so that it always works after several attempts, but from my previous experience that was not the case.

Post the updated diagnostics:

resolvectl --no-pager status; resolvectl query openwrt.org
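
Since the failure comes and goes, it may help to log one line per attempt instead of comparing single runs. A rough sketch follows. The function names dig_summary and watch_dns are made up here, and the parsing assumes dig output in the format shown earlier in this thread:

```shell
# Reduce one dig output (read from stdin) to "<status> <server>".
# Prints "NOANSWER -" when there is no header, e.g. after a timeout.
dig_summary() {
    awk '
        /status:/ {
            for (i = 1; i < NF; i++)
                if ($i == "status:") { s = $(i + 1); sub(/,$/, "", s) }
        }
        /^;; SERVER:/ { srv = $3 }
        END {
            if (s == "") s = "NOANSWER"
            if (srv == "") srv = "-"
            print s, srv
        }
    '
}

# Query a name every few seconds and print a timestamped summary line.
watch_dns() {
    name=$1
    interval=${2:-5}
    while :; do
        printf '%s %s\n' "$(date '+%F %T')" \
            "$(dig +time=2 +tries=1 "$name" | dig_summary)"
        sleep "$interval"
    done
}

# Example (runs until interrupted with Ctrl-C):
#   watch_dns mirrors.fedoraproject.org 5
```

Leaving this running in a terminal should show when the status flips between NOERROR and SERVFAIL, which can then be matched against the systemd-resolved journal.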

Okay, my previous post was not quite accurate: @127.0.0.53 seems to have worked only because of the randomness of the issue.

The issue just occurred again. Neither dig mirrors.fedoraproject.org nor dig @127.0.0.53 mirrors.fedoraproject.org worked. I tried both multiple times.

Some more diagnostics:

$ resolvectl query openwrt.org
openwrt.org: 139.59.209.225                    -- link: wlp2s0
             2a03:b0c0:3:d0::1af1:1            -- link: wlp2s0

-- Information acquired via protocol DNS in 113.9ms.
-- Data is authenticated: no; Data was acquired via local or encrypted transport: no
-- Data from: network
$ resolvectl query mirrors.fedoraproject.org
mirrors.fedoraproject.org: resolve call failed: Received invalid reply
$ resolvectl --no-pager status
Global
       Protocols: LLMNR=resolve -mDNS -DNSOverTLS DNSSEC=no/unsupported
resolv.conf mode: stub

Link 2 (wlp2s0)
    Current Scopes: DNS LLMNR/IPv4 LLMNR/IPv6
         Protocols: +DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
Current DNS Server: 192.168.1.2
       DNS Servers: 192.168.1.2 2a02:908:1570:5b60:d63f:cbff:fe8d:4c20
        DNS Domain: local

Link 3 (virbr0)
Current Scopes: none
     Protocols: -DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported

You have an IPv6 DNS server in your config.
Is this intentional?

This might be the reason, in case that address does not deliver any DNS data.
The resolver may alternate between the two addresses, so some queries would go to the one that does not answer.
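
One way to check this is to query each configured server directly. A minimal sketch, where list_dns_servers is a made-up helper and the parsing assumes the "DNS Servers:" line format from the resolvectl output above (all servers on one line):

```shell
# Hypothetical helper: read `resolvectl status` output on stdin and print
# every address from the "DNS Servers:" lines, one per line, without duplicates.
# The "Current DNS Server:" line is deliberately not matched.
list_dns_servers() {
    sed -n 's/^ *DNS Servers: *//p' | tr ' ' '\n' | sed '/^$/d' | sort -u
}

# Example: ask every configured server for the problematic name and show
# only the status line (or the timeout message).
#   resolvectl --no-pager status | list_dns_servers | while read -r server; do
#       printf '== %s\n' "$server"
#       dig +time=2 +tries=1 "@$server" mirrors.fedoraproject.org \
#           | grep -E 'status:|timed out'
#   done
```

If one of the listed servers times out or returns an error while the other answers, that would support the theory above.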

So it happened again, and I looked into the IPv6 resolver:

$ resolvectl query mirrors.fedoraproject.org
mirrors.fedoraproject.org: resolve call failed: Received invalid reply
$ dig @2a03:b0c0:3:d0::1af1:1 mirrors.fedoraproject.org

; <<>> DiG 9.16.18-RH <<>> @2a03:b0c0:3:d0::1af1:1 mirrors.fedoraproject.org
; (1 server found)
;; global options: +cmd
;; connection timed out; no servers could be reached

dig @192.168.1.2 worked.

The IPv6 address comes from my ISP's router and I cannot disable it there; it is part of some IPv6 autoconfiguration. So I overrode the DNS setting locally on my laptop:

$ resolvectl --no-pager status
Global
       Protocols: LLMNR=resolve -mDNS -DNSOverTLS DNSSEC=no/unsupported
resolv.conf mode: stub

Link 2 (wlp2s0)
    Current Scopes: DNS LLMNR/IPv4 LLMNR/IPv6
         Protocols: +DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
Current DNS Server: 192.168.1.2
       DNS Servers: 192.168.1.2

But I am having the issue again:

$ resolvectl query mirrors.fedoraproject.org
mirrors.fedoraproject.org: resolve call failed: Received invalid reply

Does the issue persist if you replace the local resolver with a public one?