MoinQ:

DNS/毒盛/2020/saddns.net/5

5.

5.1 Extending Window in a Forwarder Attack

We propose a novel strategy as follows:

The attacker first sends the forwarder a query for a domain under the attacker's control, e.g., www.attacker.com, which will eventually trigger the upstream resolver to query the attacker-controlled authoritative name server.

The name server is intentionally configured to be unresponsive so that the forwarder waits for the maximum possible amount of time (as the resolver is also stalled) while leaving a source port open.

At first glance, this is pointless because we are not interested in poisoning an attacker’s own domain. However, due to the unique role of DNS forwarders [34], they rely completely on upstream resolvers to perform validations on responses.

More specifically, according to RFC 8499 [34], recursive resolvers’ responsibility is to handle the complete resolution of a name and provide a “final answer” to its client.

This includes recursively handling referrals and CNAMEs and assembling a final answer that, by design, includes any CNAME redirects. More importantly, resolvers are required to perform integrity checks such as the bailiwick check [25], whereas forwarders are not. This means that forwarders by design trust their upstream resolvers and the responses they return.

This is not a security flaw; rather, it is a design choice to prevent forwarders from duplicating the work of resolvers. This observation is also made in a recent study dedicated to the security of DNS forwarders [60].

As a result, a rogue response (potentially injected by an attacker from either the LAN or outside) shown in Figure 5 will be accepted by a forwarder, and both the attacker’s and the victim’s domain records will be cached.

This strategy is extremely effective because we can impose the maximum wait time on the forwarder (i.e., creating the largest possible attack window).

Specifically, most forwarders have a very lenient timeout (sometimes close to a minute, e.g., in dnsmasq) and will stop mostly because the upstream resolver fails first (after 5 to 30 seconds), generating a SERVFAIL (or NXDOMAIN) response. To prevent resolvers from generating such messages too early, we also employ a technique that can sometimes keep a resolver engaged longer.

The trick is to have the attacker-owned authoritative name server respond at a slow pace with a chain of CNAME records, creating the illusion that the resolution is making progress. This can delay the resolver’s response for over a minute in some cases (e.g., CloudFlare).
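The timing argument above can be sketched as a small model. This is only an illustration under simplified assumptions: the function name, the per-hop delay, and the timeout values are hypothetical, and real resolvers apply their own retry logic and chain-length limits.

```python
def attack_window_seconds(forwarder_timeout, resolver_timeout,
                          cname_hops=0, delay_per_hop=0.0):
    """Estimate how long the forwarder keeps its source port open.

    Assumptions (illustrative only): every CNAME hop is answered after
    `delay_per_hop` seconds, the last name in the chain is never answered,
    and the resolver reports a failure `resolver_timeout` seconds after
    the last sign of progress. The forwarder stops waiting as soon as
    either its own timeout expires or the resolver reports the failure.
    """
    resolver_busy = cname_hops * delay_per_hop + resolver_timeout
    return min(forwarder_timeout, resolver_busy)


# Hypothetical numbers: a 60 s forwarder timeout and a 5 s resolver timeout.
no_chain = attack_window_seconds(60, 5)               # resolver fails first
short_chain = attack_window_seconds(60, 5, 10, 4.0)   # chain stretches the wait
long_chain = attack_window_seconds(60, 5, 20, 4.0)    # forwarder timeout is the cap
print(no_chain, short_chain, long_chain)
```

In this model the window grows linearly with the chain length until the forwarder's own timeout becomes the limiting factor.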

5.2 Extending Window in a Resolver Attack

We propose to take advantage of rate limiting, a security feature of authoritative name servers, as a way to mute name servers and extend the attack window in a resolver attack.

Modern DNS name server software, such as BIND, NSD, and PowerDNS, supports a common security feature called response rate limiting (RRL) [57, 59] as a mitigation of the DNS amplification attack [57], in which a large number of malicious DNS queries are issued to authoritative name servers while spoofing a victim’s IP address.

To limit the number of amplified DNS reply packets, the RRL feature allows a configurable per-IP, per-prefix, or even global limit on triggered responses. Specifically, if the limit is reached, responses are either truncated or dropped. There are also dedicated DNS firewalls with similar features [14].
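The per-prefix limit described above can be sketched as a fixed-window counter. This is a minimal illustration, not the accounting that BIND, NSD, or PowerDNS actually implements; the class name, the one-second window, and the /24 prefix length are assumptions made for the example.

```python
from collections import defaultdict
import ipaddress


class ResponseRateLimiter:
    """Allow at most `limit_per_second` responses per client prefix per second."""

    def __init__(self, limit_per_second, prefix_len=24):
        self.limit = limit_per_second
        self.prefix_len = prefix_len
        # (client prefix, whole second) -> responses already sent.
        # Old entries are never evicted, which is acceptable for a sketch.
        self.counts = defaultdict(int)

    def allow(self, client_ip, now):
        """Return True if a response to `client_ip` may be sent at time `now`."""
        prefix = ipaddress.ip_network(f"{client_ip}/{self.prefix_len}", strict=False)
        key = (prefix, int(now))
        if self.counts[key] >= self.limit:
            return False          # over the limit: drop (or truncate) the response
        self.counts[key] += 1
        return True


rrl = ResponseRateLimiter(limit_per_second=5)
burst = [rrl.allow("192.0.2.10", now=0.0) for _ in range(8)]
same_prefix = rrl.allow("192.0.2.99", now=0.5)     # shares the /24 budget
other_prefix = rrl.allow("198.51.100.1", now=0.5)  # has its own budget
next_second = rrl.allow("192.0.2.10", now=1.0)     # new window, budget is reset
print(burst, same_prefix, other_prefix, next_second)
```

Because the budget is keyed by source prefix rather than by who actually sent the query, every query that claims the same source address draws from the same budget.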

Ironically, this feature can be leveraged maliciously to mute a name server if an attacker can inject spoofed DNS queries (carrying the target resolver’s IP as the source address) at a rate higher than the configured limit.
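As a rough way to reason about the resulting loss rate, the sketch below assumes a fixed per-second response limit and a legitimate query that lands at a uniformly random position among the spoofed queries sharing its source address. Both function names and the example numbers are hypothetical; real RRL implementations behave differently in detail.

```python
def answer_probability(limit_per_second, spoofed_qps):
    """Chance that one legitimate query is answered within a one-second window.

    Simplified model: only the first `limit_per_second` queries in the window
    get a response, and the legitimate query is equally likely to be at any
    position among the spoofed ones.
    """
    total_queries = spoofed_qps + 1          # spoofed queries plus the real one
    return min(1.0, limit_per_second / total_queries)


def loss_rate(limit_per_second, spoofed_qps):
    """Complement of answer_probability under the same assumptions."""
    return 1.0 - answer_probability(limit_per_second, spoofed_qps)


# Hypothetical limit of 10 responses per second.
below_limit = loss_rate(10, 4)      # traffic stays under the limit
above_limit = loss_rate(10, 999)    # 1,000 queries compete for 10 responses
print(below_limit, above_limit)
```

Under these assumptions the loss rate depends only on the ratio between the limit and the total query rate, which is why a low configured limit is easy to exceed.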

Depending on the actual limit (some are configured to be very low), it may be trivial to create a sufficiently high “loss rate” so that the resolver’s legitimate query has an extremely low probability of getting a response.

In addition, we also inspect the remaining cases where the loss rate increased from the 1kpps test to the 4kpps one. There are roughly 5,000 cases where the difference is 2% or higher. We believe that the loss rate of the majority of them can be increased further given a higher probe rate, and that they are therefore potentially vulnerable as well. Therefore, we have a total of 18,110 (13,110 plus 5,000) cases out of the 100K (18%) which we consider vulnerable.

Finally, among the 75% of cases where both the 1kpps and 4kpps tests experienced no loss, we believe there may be many more vulnerable cases which we simply cannot uncover due to the relatively low probing speed.

Due to ethical concerns, however, we refrain from probing at an even higher speed. To peek into those cases, we obtained permission from a collaborator to test an authoritative name server configured for a non-profit website.
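The totals in this paragraph can be checked with a few lines of arithmetic. The variable names are illustrative; the figures are the ones quoted in the text, with "100K" taken as exactly 100,000 and the 5,000 figure treated as exact even though the text gives it as approximate.

```python
sampled = 100_000        # name servers measured ("100K" in the text)
first_group = 13_110     # cases already counted as vulnerable
rising_loss = 5_000      # roughly: loss grew by 2% or more from 1kpps to 4kpps

vulnerable = first_group + rising_loss
share = vulnerable / sampled
print(vulnerable, f"{share:.1%}")   # prints: 18110 18.1%
```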

We are able to probe the server at a much higher rate (late at night to avoid disruption). Initially, when probed at a rate of 4kpps, no loss is observed. Interestingly, the server starts to experience loss when the probing rate is increased to 25kpps. Specifically, when the rate is increased to 50kpps, the loss rate jumps to 75%. We checked with our collaborator whether the server is indeed configured with such a high rate limit. To our surprise, there is no rate limit configured at all. To understand this behavior, we set up a BIND server locally (replicating the configuration) and verified that it is indeed fairly easy to trigger a high loss rate with comparable probing speeds.

We find that this is because the application (i.e., BIND) does not read from the socket queue fast enough, which causes the queue to overflow. Indeed, historical DoS attacks similar to this, e.g., flooding queries with random names, have been observed in practice [44].

To mitigate such threats, the official BIND guideline explicitly recommends rate limiting [52], which would paradoxically make the server vulnerable to our attack instead.
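The overflow behavior can be approximated with a simple fluid model: once packets arrive faster than the application drains the socket queue, the excess is dropped. The sketch below is an assumption-laden illustration, not a fit to the measurement. In particular, the 12,500 pps service rate is a hypothetical value picked so that the model yields 75% loss at 50kpps; it is not a measured figure.

```python
def fluid_loss_rate(arrival_pps, service_pps):
    """Long-run fraction of packets dropped at a saturated receive queue.

    Fluid approximation: a finite queue only absorbs short bursts, so in
    steady state everything above the service rate is lost.
    """
    if arrival_pps <= service_pps:
        return 0.0
    return 1.0 - service_pps / arrival_pps


SERVICE_PPS = 12_500   # hypothetical drain rate of the application
for rate in (4_000, 25_000, 50_000):
    print(rate, fluid_loss_rate(rate, SERVICE_PPS))
```

The model reproduces the qualitative pattern (no loss at a low rate, heavy loss at a high rate) without any rate limit being configured, but a real server's loss curve also depends on buffer sizes, burstiness, and per-query cost.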

In addition, we can leverage this technique to extend the attack window against a forwarder, since RRL is also deployed on resolvers to limit the rate of incoming queries. Following the same procedure and ethical standard as in the previous measurements, and probing at a rate of 4kpps against the resolver IPs obtained on 14, 2019 from Censys [23], we observe that, surprisingly, 121,195 out of 136,547 (about 89%) exhibit a loss rate of more than 66.7%, indicating that it is generally possible to mute resolvers on the Internet.

6 PRACTICAL ATTACK CONSIDERATIONS

Bypassing the TTL of cached records. If an attacker attempts to poison a benign domain such as www.victim.com by directly triggering DNS queries for www.victim.com on a resolver, the resolver may cache the unwanted legitimate A record, for example, because of occasional failures to mute its upstream servers. This forces the attacker to wait for the cached record to expire before initiating the next attack attempt.


MoinQ: DNS/毒盛/2020/saddns.net/5 (last edited 2020-11-17 00:00:52 by ToshinoriMaeno)