Hacker News — vinext + Cloudflare Workers

▲The iPad was on Tailscale: a WebRTC debugging story (p2claw.com)

70 points by syllogistic 2 days ago | 34 comments

inigyou 2 days ago [-]

I don't understand how a product as popular as Tailscale can get this far while dropping certain ordinary types of packets.

It is impossible to parse the UDP or TCP port number out of a fragment. This is surely the reason the ACL module entirely rejects them. TCP will adjust its segment size based on PMTUD so as to not require fragmentation. This is why it hasn't been noticed so far. But fragmented UDP packets are a corner case of normal behavior and it boggles the mind that someone could just decide to completely drop them.

UDP fragment filtering could be implemented by a global fragments on/off setting (works for "allow everything" = fragments on, cautious = fragments off) or by blocking the first fragment which includes the port number (and blocking it if the port number is split across fragments which I think is technically allowed but completely abnormal).
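To make the first-fragment idea above concrete, here is a minimal Python sketch. It is not Tailscale's filter code. It assumes the Fragment header sits directly after the fixed IPv6 header (no other extension headers), and the names `classify_ipv6_udp` and `build_ipv6` are hypothetical. Only a fragment with offset 0 carries the UDP header, so only that one can be matched on port; what happens to later fragments is left to the caller as a policy decision.

```python
import struct

IPV6_HEADER_LEN = 40        # fixed IPv6 header
FRAGMENT_HEADER_LEN = 8     # IPv6 Fragment extension header
NEXT_HEADER_FRAGMENT = 44   # protocol number of the Fragment header
NEXT_HEADER_UDP = 17        # protocol number of UDP


def classify_ipv6_udp(packet: bytes, allowed_dst_ports: set) -> str:
    """Return 'allow', 'deny', or 'later-fragment' for a raw IPv6 packet."""
    if len(packet) < IPV6_HEADER_LEN:
        return "deny"
    next_header = packet[6]
    offset = IPV6_HEADER_LEN
    if next_header == NEXT_HEADER_FRAGMENT:
        if len(packet) < offset + FRAGMENT_HEADER_LEN:
            return "deny"
        next_header, _reserved, frag_field, _ident = struct.unpack_from(
            "!BBHI", packet, offset
        )
        offset += FRAGMENT_HEADER_LEN
        # The top 13 bits are the fragment offset; non-zero means no UDP header here.
        if frag_field >> 3 != 0:
            return "later-fragment"
    if next_header != NEXT_HEADER_UDP:
        return "deny"  # this sketch only classifies UDP
    if len(packet) < offset + 4:
        return "deny"  # ports missing or split across fragments
    _src_port, dst_port = struct.unpack_from("!HH", packet, offset)
    return "allow" if dst_port in allowed_dst_ports else "deny"


def build_ipv6(next_header: int, payload: bytes) -> bytes:
    """Hypothetical demo helper: IPv6 header with zeroed addresses."""
    fixed = struct.pack("!IHBB", 0x60000000, len(payload), next_header, 64)
    return fixed + bytes(32) + payload
```

A caller could map `later-fragment` to the global on/off setting described above, or to per-flow state keyed on the fragment identification field.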

happyopossum 2 days ago [-]

> I don't understand how a product as popular as Tailscale can get this far while dropping certain ordinary types of packets.

I’d venture to guess based on this outcome that fragmented UDP over IPv6 isn’t really an ordinary occurrence. Given the preponderance of HTTPS traffic, the aversion to fragmentation in IPv6, and the weird corner case of there being a hardcoded packet size in webrtc, it’s reasonable to assume that this is a corner case.

A good one to be aware of, but not common.

syllogistic 2 days ago [-]

Would agree it's uncommon in general traffic. Rare conditions [webrtc-rs, 1280 class tunnel / tailscale, and ipv6 pair] but deadly when they are met since every connection silently fails. That's what made it worth chasing down for 2 weeks [and good for sleuthing :)].

inigyou 2 days ago [-]

It's a corner case of ordinary traffic, since all TCP apps and most UDP apps adapt to PMTU, but fragmentation is there for those that don't. It's not like something you can only get by generating malicious traffic intentionally.

Welcome to networking mistakes, I guess. I can't remember the specifics but I once encountered a router that would drop traffic that looked like encapsulated TCP at a certain offset, or something like that. They couldn't fix it because the behavior was hardwired. I knew of it because I worked with the firmware team.

Factorio discovered that UDP packets with a checksum of 0x0000 get dropped by some devices.

toast0 1 day ago [-]

Dropping fragments is a pretty normal thing to do in a lot of places. If you have a stateful firewall, you can't tell if a fragment is viable until you reassemble it, and reassembly is unreasonably expensive, so dropping fragments it is.

Personally, I prefer to go ahead and reassemble, but with a very minimal reassembly buffer.

Very few packets get fragmented, so if you have more than 16 fragments in your reassembly buffer, you're probably being ddosed and you can toss them. OTOH, if you have a 16 deep reassembly buffer, you're probably more generous than most services that have no buffer for reassembly.
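A rough Python sketch of the "very minimal reassembly buffer" idea, under stated assumptions: the class name `TinyReassemblyBuffer` is hypothetical, offsets are plain byte offsets (real IP headers count in 8-byte units), there are no timeouts, and "toss them" is read as clearing the whole buffer once the cap is reached. It is one possible reading of the comment, not any particular firewall's behavior.

```python
from typing import Dict, Hashable, Optional, Tuple


class TinyReassemblyBuffer:
    """Reassemble fragments, but never hold more than max_fragments at once."""

    def __init__(self, max_fragments: int = 16) -> None:
        self.max_fragments = max_fragments
        # key (e.g. src, dst, identification) -> {offset: (data, more_fragments)}
        self._pending: Dict[Hashable, Dict[int, Tuple[bytes, bool]]] = {}
        self._count = 0

    def add(self, key: Hashable, offset: int, more_fragments: bool,
            data: bytes) -> Optional[bytes]:
        """Store one fragment; return the full payload once it is complete."""
        if self._count >= self.max_fragments:
            # Buffer is full: assume abuse and drop everything held so far.
            self._pending.clear()
            self._count = 0
        fragments = self._pending.setdefault(key, {})
        if offset not in fragments:
            self._count += 1
        fragments[offset] = (data, more_fragments)
        return self._try_complete(key)

    def _try_complete(self, key: Hashable) -> Optional[bytes]:
        fragments = self._pending[key]
        payload = bytearray()
        expected = 0
        for offset in sorted(fragments):
            data, more = fragments[offset]
            if offset != expected:
                return None  # gap or overlap: keep waiting
            payload += data
            expected += len(data)
            if not more:
                # Reached the final fragment with no gaps before it.
                self._count -= len(fragments)
                del self._pending[key]
                return bytes(payload)
        return None
```

For example, adding `("flow", 0, True, b"hello ")` and then `("flow", 6, False, b"world")` returns `b"hello world"` on the second call, in either arrival order.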

It's not what the RFCs say to do, but the IPv6 RFCs are like 30 years old, and the IPv4 RFCs even older. They were written in a different time for an internet that was less adversarial; some things don't make sense to keep doing.

syllogistic 2 days ago [-]

Author here,

Agreed. The port-number point is the most plausible rationale I've heard, more convincing than the RFC line in their source comment. The historical fix for "can't classify fragments" was virtual reassembly or flow tracking [conntrack on linux, scrub in pf], so dropping them outright punts past known prior approaches. Even your lighter idea would have saved us: a first-fragment match would have let our pair through.

We've reported upstream to both projects, tailscale/tailscale#20083 and webrtc-rs/webrtc#806, and webrtc-rs already invited a PR.

inigyou 2 days ago [-]

You are shadowbanned.

Sean-Der 2 days ago [-]

Amazing debugging, I loved reading that. HN doesn't get enough good posts like this anymore :)

If https://github.com/pion/sctp/issues/12 had happened (not just in Pion but across all implementations) this could have been fixed years ago. The hardcoding we all settle for is tragic.

syllogistic 2 days ago [-]

Author here, thank you, that means a lot coming from you. Pion was the prior art I pointed the webrtc-rs maintainers at. And pion/sctp#12 is super relevant. A known, proposed fix years before we hit it.

"The hardcoding we all settle for" might be the epigraph for the whole incident. webrtc-rs invited a PR for the configurable-MTU + better default half [webrtc-rs/webrtc#806] to unblock folks today. Whether PMTUD gets implemented will be interesting to see.

sulam 2 days ago [-]

The debugging was interesting. I'm just going to have to learn to live with this I feel like, but the very LLM-ish language in the blog post was kind of annoying.

katericksonnow 2 days ago [-]

MTU black holes are the worst because every health check is small enough to survive.

toast0 1 day ago [-]

I'm pretty sure Google's WebRTC doesn't respond to icmp needs frag (but it does have a better default size), and I'm guessing webrtc-rs doesn't either. Seeing the icmp in tcpdump might have raised some alarms.

WebRTC also uses small packets for ICE pings, so if you have a path mtu problem, it won't affect connection selection, so that's also fun.
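Since small probes pass while large payloads vanish, a health check has to test at realistic sizes to notice. A minimal Python sketch of that: `probe` is a hypothetical callable that sends one packet of the given size and returns True if it was answered, and the search assumes that anything smaller than a passing size also passes.

```python
from typing import Callable, Optional


def largest_passing_size(probe: Callable[[int], bool],
                         low: int, high: int) -> Optional[int]:
    """Binary-search the largest size in [low, high] for which probe() succeeds.

    Returns None if even the smallest size fails.
    """
    if not probe(low):
        return None
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always makes progress
        if probe(mid):
            low = mid
        else:
            high = mid - 1
    return low
```

With a simulated path such as `lambda size: size <= 1232`, `largest_passing_size(probe, 64, 1500)` settles on 1232, which a check that only ever sends 64 bytes would never reveal.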

hylaride 2 days ago [-]

I'm having flashbacks to 1990s-era PPPoE, where the slightly smaller MTU had issues with some server OS's that had TCP/IP stacks that didn't support or ignored MTUs smaller than 1500 bytes and bulk data transfers would get messed up. I don't remember which ones, but it was some commercial UNIX.

toast0 1 day ago [-]

Windows clients don't request the MTU from dhcp either, so that's also fun.

After hitting broken pmtud enough, I resolved to make a browser based test, and eventually I did. http://pmtud.enslaves.us/

But that wouldn't help this investigation, since there's no attempt to find the path mtu in webrtc-rs (or general webrtc)

wmf 2 days ago [-]

Weren't T1s running 576 MTU back then?

hylaride 2 days ago [-]

IIRC, T-lines had no inherent MTU. The MTU is determined by whatever layer-2 encapsulation ran over the T1:

* PPP → default MTU 1500

* Cisco HDLC → 1500

* Frame Relay → typically 1500+ (often configurable higher)

So a typical IP MTU on a T1 link was 1500 bytes, same as Ethernet — chosen partly so packets could traverse Ethernet ↔ T1 boundaries without fragmenting.

576 is from RFC 791 (IPv4):

* Every IP host must be able to reassemble a datagram of at least 576 bytes.

* 576 became the conventional default MTU for "non-local" destinations — i.e., when a host didn't know the path MTU and wanted a value virtually guaranteed not to be fragmented anywhere. (576 − 20 IP − 20 TCP = 536 bytes of payload, the classic TCP default MSS.)
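The MSS arithmetic in the parenthesis generalizes to other MTUs. A small Python sketch, assuming minimum header sizes with no IP or TCP options (`tcp_mss` is a hypothetical helper name):

```python
IPV4_HEADER = 20  # bytes, without options
IPV6_HEADER = 40  # bytes, fixed header only
TCP_HEADER = 20   # bytes, without options


def tcp_mss(mtu: int, ip_header: int = IPV4_HEADER,
            tcp_header: int = TCP_HEADER) -> int:
    """Largest TCP payload that fits in one packet of the given MTU."""
    return mtu - ip_header - tcp_header


classic_default = tcp_mss(576)             # the 536 quoted above
ethernet = tcp_mss(1500)                   # typical Ethernet or T1 link
ipv6_minimum = tcp_mss(1280, IPV6_HEADER)  # IPv6 minimum link MTU
```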

You'd also see 576 as a common default on dial-up/PPP links and X.25, which is probably the source of the association — not T1.

But re-assemble didn't necessarily mean transmit. My understanding (and this is quoting from memory from over 20 years ago) is that some commercial UNIXs from eons ago didn't ever really test dialup or other such settings as they were often in more commercial settings and other protocols were often used before everything converged on IP. I'm sure these were also unpatched machines.

It was just annoying enough where some random connections didn't work very well.

bastard_op 1 day ago [-]

Developers should spend more time being network engineers before writing network code. I saw this title and my first thought was, "What, mtu?"

If you've ever had to support vpn's in an enterprise securing businesses with ipsec or sslvpn with tunnel overhead, you've run into mtu issues. Some apps/protocols or firewalls misbehave, devs/engineers didn't read the memo from 20 years ago in rfc form how ipv6 mtu's work (and missed v4 to boot, lucking out with 20 more years of someone else fixing it).

Neither Tailscale nor Cisco nor anything in between is immune to mtu issues in vpn or networking.

talkingtab 2 days ago [-]

Hunting good bugs is something every good software developer should experience. A good interview question is "tell me your favorite bug". Bugs are about reasoning, not intelligence. And I will take someone who can tell me what is wrong over what is correct any day. It requires a focus on getting things to actually work.

I have two favorite bug stories. The first is from a printout from the run of an IBM 360 assembly language program when I was just learning. Someone asked me why their program failed to run. I glanced quickly at the front page of the printout and it said "Too Long". So I told the person that was the problem. Something was too long. He looked at me very strangely, so I looked back at the page a little more closely, only to notice "Too Long" was in the name field of the person running the program. He was Vietnamese and his name was Too Long - literally. There is a powerful lesson (at least one) there.

The other happened when I was implementing some AppleTalk protocols - NBP to be exact. (Don't ask). I would capture the working packets then compare all the checksums, headers, constants, length fields in the packet my code generated and fix any problems. I was stuck on one failure. I just could not see any difference as I went through byte by byte, time after time. It was late and time to go home so I decided to print off each packet on paper and compare them later - certain I was missing something. The problem was instantly obvious. One printout took a page, the other two pages. I had been appending junk data in the packet. Sigh

win311fwg 2 days ago [-]

> A good interview question is "tell me your favorite bug".

I wouldn't have a clue how to recall any details about the bugs I've seen. I don't put much emphasis on past events. Looking forward is what I find to be a far more valuable use of my mental energy. I have vague recollections of debugging some doozies, but that is where the recall ends. It is clearly something you are passionate about, which no doubt keeps it something front of mind for you, but for many of us it is just part of the job; like asking someone at McDonald's how their favourite burger flip landed.

You could say that I'm not the one for the job, which is a fair take, but if we reason through this some more, would we not conclude that there is no such thing as a good canned interview question? Given that no two people are the same, good interview questions can only be established in the context of who is being interviewed.

toast0 15 hours ago [-]

I mean, you do you, but I'm not sure how to take this. Maybe I'm a master debugger, but a lot of the problems I run into and debug tend to reoccur, sometimes in different circumstances, sometimes in the same circumstances (which can be amazingly frustrating, but...).

Remembering and being able to tell the narrative about how I figured out why something that people like to do is a really bad idea is very helpful to convince people not to repeat the mistakes of the past when they aren't receptive to "trust me, this is a bad idea and we shouldn't do it" or "if you do that, let me know when you undo it, otherwise don't call me"

Personally, I don't have any skill at giving this kind of story time interview question, so I don't. But it does seem concerning to me if someone has 5-10 years of software experience and can't articulate any debugging stories. How were you working where you never ran into a problem that took you/your team 2 weeks of pain to figure out?

win311fwg 7 hours ago [-]

> but a lot of the problems I run into and debug tend to reoccur

Whereas with directing my attention to looking forward to the future I don't see the same issues turn up ever again. I learn from my mistakes. Now that I am multiple decades into my career, I see almost no bugs turn up at all. That doesn't mean I will never encounter a bug again. It is inevitable that I will. But they will be novel when I do. Novel still doesn't mean particularly interesting, however. The burger flip landing a new way never seen before still isn't likely to register.

> it does seem concerning to me if someone has 5-10 years of software experience

Is that where you are at in your software career? I can attest that I saw a lot more bugs when I was in that stage. One is still quite green and learning a tremendous amount even 5-10 years in. However, that was a long time ago. I've forgotten the details by now. This tune might change for you as well as you progress further.

Or maybe not. Everyone is different. You do seem passionate about bugs. I, on the other hand, hate them, so I lean heavily into processes to avoid them to the greatest extent possible.

toast0 6 hours ago [-]

I'm 22 years into real professional software development (I've semi-retired, but the world still needs debugging). Plus a few years of junior level IT/sysadmin stuff. Of course, my code is perfect by now. But my code runs on other people's code. And nobody else writes perfect code.

So I have to debug other people's libraries and operating systems. And other people's networks. Turns out other people often make similar mistakes. Some people say 'select isn't broken', but lots of things are[1]. Most of my debugging stories would tend to be centered around a problem that my team found/uncovered, not one that we created... although certainly I did make some bugs in my youth (definitely none lately!).

I put 5-10 years there because someone under 5 years of experience could maybe not have ever run into a troublesome issue, or they always had a senior to do the hard stuff. Between 5 and 10 years, maybe they find their first tricky bug. After 10+ years, you've got to have run into something.

[1] Here's some war stories:

I fixed an interop issue between OpenSSL and Microsoft schannel where rsa dhe would fail if the generated public key had leading zeros; OpenSSL would encode it in fewer bytes and schannel would return 'out of memory'. The RFC was vague. People had observed the failures for years, but I had to fix it. At the time, it was considered a reasonable optimization to generate a dhe keypair and reuse it for the lifetime of the server process... If we generated a problematic keypair on a given server, windows clients couldn't connect at all. Now, if I run into an issue and a working trace has structures of nice power of 2 lengths and a broken trace has one a little smaller, that's where I dig.

I found (but didn't make a patch) a bug in Firefox where POSTs to an http/2 server with tls 1.3 early data enabled would stall for about a minute when there was no connection to reuse. Fixing it was out of my league, but I was able to get it fixed by giving a clear bug report. This one was fairly new when I saw it, but there was a much less clear bug open against Thunderbird caused by the underlying issue. Not sure what I learned here really other than if you're expecting data sent to the network and it doesn't happen, it's usually an application problem... and clear bugs with clear logs help get things fixed.

I fixed an issue with FreeBSD where it would send the whole sendq when it received an icmp needs frag message, even when the maximum mtu sent in the icmp was the same as or greater than the current path mtu. This was happening when a Linux router was using large receive offload to aggregate inbound packets on a flow and then they were too large to forward; that bug was fixed long before I experienced it, but the router in question never got updated. I could not get ahold of the operator for them to fix the broken machine, but I was able to get a patch into FreeBSD so that the broken router(s) only impacted our customers that were behind it. ... this is another indication that PathMTU is hard, but also it helped me tune methods of sampling packets from production. PS, pathmtu issues are their own repetitive problem space.

That one time FreeBSD broke syncookies, so connections got reformed after close, and the tcp state was unsynchronizable between peers so they kept sending challenge acks... and IIRC, they broke it a second time, too. But maybe it was just we ran into it in a different context.

I've recently found some issues leading to out of order packet delivery with FreeBSD's dummynet traffic shaping; again, other people already experienced it, but nobody wrote a good bug report or submitted a patch, so I guess I'll have to do it, if it's still broken when I have time for it. This one is probably not going to be a repeating bug... not a lot of traffic shapers, but maybe there will be something learned about scheduling i/o

What processes could I use to avoid bugs like these? Hoping things magically get fixed in an update does sometimes work, and sometimes the bug becomes less relevant as the industry moves on (ecdhe has almost completely replaced rsadhe, but it hadn't at the time that my customers ran into the bug).

win311fwg 5 hours ago [-]

> So I have to debug other people's libraries

Been there, but then I soon learned to reduce dependencies to essentially none. Those that do make the cut need to be of high quality such that the authors of those libraries are also as perfect as you are. There is absolutely no need to depend on code written by those with <=10 years under their belt. The world is full of developers with 20+ years of experience.

> After 10+ years, you've got to have run into something.

Sure. And after 10+ years of flipping burgers, there will have been some pretty sweet lands. Who is going to remember, though? It is fine if you do. Everyone has their thing. But I'd say it is not exactly among the most memorable of events. It is not like time spent with your child, or something that actually has some kind of meaning. You even say you are semi-retired, so you must agree that things on the job don't really matter. If it did, why not dedicate every possible moment to it?

syllogistic 2 days ago [-]

PR for the webrtc-rs half is up!

https://github.com/webrtc-rs/webrtc/pull/807

OptionOfT 2 days ago [-]

One day I hope to work on problems like this. Fantastic article.

Veserv 2 days ago [-]

Ah yes, the horrible anti-feature of IP fragmentation strikes again.

Pair it with the anti-solution of dropping large packets instead of truncating them and we get our perfect storm of bad design that is MTU incompatibility and modern MTU discovery.

cyberax 2 days ago [-]

Another fun happy iOS story: we were launching our app a year ago, with a self-imposed deadline. As usual, tons of bugs were being fixed in the last moment.

And then our authentication stopped working on simulated iOS devices (while still working on the real devices!). After hours of frantic debugging and staring at Wireshark dumps, I found the issue: HTTP3 and QUIC. Apparently, the simulated stack was not tracking the MTU correctly and was trying to send 1506-byte UDP packets.

The "fix" was to add deny rules for UDP ports 80/443 to our firewall.

syllogistic 2 days ago [-]

Author here.

This started as a blank page on one device and ended two weeks later at the intersection of two bugs: webrtc-rs hardcodes INITIAL_MTU=1228 [never updated, no path probing, retransmits at the same size forever], and Tailscale's packet filter classifies any IPv6 packet with a Fragment header as unknown protocol, so the default deny fires. On every platform, counted under reason="acl". Neither is unreasonable alone. Together: silent wedge, every health check green, because everything that tests the path is small and only the payload fragments. Two-command repro on any tailnet: ping -s 100 works, ping -s 1400 over the Tailscale IPv6 address is 100% loss. Full WebRTC repro and captures: https://github.com/phact/mtu-webrtc-bug. We've reported upstream to both projects https://github.com/tailscale/tailscale/issues/20083 and https://github.com/webrtc-rs/webrtc/issues/806. Happy to answer questions. Especially interested if anyone knows the history behind the IPv6 fragment decision in Tailscale's filter.
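The size arithmetic behind the repro can be sketched in Python. The IPv6, UDP, and ICMPv6 echo header sizes are standard. The DTLS overhead of 37 bytes is an assumption (13-byte DTLS 1.2 record header plus 8-byte explicit nonce and 16-byte tag for an AES-GCM suite), as is the reading that INITIAL_MTU bounds the SCTP packet before DTLS, UDP, and IPv6 headers are added. `on_wire_size` and `needs_fragmentation` are hypothetical helper names.

```python
IPV6_HEADER = 40            # fixed IPv6 header
UDP_HEADER = 8
ICMPV6_ECHO_HEADER = 8
ASSUMED_DTLS_OVERHEAD = 37  # assumption: DTLS 1.2 with AES-GCM; varies by suite
TUNNEL_MTU = 1280           # the "1280 class" tunnel MTU from the thread


def on_wire_size(payload: int, *header_sizes: int) -> int:
    """Total IP packet size for a payload wrapped in the given headers."""
    return payload + sum(header_sizes)


def needs_fragmentation(packet_size: int, mtu: int = TUNNEL_MTU) -> bool:
    """True if the packet cannot cross a link with this MTU in one piece."""
    return packet_size > mtu


small_ping = on_wire_size(100, ICMPV6_ECHO_HEADER, IPV6_HEADER)    # ping -s 100
large_ping = on_wire_size(1400, ICMPV6_ECHO_HEADER, IPV6_HEADER)   # ping -s 1400
sctp_packet = on_wire_size(1228, ASSUMED_DTLS_OVERHEAD, UDP_HEADER, IPV6_HEADER)

for name, size in [("ping -s 100", small_ping),
                   ("ping -s 1400", large_ping),
                   ("1228-byte SCTP packet", sctp_packet)]:
    print(f"{name}: {size} bytes, fragments: {needs_fragmentation(size)}")
```

Under these assumptions the small ping fits comfortably, while both the large ping and a full-size SCTP packet exceed 1280 bytes and have to be fragmented, which matches the pattern of small checks passing and payloads disappearing.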

lostmsu 2 days ago [-]

Just tried ping -s 1400 over Yggdrasil (IPv6 overlay) and it works. Reinforces my overlay choice.

Getting screwed by browsers though because their WebRTC implementations completely ignore Yggdrasil addresses.

cyanydeez 2 days ago [-]

just wait till you try to send a data packet in webrtc that's too large in the browser. https://stackoverflow.com/questions/35381237/webrtc-data-cha...

last I checked, all browsers silently fail if it's too big.

Sean-Der 2 days ago [-]

This should be fixed!

I added this in Pion here[0] and I remember testing against Chrome + Firefox and it seemed to work great!

[0] https://github.com/pion/webrtc/commit/e4ff415b2bff31382bdb80...

syllogistic 2 days ago [-]

Good to know, thanks!

Though maybe I’ll keep my old limits for old browser compatibility.

syllogistic 2 days ago [-]

Yes! that's the "other reconstruct" I mention on the post. maxMessageSize at least appears in SDP and getStats. We ended up patching both at our client to be safe [800 bytes and 16kb respectively].

ryanshrott 22 hours ago [-]

[flagged]

torlok 2 days ago [-]

Reading titles like these makes me feel like I'm having a stroke.
