Fix packet number length computation

This bug was found while debugging a flake in our end-to-end test that can occur under high packet loss. The issue is that the current code can cause the client and server to fall out of sync, after which every packet fails to decrypt: if the client uses a packet number length that is too small, the server misinterprets the packet number when decompressing it, and decryption fails.

Previously, we only took the unacked packets into account when computing the packet number length. However, once sent packets are declared lost, they are no longer treated as unacked for the purpose of this computation. In the failing test, the client was sending ENCRYPTION_FORWARD_SECURE packets with packet numbers in [256, 512) using a packet number length of 1, while the server had never decrypted a single packet from that packet number space, so it decompressed them into [0, 256).

The fix is to take the minimum of the packet number we were previously using and the packet number the peer has acknowledged. This ensures that, even if many packets are lost, we never compress away information without a positive signal from the peer. I've confirmed that the fix prevents the e2e test flake.
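
To illustrate the idea, here is a minimal C++ sketch. It is not the actual QUICHE code: the function names, the length thresholds, and the BuggyLowerBound/FixedLowerBound helpers are hypothetical. It shows how choosing the packet number length from a lower bound that can advance on loss declarations alone produces the failure above, and how clamping that bound to what the peer has acknowledged avoids it.

  // Minimal sketch only; names and thresholds are illustrative assumptions,
  // not the QUICHE implementation.
  #include <algorithm>
  #include <cstdint>

  using QuicPacketNumber = uint64_t;  // stand-in for QUICHE's QuicPacketNumber

  // Picks a packet number length large enough that a receiver expecting any
  // packet >= |lower_bound| can reconstruct |packet_number| from its
  // truncated low-order bytes.
  int RequiredPacketNumberLength(QuicPacketNumber packet_number,
                                 QuicPacketNumber lower_bound) {
    const uint64_t gap = packet_number - lower_bound;
    if (gap < (UINT64_C(1) << 7)) return 1;
    if (gap < (UINT64_C(1) << 15)) return 2;
    if (gap < (UINT64_C(1) << 31)) return 4;
    return 6;
  }

  // Buggy behavior (sketch): the lower bound tracked only unacked packets,
  // so declaring packets lost advanced it even though the peer had
  // acknowledged nothing.
  QuicPacketNumber BuggyLowerBound(QuicPacketNumber least_unacked) {
    return least_unacked;
  }

  // Fixed behavior (sketch): never advance the lower bound past the packet
  // following the largest one the peer actually acknowledged, so loss
  // declarations alone cannot shrink the packet number length.
  QuicPacketNumber FixedLowerBound(QuicPacketNumber least_unacked,
                                   bool peer_has_acked,
                                   QuicPacketNumber largest_acked) {
    if (!peer_has_acked) return 0;  // No positive signal: peer may expect 0.
    return std::min(least_unacked, largest_acked + 1);
  }

In the failing scenario, with packets [0, 256) declared lost and nothing acked, the buggy lower bound for packet 300 gives a gap of 44 and a 1-byte packet number, which a server still expecting a packet near 0 decodes back into [0, 256). The fixed lower bound keeps the gap at 300, yielding a 2-byte packet number that decodes correctly.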

This CL also adds some additional debug logs that were instrumental in finding this bug.

Fix sent packet number computation, protected by gfe2_reloadable_flag_quic_fix_packet_number_length

PiperOrigin-RevId: 319904609
Change-Id: Ib05b295323043ae0645b4580f8252e9015d42ce1