Skip to content
  • Christoph Schulz's avatar
    net: pppoe: use correct channel MTU when using Multilink PPP · dc1a6f41
    Christoph Schulz authored
    [ Upstream commit a8a3e41c67d24eb12f9ab9680cbb85e24fcd9711 ]
    
    The PPP channel MTU is used with Multilink PPP when ppp_mp_explode() (see
    ppp_generic module) tries to determine how big a fragment might be. According
    to RFC 1661, the MTU excludes the 2-byte PPP protocol field, see the
    corresponding comment and code in ppp_mp_explode():
    
    		/*
    		 * hdrlen includes the 2-byte PPP protocol field, but the
    		 * MTU counts only the payload excluding the protocol field.
    		 * (RFC1661 Section 2)
    		 */
    		mtu = pch->chan->mtu - (hdrlen - 2);
    
    However, the pppoe module *does* include the PPP protocol field in the channel
    MTU, which is wrong as it causes the PPP payload to be 1-2 bytes too big under
    certain circumstances (one byte if PPP protocol compression is used, two
    otherwise), causing the generated Ethernet packets to be dropped. So the pppoe
    module has to subtract two bytes from the channel MTU. This error only
    manifests itself when using Multilink PPP, as otherwise the channel MTU is not
    used anywhere.
    
    In the following, I will describe how to reproduce this bug. We configure two
    pppd instances for multilink PPP over two PPPoE links, say eth2 and eth3, with
    a MTU of 1492 bytes for each link and a MRRU of 2976 bytes. (This MRRU is
    computed by adding the two link MTUs and subtracting the MP header twice, which
    is 4 bytes long.) The necessary pppd statements on both sides are "multilink
    mtu 1492 mru 1492 mrru 2976". On the client side, we additionally need "plugin
    rp-pppoe.so eth2" and "plugin rp-pppoe.so eth3", respectively; on the server
    side, we additionally need to start two pppoe-server instances to be able to
    establish two PPPoE sessions, one over eth2 and one over eth3. We set the MTU
    of the PPP network interface to the MRRU (2976) on both sides of the connection
    in order to make use of the higher bandwidth. (If we didn't do that, IP
    fragmentation would kick in, which we want to avoid.)
    
    Now we send a ICMPv4 echo request with a payload of 2948 bytes from client to
    server over the PPP link. This results in the following network packet:
    
       2948 (echo payload)
     +    8 (ICMPv4 header)
     +   20 (IPv4 header)
    ---------------------
       2976 (PPP payload)
    
    These 2976 bytes do not exceed the MTU of the PPP network interface, so the
    IP packet is not fragmented. Now the multilink PPP code in ppp_mp_explode()
    prepends one protocol byte (0x21 for IPv4), making the packet one byte bigger
    than the negotiated MRRU. So this packet would have to be divided in three
    fragments. But this does not happen as each link MTU is assumed to be two bytes
    larger. So this packet is diveded into two fragments only, one of size 1489 and
    one of size 1488. Now we have for that bigger fragment:
    
       1489 (PPP payload)
     +    4 (MP header)
     +    2 (PPP protocol field for the MP payload (0x3d))
     +    6 (PPPoE header)
    --------------------------
       1501 (Ethernet payload)
    
    This packet exceeds the link MTU and is discarded.
    
    If one configures the link MTU on the client side to 1501, one can see the
    discarded Ethernet frames with tcpdump running on the client. A
    
    ping -s 2948 -c 1 192.168.15.254
    
    leads to the smaller fragment that is correctly received on the server side:
    
    (tcpdump -vvvne -i eth3 pppoes and ppp proto 0x3d)
    52:54:00:ad:87:fd > 52:54:00:79:5c:d0, ethertype PPPoE S (0x8864),
      length 1514: PPPoE  [ses 0x3] MLPPP (0x003d), length 1494: seq 0x000,
      Flags [end], length 1492
    
    and to the bigger fragment that is not received on the server side:
    
    (tcpdump -vvvne -i eth2 pppoes and ppp proto 0x3d)
    52:54:00:70:9e:89 > 52:54:00:5d:6f:b0, ethertype PPPoE S (0x8864),
      length 1515: PPPoE  [ses 0x5] MLPPP (0x003d), length 1495: seq 0x000,
      Flags [begin], length 1493
    
    With the patch below, we correctly obtain three fragments:
    
    52:54:00:ad:87:fd > 52:54:00:79:5c:d0, ethertype PPPoE S (0x8864),
      length 1514: PPPoE  [ses 0x1] MLPPP (0x003d), length 1494: seq 0x000,
      Flags [begin], length 1492
    52:54:00:70:9e:89 > 52:54:00:5d:6f:b0, ethertype PPPoE S (0x8864),
      length 1514: PPPoE  [ses 0x1] MLPPP (0x003d), length 1494: seq 0x000,
      Flags [none], length 1492
    52:54:00:ad:87:fd > 52:54:00:79:5c:d0, ethertype PPPoE S (0x8864),
      length 27: PPPoE  [ses 0x1] MLPPP (0x003d), length 7: seq 0x000,
      Flags [end], length 5
    
    And the ICMPv4 echo request is successfully received at the server side:
    
    IP (tos 0x0, ttl 64, id 21925, offset 0, flags [DF], proto ICMP (1),
      length 2976)
        192.168.222.2 > 192.168.15.254: ICMP echo request, id 30530, seq 0,
          length 2956
    
    The bug was introduced in commit c9aa6895
    
    
    ("[PPPOE]: Advertise PPPoE MTU") from the very beginning. This patch applies
    to 3.10 upwards but the fix can be applied (with minor modifications) to
    kernels as old as 2.6.32.
    
    Signed-off-by: default avatarChristoph Schulz <develop@kristov.de>
    Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
    Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
    dc1a6f41