Internet-Draft | EVPN VPWS Flexible Cross-Connect | September 2024 |
Sajassi, et al. | Expires 23 March 2025 | [Page] |
This document describes a new EVPN VPWS service type specifically for multiplexing multiple attachment circuits across different Ethernet Segments and physical interfaces into a single EVPN VPWS service tunnel while still providing Single-Active and All-Active multi-homing. This new service is referred to as flexible cross-connect service. After a description of the rationale for this new service type, the solution to deliver such service is detailed.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 23 March 2025.¶
Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
[RFC8214] describes a solution to deliver P2P services using BGP constructs defined in [RFC7432]. It delivers this P2P service between a pair of Attachment Circuits (ACs), where an AC on a PE can designate a port, a VLAN on a port, or a group of VLANs on a port. It also leverages the multi-homing and fast convergence capabilities of [RFC7432] in delivering these VPWS services. Multi-homing capabilities include support for the single-active and all-active redundancy modes. Fast convergence is provided by the "mass withdrawal" message in the control plane and by fast protection switching using prefix independent convergence in the data plane upon node or link failure [I-D.ietf-rtgwg-bgp-pic]. Furthermore, the use of EVPN BGP constructs eliminates the need for multi-segment PW auto-discovery and signaling if the VPWS service needs to span multiple ASes [RFC5659].¶
Some service providers have a very large number of ACs (in the millions) that need to be backhauled across their MPLS/IP network. These ACs may or may not require tag manipulation (e.g., VLAN translation). These service providers want to multiplex a large number of ACs across several physical interfaces spread across one or more PEs (e.g., several Ethernet Segments) onto a single VPWS service tunnel in order to a) reduce the number of EVPN service labels associated with EVPN-VPWS service tunnels and thus the associated OAM monitoring, and b) reduce EVPN BGP signaling (e.g., not signaling each AC, as is the case in [RFC8214]).¶
These service providers want the above functionality without sacrificing any of the capabilities of [RFC8214], including single-active and all-active multi-homing and fast convergence.¶
This document presents a solution based on extensions to [RFC8214] to meet the above requirements.¶
Two of the main motivations for service providers seeking a new solution are: 1) to reduce the number of VPWS service tunnels by multiplexing a large number of ACs across different physical interfaces instead of having one VPWS service tunnel per AC, and 2) to reduce the signaling of ACs as much as possible. Besides these two requirements, they also want the multi-homing and fast convergence capabilities of [RFC8214].¶
In [RFC8214], a PE signals an AC indirectly by first associating that AC to a VPWS service tunnel (e.g., a VPWS service instance) and then signaling the VPWS service tunnel via an Ethernet A-D per EVI route with the Ethernet Tag field set to a 24-bit VPWS service instance identifier (which is unique within the EVI) and the ESI field set to a 10-octet identifier of the Ethernet Segment corresponding to that AC.¶
Therefore, a PE device that receives such EVPN routes can associate the VPWS service tunnel with the remote Ethernet Segment using the ESI field. When the remote ES fails and the PE receives the "mass withdrawal" message associated with the failed ES per [RFC7432], it can quickly update its BGP list of available remote entries to invalidate all VPWS service tunnels sharing the ESI field and achieve fast convergence for multi-homing scenarios. Even if fast convergence were not needed, there would still be a need to signal each AC failure (via its corresponding VPWS service tunnel) associated with the failed ES, so that the BGP path list for each of them gets updated accordingly and the packets are sent to the backup PE (in the case of single-active multi-homing) or to the other PEs in the redundancy group (in the case of all-active multi-homing). If the BGP path list is not updated, the traffic for that VPWS service tunnel will be black-holed.¶
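The path-list update described above can be illustrated with a minimal sketch. It is not taken from [RFC7432] or [RFC8214]; the table layout, the function name mass_withdraw, and the PE and ES names are assumptions made only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: VPWS service ID -> list of (remote PE, remote ESI) paths.
PathLists = Dict[int, List[Tuple[str, str]]]


def mass_withdraw(path_lists: PathLists, pe: str, failed_esi: str) -> List[int]:
    """Drop every path through (pe, failed_esi); return services left with no path."""
    no_path = []
    for service_id, paths in path_lists.items():
        # One withdrawal invalidates all service tunnels sharing the ESI.
        paths[:] = [path for path in paths if path != (pe, failed_esi)]
        if not paths:
            no_path.append(service_id)
    return no_path


path_lists: PathLists = {
    100: [("PE1", "ES-1"), ("PE2", "ES-1")],  # multi-homed: PE2 remains as a path
    200: [("PE1", "ES-1"), ("PE2", "ES-1")],
    300: [("PE1", "ES-2")],                   # different ES: unaffected
}
services_without_path = mass_withdraw(path_lists, "PE1", "ES-1")
```

In this sketch, services 100 and 200 keep PE2 as a remaining path, which is the redirection that the remote PE cannot perform when it does not know which tunnels relate to the failed ES.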
When a single VPWS service tunnel carries multiple ACs across various
Ethernet Segments (physical interfaces) without signaling the ACs via
EVPN BGP to remote PE devices, those remote PE devices lack the
information to associate the received Ethernet Segment with these
ACs or with their local ACs. They also lack the association between
the VPWS service tunnel (e.g., EVPN service label) and the far-end
ACs. This means that while the remote PEs can associate their local
ACs with the VPWS service tunnel, they cannot make similar associations
for the far-end ACs.
Consequently, in case of a connectivity failure to the ES, the
remote PEs are unable to redirect traffic via another multi-homing
PE to that ES. In other words, even if an ES failure is signaled via
EVPN to the remote PE devices, they cannot effectively respond because
they do not know the relationship between the remote ES, the
remote ACs, and the VPWS service tunnel.¶
To address this issue when multiplexing a large number of ACs onto a single VPWS service tunnel, two mechanisms have been developed: one to support VPWS services between two single-homed endpoints, and another to support VPWS services where one of the endpoints is multi-homed.¶
For single-homed endpoints, it is acceptable not to signal each AC
in BGP because, in the event of a connection failure to the ES, there
is no alternative path to that endpoint. However, the implication
of not signaling an AC failure is that the traffic destined for
the failed AC is sent over the MPLS/IP core and then discarded at
the destination PE, thereby potentially wasting network resources.
This waste of network resources during a connection failure may
be transient, as it can be detected and prevented at the application
layer in certain cases. Section 3.2 outlines a solution for such
single-homing VPWS services.¶
For VPWS services where one of the endpoints is multi-homed, there are two options:¶
1) to signal each AC via BGP, allowing the path list to be updated upon a failure affecting those ACs. This solution is described in Section 3.3 and is referred to as the VLAN-signaled flexible cross-connect service.¶
2) to bundle several ACs on an ES together per destination endpoint (e.g., ES, MAC-VRF, etc.) and associate such a bundle with a single VPWS service tunnel. This approach is similar to the VLAN-bundle service interface described in [RFC8214]. This solution is described in Section 3.2.1.¶
This section outlines a solution for providing a new VPWS service between two PE devices where a large number of ACs (such as VLANs) that span across multiple Ethernet Segments (physical interfaces) on each PE are multiplexed onto a single P2P EVPN service tunnel. Since the multiplexing involves several physical interfaces, there can be overlapping VLAN IDs across these interfaces. In such cases, the VLAN IDs (VIDs) must be translated into unique VIDs to prevent collisions. Furthermore, if the number of VLANs being multiplexed onto a single VPWS service tunnel exceeds 4095, then a single tag to double tag translation must be performed. This translation of VIDs into unique VIDs (either single or double) is referred to as "VID normalization".¶
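As a rough illustration of VID normalization on the imposition side, the sketch below maps each (interface, local VID) pair to a normalized VID and refuses collisions. The class VidNormalizer, its methods, and the interface names are hypothetical; how normalized VIDs are chosen is left to configuration and is not specified here.

```python
from typing import Dict, Tuple


class VidNormalizer:
    """Maps (interface, local VID) to a normalized VID unique within one tunnel."""

    def __init__(self) -> None:
        self._to_normalized: Dict[Tuple[str, int], int] = {}
        self._owner: Dict[int, Tuple[str, int]] = {}

    def assign(self, interface: str, local_vid: int, normalized_vid: int) -> None:
        key = (interface, local_vid)
        if key in self._to_normalized and self._to_normalized[key] != normalized_vid:
            raise ValueError(f"{key} is already normalized to {self._to_normalized[key]}")
        owner = self._owner.get(normalized_vid)
        if owner is not None and owner != key:
            raise ValueError(f"normalized VID {normalized_vid} already used by {owner}")
        self._to_normalized[key] = normalized_vid
        self._owner[normalized_vid] = key

    def normalize(self, interface: str, local_vid: int) -> int:
        return self._to_normalized[(interface, local_vid)]


normalizer = VidNormalizer()
normalizer.assign("p1", 10, 1)
normalizer.assign("p2", 10, 2)  # same local VID on another interface, distinct normalized VID
```

The two interfaces both use local VID 10, which is the overlapping case the paragraph above describes; the normalized values 1 and 2 keep them distinct inside the shared tunnel.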
When a single normalized VID is used, the lower 12 bits of the Ethernet tag field in EVPN routes MUST be set to that VID. When a double normalized VID is used, the lower 12 bits of the Ethernet tag field MUST be set to the inner VID, while the higher 12 bits are set to the outer VID. As stated in [RFC8214], 12-bit and 24-bit VPWS service instance identifiers representing normalized VIDs MUST be right-aligned.¶
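The right-aligned encoding can be sketched as follows. The function names ethernet_tag and split_tag are illustrative only, and the check merely enforces that each VID fits in 12 bits.

```python
from typing import Optional, Tuple

VID_MASK = 0xFFF  # a VID occupies 12 bits


def ethernet_tag(inner_vid: int, outer_vid: Optional[int] = None) -> int:
    """Build the Ethernet Tag value for a single or double normalized VID."""
    for vid in (inner_vid, outer_vid):
        if vid is not None and not 0 <= vid <= VID_MASK:
            raise ValueError(f"VID {vid} does not fit in 12 bits")
    if outer_vid is None:
        return inner_vid                    # single VID: lower 12 bits
    return (outer_vid << 12) | inner_vid    # double VID: outer in the higher 12 bits


def split_tag(tag: int) -> Tuple[int, int]:
    """Return (outer VID, inner VID) from the lower 24 bits of the tag."""
    return (tag >> 12) & VID_MASK, tag & VID_MASK
```

For example, ethernet_tag(100, 200) places outer VID 200 (0x0C8) above inner VID 100 (0x064), giving 0x0C8064.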
Since there is only a single EVPN VPWS service tunnel associated with many normalized VIDs (either single or double) across multiple physical interfaces, an MPLS lookup at the disposition PE is no longer sufficient to forward the packet to the correct egress endpoint or interface. Therefore, in addition to an EVPN label lookup corresponding to the VPWS service tunnel, a VID lookup (either single or double) is also required. At the disposition PE, the EVPN label lookup identifies a VID-VRF, and the lookup of the normalized VID(s) within that table identifies the appropriate egress endpoint or interface. The tag manipulation (translation from normalized VID(s) to the local VID) SHOULD be performed either as part of the VID table lookup or at the egress interface itself.¶
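The two-stage lookup at the disposition PE can be pictured with the sketch below. The table contents, the label value 16001, and the names Egress and dispose are assumptions for illustration; a real forwarding plane would implement this in hardware tables rather than dictionaries.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Egress:
    interface: str
    local_vid: int  # VID to restore when translating back from the normalized VID


# Stage 1 key: EVPN service label -> VID-VRF.
# Stage 2 key: normalized VID -> egress interface and local VID.
VID_VRFS: Dict[int, Dict[int, Egress]] = {
    16001: {1: Egress("p1", 10), 2: Egress("p2", 10)},
}


def dispose(label: int, normalized_vid: int) -> Optional[Egress]:
    """Return the egress for a received packet, or None if either lookup misses."""
    vid_vrf = VID_VRFS.get(label)
    if vid_vrf is None:
        return None
    return vid_vrf.get(normalized_vid)
```

Here dispose(16001, 2) resolves to interface p2 with local VID 10, while an unknown label or an unknown normalized VID yields no egress.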
Since the VID lookup (single or double) needs to be performed at the disposition PE, VID normalization MUST be completed prior to MPLS encapsulation on the ingress PE. This requires that both the imposition and disposition PE devices be capable of VLAN tag manipulation, such as rewriting (single or double), addition, or deletion (single or double) at their endpoints (e.g., their ESs, MAC-VRFs, IP-VRFs, etc.). Operators should be informed of potential trade-offs from a performance standpoint, compared to typical PW processing.¶
In [RFC8214], a unique value identifying the service is signaled in the context of each PE's EVI. The 32-bit Ethernet Tag ID field MUST be set to this VPWS service instance identifier value. Translation at an ASBR is needed if re-advertising to another AS affects uniqueness.¶
For FXC, this same Ethernet Tag ID field value is an identifier which may represent:¶
VLAN-Bundle: a unique value for a group of VLANs;¶
VLAN-Aware Bundle: a unique value for individual VLANs, which is considered the same as the normalized VID.¶
Both the VPWS service instance identifier and the normalized VID are carried in the Ethernet Tag ID field of the Ethernet A-D per EVI route. For FXC, in the case of a 12-bit ID, the VPWS service instance identifier is the same as the single-tag normalized VID and will be the same on both VPWS service endpoints. Similarly, in the case of a 24-bit ID, the VPWS service instance identifier is the same as the double-tag normalized VID.¶
In this mode of operation, many ACs across several Ethernet Segments are multiplexed into a single EVPN VPWS service tunnel represented by a single VPWS service ID. This is the default mode of operation for FXC and the participating PEs do not need to signal the VLANs (normalized VIDs) in EVPN BGP.¶
Regarding the data-plane aspects of this solution, both imposition and disposition Provider Edge (PE) devices MUST be aware of the VLANs as the imposition PE performs VID normalization and the disposition PE carries out VID lookup and translation. There SHOULD ideally be a single point-to-point (P2P) EVPN VPWS service tunnel between a pair of PEs for a specific set of Attachment Circuits (ACs).¶
As previously mentioned, because the EVPN VPWS service tunnel is employed to multiplex ACs across various Ethernet Segments (ESs) or physical interfaces, the EVPN label alone is not sufficient for accurate forwarding of the received packets over the MPLS/IP network to egress interfaces. Therefore, normalized VID lookup is REQUIRED in the disposition direction to forward packets to their proper egress end-points; the EVPN label lookup identifies a VID-VRF, and a subsequent normalized VID lookup in that table identifies the egress interface.¶
In this solution, for each PE, the single-homing ACs represented by their normalized VIDs are associated with a single VPWS service instance within a specific EVI. The generated EVPN route is an Ethernet A-D per EVI route with an ESI of 0, the Ethernet Tag field set to the VPWS service instance ID, and the MPLS label field set to a dynamically generated EVPN service label representing the EVPN VPWS service tunnel. This route is sent with a Route Target (RT) that represents the EVI, which can be auto-generated from the EVI according to Section 5.1.2.1 of [RFC8365]. Additionally, this route is sent with the EVPN Layer-2 Extended Community defined in Section 3.1 of [RFC8214] with two new flags (outlined in Section 4) that indicate: 1) this VPWS service tunnel is for the default Flexible Cross-Connect, and 2) the normalized VID type (single versus double). The receiving PE uses these new flags for a consistency check and MAY generate an alarm if it detects inconsistencies, but it will not disrupt the VPWS service.¶
It should be noted that in this mode of operation, a single Ethernet A-D per EVI route is transmitted upon the configuration of the first Attachment Circuit (AC) with the normalized VID. As additional ACs are configured and associated with this EVPN VPWS service tunnel, the PE does not advertise any additional EVPN BGP routes and only associates locally these ACs with the pre-established VPWS service tunnel.¶
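A minimal sketch of this advertisement behavior follows. The class DefaultFxcService, the dictionary used to stand in for an Ethernet A-D per EVI route, and the numeric values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class DefaultFxcService:
    evi: int
    service_id: int  # VPWS service instance ID carried in the Ethernet Tag field
    label: int       # EVPN service label representing the tunnel
    acs: Set[int] = field(default_factory=set)            # normalized VIDs, local only
    advertised: List[dict] = field(default_factory=list)  # routes sent in BGP

    def add_ac(self, normalized_vid: int) -> None:
        if not self.acs:
            # First AC: advertise the single Ethernet A-D per EVI route (ESI 0).
            self.advertised.append(
                {"esi": 0, "ethernet_tag": self.service_id, "label": self.label}
            )
        # Later ACs are only associated locally with the existing tunnel.
        self.acs.add(normalized_vid)


service = DefaultFxcService(evi=1, service_id=1000, label=16001)
for vid in (1, 2, 3):
    service.add_ac(vid)
```

After three ACs are configured, the sketch has advertised exactly one route, which reflects the signaling reduction that motivates the default mode.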
The default FXC mode can also be used for multi-homing. In this mode, a group of normalized VIDs representing ACs on a single Ethernet Segment, all destined to a single endpoint, are multiplexed into a single EVPN VPWS service tunnel which is identified by a unique VPWS service ID. When employing the default FXC mode for multi-homing, rather than using a single EVPN VPWS service tunnel there may be multiple service tunnels per pair of PEs. Specifically, there is one tunnel for each group of VIDs per pair of PEs, and there can be many such groups between a pair of PEs, resulting in numerous EVPN service tunnels.¶
In this mode of operation, similar to the default FXC mode described in Section 3.2, many normalized VIDs representing ACs across several Ethernet Segments/interfaces are multiplexed into a single EVPN VPWS service tunnel. However, this single tunnel is represented by multiple VPWS service IDs (one per normalized VID) and these normalized VIDs are signaled using EVPN BGP.¶
In this solution, on each Provider Edge (PE), the multi-homing ACs represented by their normalized VIDs are configured with a single EVI. There is no need to configure a separate VPWS service instance ID here, as it corresponds to the normalized VID. For each normalized VID on each Ethernet Segment, the PE generates an Ethernet A-D per EVI route where the ESI field represents the ES ID, the Ethernet Tag field is set to the normalized VID, and the MPLS label field is set to a dynamically generated EVPN label representing the P2P EVPN service tunnel. This label is the same for all ACs multiplexed into a single EVPN VPWS service tunnel. This route is sent with a Route Target (RT) representing the EVI. As before, this RT can be auto-generated from the EVI per Section 5.1.2.1 of [RFC8365]. Additionally, this route includes the EVPN Layer-2 Extended Community defined in Section 3.1 of [RFC8214] with two new flags (outlined in Section 4) that indicate: 1) this VPWS service tunnel is for VLAN-signaled Flexible Cross-Connect, and 2) the normalized VID type (single versus double). The receiving PE uses these new flags for a consistency check and may generate an alarm if it detects inconsistency, but it will not disrupt the VPWS service.¶
It should be noted that in this mode of operation, the PE sends a single Ethernet A-D per EVI route for each AC that is configured. Each normalized VID that is configured per ES results in the generation of an Ethernet A-D per EVI route.¶
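By contrast with the default mode, the VLAN-signaled mode produces one route per normalized VID per ES, all sharing one label. The sketch below assumes a hypothetical helper vlan_signaled_routes and uses plain dictionaries in place of real BGP routes.

```python
from typing import Iterable, List, Tuple


def vlan_signaled_routes(label: int, acs: Iterable[Tuple[str, int]]) -> List[dict]:
    """Build one Ethernet A-D per EVI route per (ESI, normalized VID) pair."""
    return [
        {"esi": esi, "ethernet_tag": vid, "label": label}  # same label for every AC
        for esi, vid in sorted(set(acs))
    ]


routes = vlan_signaled_routes(16001, [("ES-1", 1), ("ES-2", 2), ("ES-2", 3)])
```

Three configured ACs yield three routes, each carrying its own ESI and normalized VID but the same EVPN label.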
This mode of operation enables automatic cross-checking of normalized VIDs used for Ethernet Virtual Private Line (EVPL) services because these VIDs are signaled in EVPN BGP. For instance, if the same normalized VID is configured on three PE devices (instead of two) for the same EVI, then when a PE receives the second remote Ethernet A-D per EVI route, it generates an error message unless the two Ethernet A-D per EVI routes include the same ESI. Such cross-checking is not feasible in the default FXC mode because the normalized VIDs are not signaled.¶
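The cross-check can be sketched as below, reading the rule literally: a second remote route for the same EVI and normalized VID is an error unless it carries the same ESI as the first. The class name, the string ESIs, and the error text are assumptions for illustration.

```python
from typing import Dict, Optional, Tuple


class NormalizedVidChecker:
    """Remembers the ESI of the first remote route seen per (EVI, normalized VID)."""

    def __init__(self) -> None:
        self._first_esi: Dict[Tuple[int, int], str] = {}

    def receive(self, evi: int, normalized_vid: int, esi: str) -> Optional[str]:
        """Return an error message on an ESI mismatch, otherwise None."""
        first = self._first_esi.setdefault((evi, normalized_vid), esi)
        if first != esi:
            return (f"EVI {evi}: normalized VID {normalized_vid} advertised "
                    f"with ESI {esi}, previously seen with ESI {first}")
        return None


checker = NormalizedVidChecker()
first = checker.receive(1, 5, "ES-A")    # first remote PE
second = checker.receive(1, 5, "ES-A")   # multi-homing peer on the same ES
third = checker.receive(1, 5, "ES-B")    # misconfigured third PE
```

Only the third call returns an error message, since the first two routes share an ESI and therefore describe the same multi-homed endpoint.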
When cross-connection occurs between two ACs belonging to two multi-homed Ethernet Segments on the same set of multi-homing PEs, the forwarding between the two ACs must be performed locally during normal operation (e.g., in the absence of a local link failure). This means that traffic between the two ACs MUST be locally switched within the PE.¶
In terms of control plane processing, this means that when the receiving PE processes an Ethernet A-D per EVI route whose ESI is a local ESI, the PE does not modify its forwarding state based on the received route. This approach ensures that local switching takes precedence over forwarding via the MPLS/IP network. This method of prioritizing locally switched traffic aligns with the baseline EVPN principles described in [RFC7432], where locally switched preference is specified for MAC/IP routes.¶
In such scenarios, the Ethernet A-D per EVI route should be advertised with the MPLS label either associated with the destination Attachment Circuit or with the destination Ethernet Segment in order to avoid any ambiguity in forwarding. In other words, the MPLS label cannot represent the same VID-VRF outlined in Section 3.3, as the same normalized VID can be reachable via two Ethernet Segments. In the case of using an MPLS label per destination AC, this approach can also be applied to VLAN-based VPWS or VLAN-bundle VPWS services as per [RFC8214].¶
The V field defined in Section 4 is OPTIONAL. However, if transmitted, its value may indicate an error condition that could lead to operational issues. In such cases, merely notifying the operator of an error is insufficient; the VPWS service tunnel must not be established.¶
If both endpoints of a VPWS tunnel are signaling a matching normalized VID in the control plane, but one is operating in single-tag mode and the other in double-tag mode, the signaling of the V field facilitates the detection of this mismatch and prevents the tunnel from being instantiated.¶
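A minimal sketch of this check follows, using the V field values listed in Section 4. The function name may_instantiate is hypothetical, and treating a V value of 00 as non-blocking is an assumption of this sketch, since only the single-tag versus double-tag mismatch is described here.

```python
V_RFC8214, V_SINGLE, V_DOUBLE = 0b00, 0b01, 0b10


def may_instantiate(local_v: int, remote_v: int) -> bool:
    """Return False when one endpoint is single-tag and the other is double-tag."""
    # Assumption: any other combination (including V = 00) does not block the tunnel.
    return {local_v, remote_v} != {V_SINGLE, V_DOUBLE}
```

With this sketch, a PE would decline to bring up the tunnel when may_instantiate returns False, rather than only raising an alarm.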
If single VID normalization is signaled in the Ethernet Tag ID field (12 bits) yet the data plane is operating on double tags, the VID normalization applies only to the outer tag. Conversely, if double VID normalization is signaled in the Ethernet Tag ID field (24 bits), VID normalization applies to both the inner and outer tags.¶
This draft uses the EVPN Layer-2 attribute extended community as defined in [RFC8214] with two additional flags incorporated into this Extended Community (EC) as detailed below. This EC is sent with Ethernet A-D per EVI route per Section 3, and SHOULD be sent for both Single-Active and All-Active redundancy modes.¶
     +-------------------------------------------+
     |  Type (0x06) / Sub-type (0x04) (2 octets) |
     +-------------------------------------------+
     |  Control Flags (2 octets)                 |
     +-------------------------------------------+
     |  L2 MTU (2 octets)                        |
     +-------------------------------------------+
     |  Reserved (2 octets)                      |
     +-------------------------------------------+

                          1 1 1 1 1 1
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |      MBZ      | V | M |-|C|P|B|  (MBZ = MUST Be Zero)
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+¶
The following bits in the Control Flags are defined; the remaining bits MUST be set to zero when sending and MUST be ignored when receiving this community.¶
     Name     Meaning
     ---------------------------------------------------------------
     B,P,C    per definition in [RFC8214]
     -        reserved for Flow-label
     M        00 mode of operation as defined in [RFC8214]
              01 VLAN-Signaled FXC
              10 Default FXC
     V        00 operating per [RFC8214]
              01 single-VID normalization
              10 double-VID normalization¶
The M and V fields are OPTIONAL. The M field is ignored at reception for forwarding purposes and is used for error notifications.¶
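The layout above can be exercised with the sketch below, which packs and unpacks the Control Flags and builds the 8-octet community. The bit positions are read off the figure and the IANA request (V in bits 8-9, M in bits 10-11, bit 0 being the most significant); the function names and the example MTU of 1500 are illustrative assumptions.

```python
import struct

B_FLAG, P_FLAG, C_FLAG = 0x0001, 0x0002, 0x0004  # bits 15, 14, 13
M_SHIFT, V_SHIFT, FIELD_MASK = 4, 6, 0b11        # M = bits 10-11, V = bits 8-9


def pack_flags(b: bool = False, p: bool = False, c: bool = False,
               m: int = 0, v: int = 0) -> int:
    """Build the 16-bit Control Flags value; all other bits stay zero."""
    if not (0 <= m <= FIELD_MASK and 0 <= v <= FIELD_MASK):
        raise ValueError("M and V are 2-bit fields")
    flags = (B_FLAG if b else 0) | (P_FLAG if p else 0) | (C_FLAG if c else 0)
    return flags | (m << M_SHIFT) | (v << V_SHIFT)


def unpack_flags(flags: int) -> dict:
    """Extract the defined fields; undefined bits are ignored on receipt."""
    return {
        "b": bool(flags & B_FLAG),
        "p": bool(flags & P_FLAG),
        "c": bool(flags & C_FLAG),
        "m": (flags >> M_SHIFT) & FIELD_MASK,
        "v": (flags >> V_SHIFT) & FIELD_MASK,
    }


def l2_attr_community(flags: int, l2_mtu: int) -> bytes:
    """Type 0x06, Sub-type 0x04, Control Flags, L2 MTU, Reserved (8 octets)."""
    return struct.pack("!BBHHH", 0x06, 0x04, flags, l2_mtu, 0)


flags = pack_flags(p=True, m=0b01, v=0b10)  # VLAN-Signaled FXC, double-VID
community = l2_attr_community(flags, 1500)
```

In this example the flags value is 0x0092: P contributes 0x0002, M = 01 contributes 0x0010, and V = 10 contributes 0x0080.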
Two examples are used to analyze the failure scenarios.¶
The first scenario is the default Flexible Cross-Connect mode with multi-homing, depicted in Figure 1. In this case, VID normalization is performed and a single Ethernet A-D per EVI route is sent for the bundle of ACs on an ES. That is, PE1 will advertise two Ethernet A-D per EVI routes: the first one will identify the ACs on port p1's ES and the second one will identify AC2 on port p2's ES. Similarly, PE2 will advertise two Ethernet A-D per EVI routes.¶
The second scenario, depicted in Figure 2, illustrates the VLAN‑signaled FXC mode with Multi-Homing. In this example:¶
CE1 is connected to PE1 and PE2 via (port,VID)=(p1,1) and (p3,3), respectively. CE1's VIDs are normalized to value 1 on both PEs, and CE1 is cross-connected to CE3's VID 1 at the remote end.¶
CE2 is connected to PE1 and PE2 via ports p2 and p4 respectively:¶
In this scenario, PE1 and PE2 advertise an Ethernet A-D per EVI route for each normalized VID (values 1, 2 and 3). However, only two VPWS Service Tunnels are required: VPWS Service Tunnel 1 (sv.T1) between PE1's FXC service and PE3's FXC, and VPWS Service Tunnel 2 (sv.T2) between PE2's FXC and PE3's FXC.¶
The failure detection of an EVPN VPWS service can be performed via OAM mechanisms such as VCCV-BFD. Upon such failure detection, the switchover procedure to the backup S-PE is the same as the one described above.¶
In the event of an AC failure, the VLAN-Signaled and default FXC modes exhibit distinct behaviors:¶
Default FXC (Figure 1): in the default mode, a VLAN or AC failure is not signaled. Consequently, in case of an AC failure such as VID1 on CE2, there is nothing to prevent PE3 from directing traffic from CE4 to PE1, leading to a potential black hole. Application layer Operations, Administration, and Maintenance (OAM) may be utilized if per-VLAN fault propagation is necessary in this scenario.¶
VLAN-Signaled FXC (Figure 2): a VLAN or AC failure, such as VID1 on CE2, triggers the withdrawal of the Ethernet A-D per EVI route for the corresponding normalized VID, specifically Ethernet Tag 2. Upon receiving the route withdrawal, PE3 will remove PE1 from its outgoing path list for traffic originating from CE4.¶
In the event of a PE port failure, the failure will be signaled, and the other PE will assume forwarding in both scenarios:¶
Default FXC (Figure 1): In the case of a port failure, such as p2, the route for Service Tunnel 2 (sv.T2) will be withdrawn. Upon receiving the fault notification, PE3 will remove PE1 from its path list for traffic originating from CE4 and CE5.¶
VLAN-Signaled FXC (Figure 2): A port failure, such as p2, triggers the withdrawal of the Ethernet A-D per EVI routes for Normalized VIDs 2 and 3, along with the withdrawal of the Ethernet A-D per ES route for p2's ES. Upon receiving the fault notification, PE3 will remove PE1 from its path list for the traffic originating from CE4 and CE5.¶
In the case of a PE node failure, the operation is similar to the steps described above, except that the EVPN route withdrawals are performed by the Route Reflector instead of the PE.¶
Since this document describes a muxing capability which leverages EVPN-VPWS signaling, no additional functionality beyond the muxing service is added and thus no additional security considerations are needed beyond what is already specified in [RFC8214].¶
This document requests allocation of bits 8-11 in the "EVPN Layer 2 Attributes Control Flags" registry with names M and V:¶
     M   Signaling mode of operation (2 bits)
     V   VLAN-ID normalization (2 bits)¶
In addition to the authors listed on the front page, the following co-authors have also contributed substantially to this document:¶
Wen Lin
Juniper Networks¶
EMail: wlin@juniper.net¶
Luc Andre Burdet
Cisco¶
EMail: lburdet@cisco.com¶