Internet-Draft | 5GXRM Metadata UDP Option | October 2024 |
Jiang & Liu | Expires 24 April 2025 | [Page] |
Extended Reality & multi-modality communication, or XRM, is a type of advanced service that has been studied and standardized in 3GPP. The service aims to achieve high data rates, high reliability and low latency. The multiple streams of an XRM service use IP sessions to transport media content under advanced QoS settings. The XRM Metadata, or PDU Set QoS marking, conveys the differing requirements of PDU Sets to the 5GS. The RTP header extension (HE), as defined by 3GPP, can be used to transport XRM Metadata for unencrypted media streams, whereas encrypted XRM streams pose challenges for UPFs that need to extract the Metadata. This draft proposes to use the IETF UDP Option extension, by defining a new SAFE option type, to improve the transport of encrypted XRM Metadata.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 24 April 2025.¶
Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
Extended Reality & multi-modality communication, or XRM, is a type of advanced service that has been studied and standardized in 3GPP [TS.23.501]. With the objective of achieving high data rates, high reliability and low latency, it features multi-modal interactions among a group of service entities that might be (geographically) distributed at the mobile network edges. The streams of an XRM service carry data from modalities such as video, audio, ambient sensing and haptic detection. The seamless integration of multiple stream types sourced from multiple inputs makes XRM widely applicable in fields such as AR/VR, telepresence, gaming and education.¶
XRM services consolidate the inputs from more than one source and disseminate information to multiple destinations, so an XRM application may comprise input data from different kinds of devices/sensors and output data for different types of destinations. This scheme has the intrinsic advantage that the streams complement each other, or even yield progressive add-on gains, so that redundant delivery and information accuracy can be achieved effectively.¶
Because 5G XRM media streams require coordinated throughput, low latency and high reliability, the XRM service faces challenges in various respects, e.g., the characteristics of the data generated across modalities, accurate multi-modality data synchronization, QoS differentiation, large volumes of small packets, and packet-size variation [_5G.TACMM]. XRM services use (multiple) IP sessions to carry and transport data streams. With one IP session corresponding to a single-modality stream, coordinated transmission among the multi-modal flows (or streams) needs to be guaranteed. The clients for the different types of data of one application may be located at one destination (e.g., a UE) or at multiple destinations (e.g., VR glasses, gloves, and more).¶
5G uses the term PDU, or Protocol Data Unit, to represent packets exchanged among UEs (or end devices), the RAN (or radio access network), the 5GC (or 5G core network) and external AppServers in DNNs (Data Networks). A PDU is somewhat similar to an ADU, or application data unit. A QoS Flow is the finest granularity of QoS differentiation in a PDU Session, meaning that all PDUs belonging to a QoS Flow are treated according to the same QoS requirements [TS.23.501].¶
5G XRM has defined a new term, namely the PDU Set, specifying a group of packets carrying the payload of, e.g., a frame or a video slice/tile. Packets (i.e., PDUs) belonging to the same PDU Set are decoded/handled as a whole, meaning that a frame/video slice may be decoded at the receiver only if all, or a certain number, of the packets carrying the frame/video slice are successfully delivered. For example, a frame within a GOP (Group of Pictures) can only be decoded by the client if all frames on which that frame depends are successfully received. Hence the packets within a PDU Set have an inherent dependency on each other at the media layer. Without considering the interdependency among the packets within a PDU Set, the 5GS may schedule resources inefficiently. For example, upon network congestion, a 5GS may randomly drop some packets of a PDU Set but still deliver the others, which are then useless to the client and thus waste radio resources [TS.23.501].¶
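The waste described above can be made concrete with a small sketch. It is only an illustration under stated assumptions: the `Pdu` type, the `set_id`/`seq` fields and the "drop whole sets, latest first" policy are hypothetical and not taken from [TS.23.501], and a PDU Set is assumed to be decodable only when all of its PDUs arrive.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Pdu:
    set_id: int  # hypothetical PDU Set identifier
    seq: int     # position of the PDU inside its set


def deliverable_sets(sent, lost):
    """Return the set_ids whose PDUs all survived (assumed decodable)."""
    damaged = {p.set_id for p in lost}
    return sorted({p.set_id for p in sent} - damaged)


def wasted_pdus(sent, lost):
    """Count PDUs that were delivered although their set lost a PDU."""
    damaged = {p.set_id for p in lost}
    lost_set = set(lost)
    return sum(1 for p in sent if p.set_id in damaged and p not in lost_set)


def set_aware_drop(sent, n_to_drop):
    """Drop at least n_to_drop PDUs by discarding whole sets, latest first."""
    by_set = defaultdict(list)
    for p in sent:
        by_set[p.set_id].append(p)
    dropped = []
    for set_id in sorted(by_set, reverse=True):
        if len(dropped) >= n_to_drop:
            break
        dropped.extend(by_set[set_id])
    return dropped


# Three PDU Sets of four PDUs each.
sent = [Pdu(s, i) for s in range(3) for i in range(4)]

# Set-unaware congestion handling: one PDU lost from every set.
scattered = [Pdu(0, 1), Pdu(1, 2), Pdu(2, 0)]

# Set-aware handling: the same pressure relieved by dropping one whole set.
whole = set_aware_drop(sent, 3)
```

With scattered losses no set remains decodable and the nine delivered PDUs are wasted, whereas dropping one whole set leaves the other two sets intact and wastes nothing.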
Figure 1 shows the 5G XRM transport model. The network function (NF) UPF is similar to an IP-domain router: it sends/receives IP packets (or PDUs in PDU Sets) over the N6 interface to/from EASes (or App Servers) in Data Domains. Normally, a UPF uses the IP 5-tuple for packet classification and prioritization (i.e., using PDRs in 3GPP 5G terminology [TS.23.501]). However, a 5-tuple cannot fully expose the PDU and PDU Set related QoS information of XRM media streams. This draft therefore describes how to expose that information so that the UPF in the 5GS can process XRM streams effectively.¶
XRM media streams, potentially consisting of diversified framing, slicing and encoding of video images, carry additional parameters such as the relative importance of the different PDUs that are generated from different types of frames. For example, I, P and B frames might be given tiered priorities. [TS.23.501] standardizes the enhancement to the 5GS QoS framework to support advanced settings of XRM PDU Sets, which are called 'XRM Metadata' in this draft. XRM Metadata supports differentiated QoS provisioning. For example, one Metadata component, the PDU Set Importance or PSI, can mark less-critical PDU Sets so that their resource consumption is de-prioritized. Another Metadata component, the PDU sequence number, is assigned to each PDU for better ordered processing of PDUs at receivers. Figure 2 shows the PDU Set QoS information and how it would be generated and transported by App servers (EASes) in the DNN.¶
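As an illustration of how the PSI could drive de-prioritization, the following sketch admits PDU Sets into a limited transmission budget in order of importance. It is not the scheduling behaviour specified by 3GPP: the `PduSet` type, the `schedule` function, the byte budget and the convention that a lower PSI value means higher importance are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class PduSet:
    set_id: int
    psi: int         # PDU Set Importance; ASSUMPTION: lower value = more important
    size_bytes: int


def schedule(sets, budget_bytes):
    """Admit the most important PDU Sets that fit; return (admitted, deferred)."""
    admitted, deferred = [], []
    remaining = budget_bytes
    # Visit sets from most to least important; ties broken by set_id.
    for s in sorted(sets, key=lambda s: (s.psi, s.set_id)):
        if s.size_bytes <= remaining:
            admitted.append(s.set_id)
            remaining -= s.size_bytes
        else:
            deferred.append(s.set_id)
    return admitted, deferred


sets = [
    PduSet(set_id=1, psi=3, size_bytes=4000),
    PduSet(set_id=2, psi=1, size_bytes=6000),
    PduSet(set_id=3, psi=2, size_bytes=3000),
]
admitted, deferred = schedule(sets, budget_bytes=9000)
```

Here sets 2 and 3 consume the whole budget, so the least important set 1 is the one that is deferred.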
The 3GPP document TS 26.522 [TS.26.522] has standardized how RTP is extended to transport XRM media and Metadata information between an edge server (or EAS) and an end device. As shown in Figure 2, the RTP header extension (HE) for PDU Set marking can be added by an application server (or EAS, RTP sender, etc.) for downstream XRM media traffic. TS 26.522 [TS.26.522] has defined both the one-byte and the two-byte RTP HE formats [RFC8285] to accommodate the XRM Metadata. See Appendix A for the definitions of the RTP HE formats.¶
According to the latest progress in 3GPP, the PDU Set marking for XRM metadata can be classified into three categories, namely:¶
From the 3GPP perspective, the semantics of the RTP HE fields for PDU Set marking can be defined as follows:¶
See Appendix A for how the above PDU Set markings are encoded in an RTP header extension.¶
There are other critical factors impacting the transport of encoded video packets. One of them is the pervasive encryption of the packet payload (and headers). For example, in Figure 2, an EAS (or RTP sender) sources and transports XRM stream data and the associated Metadata. If both the video content and the Metadata in a packet are encrypted at the source (i.e., the UDP source), then the Metadata, i.e., the PDU Set QoS information, remains hidden from intermediary routing entities, such as the UPF in Figure 2, until the packet reaches the UDP destination in a UE. Encryption that prevents an (intermediary) UPF from extracting the XRM Metadata thus adds a new dimension of challenge to the 5G XRM service.¶
The challenge of transporting encrypted XRM media has led 3GPP to explore and adopt several IETF mechanisms, with the focus on conveying critical XRM Metadata to UPFs. The 3GPP document [TR.23.700-70] concluded that three different schemes are to be supported, namely the Media over QUIC (MoQT) [MediaOverQUIC], the QUIC-Aware Proxying HTTP [QUICawareProxying] and, finally, the UDP Option extension [transportUDPoption].¶
Considering the requirements of XRM payload and header encryption, along with the extensibility provided by [transportUDPoption], we believe that adopting the UDP Option is a feasible solution.¶
As shown in Figure 3, the UDP Option scheme carries encrypted and integrity-protected XRM Metadata between the UPF and the EAS. Security keys for the UDP Option are negotiated between UPFs and EASes via a 3GPP-defined procedure. The XRM Metadata corresponds to the PDU Set information described in Section 2.2. The UPF extracts the XRM Metadata from the extended UDP Options defined in Figure 4.¶
Figure 4 shows the UDP Option bit settings for the XRM PDU Set information (i.e., the XRM Metadata); see Section 2.2 for more details. The 'Kind' value will be allocated from the range [10...126] upon registration with IANA. This UDP Option is a SAFE option type.¶
Note: Total-len = Len_EM + Len_EO + Len_EN¶
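The length relation in the Note can be illustrated with a TLV-style encoder and decoder. This is a minimal sketch and not the layout of Figure 4: the placement of three one-byte length fields after Kind and Length, the placeholder Kind value 126, the rule that the Length byte covers the whole option, and the function names are all assumptions. The three components whose lengths the Note calls Len_EM, Len_EO and Len_EN are treated as opaque byte strings.

```python
import struct

KIND_XRMETA = 126  # PLACEHOLDER only: the real Kind is TBD pending IANA assignment
HEADER_LEN = 5     # ASSUMED layout: Kind, Length, Len_EM, Len_EO, Len_EN


def encode_xrmeta(em: bytes, eo: bytes, en: bytes) -> bytes:
    """Build Kind | Length | Len_EM | Len_EO | Len_EN | EM | EO | EN."""
    total_len = len(em) + len(eo) + len(en)  # Total-len, as in the Note
    option_len = HEADER_LEN + total_len      # ASSUMED: Length covers the whole option
    if option_len > 254:
        raise ValueError("sketch handles a one-byte Length field only")
    header = struct.pack("!BBBBB", KIND_XRMETA, option_len,
                         len(em), len(eo), len(en))
    return header + em + eo + en


def decode_xrmeta(option: bytes):
    """Inverse of encode_xrmeta; returns (em, eo, en) or raises ValueError."""
    if len(option) < HEADER_LEN:
        raise ValueError("truncated option")
    kind, option_len, len_em, len_eo, len_en = struct.unpack(
        "!BBBBB", option[:HEADER_LEN])
    if kind != KIND_XRMETA:
        raise ValueError("not an XRMeta option")
    # Check both the outer Length and the Total-len relation from the Note.
    if option_len != len(option) or option_len != HEADER_LEN + len_em + len_eo + len_en:
        raise ValueError("inconsistent lengths")
    body = option[HEADER_LEN:]
    return (body[:len_em],
            body[len_em:len_em + len_eo],
            body[len_em + len_eo:])


option = encode_xrmeta(b"\x01\x02", b"\x03", b"")
parts = decode_xrmeta(option)
```

A receiver such as the UPF would reject an option whose component lengths do not add up to the advertised total before attempting any decryption.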
One advantage of the UDP Option is that it removes the need to add an RTP HE to accommodate the XRM Metadata as defined in [TS.26.522]. The RTP HE is better suited to unencrypted media streams (see Section 2). For unencrypted DL PDUs (of a PDU Set) reaching a UPF, the UPF extracts the XRM Metadata directly from the PDUs. By comparison, for encrypted streams from EASes, the XRM Metadata is carried in (extended) UDP Options, so the (redundant) RTP HE is no longer needed.¶
Another advantage is that UDP is a layer-4 protocol and its header will normally not be processed by IP routers. This not only relieves IP transport devices of a processing burden, but also gives a clear demarcation between the transport and IP layers.¶
Some concerns have been raised about extending the UDP Option, on the grounds that UDP is a layer-4 transport protocol and its datagrams should be processed end to end, i.e., encapsulated at UDP sources and decapsulated at UDP destinations. A similar argument is discussed in Section 16 of [transportUDPoption]. As in Figure 2, downlink IP packets enter the 5GS via the UPF N6 interface from the IP domain (DNN) (on the right side of the figure). The UPF switches IP packets toward the UE (on the left side of the figure). Clearly, the UE is the genuine end receiver (of a UDP datagram). The UPF is only an intermediary node taking on IP functionalities, no different from a regular IP-domain node. Therefore, applying the UDP Option extension and having an intermediary (IP) node, e.g., a 5GS UPF, process UDP datagrams might indeed be seen as violating the end-to-end transport structure.¶
Fortunately, there are good arguments for the 5G XRM service to adopt the UDP Option extension. A 5GS is unique in that it is a composite system, as shown in Figure 2. It can be considered holistically as a 'blackbox' joining the external IP domain, and it is 'opaque' to the outside. The IP DNN neither knows nor cares that a UE and its anchored UPF (in the 5GS) are two separate entities. Instead, it only cares about forwarding IP packets downstream to the 5GS (via the N6 interface). How the 5GS (i.e., the UPF) processes packets is out of the scope of the IP domain. Because of the 5GS' 'composite & transparent' characteristics, we believe that a 5GS (UPF) can be granted the capability to 'intelligently' break the IP-UDP demarcation rule by peeking at the (encrypted) XRM Metadata that is carried in the UDP Option. To the external IP domain, this still observes the end-to-end transport rule.¶
In fact, there is already an Internet-Draft discussing how end points can explicitly distribute encrypted metadata to an intermediary network node [EncryptedMetaDataToNetworkNode]. As shown in Figure 2, the UPF would be the node that uses the metadata to assist in decrypting the media contents (and/or headers). Once the UPF has all the detailed information, it can provision and enforce the QoS settings for the XRM streams [TS.23.501].¶
Further, the draft [transportUDPoption] also states clearly that the UDP Option is just a framework. Options may be defined even when not all the details (along with any potential extension) are complete. The use of such options can be described or supplemented in separate documents. This bodes well for the 5GS XRM service, because this draft conforms to the tenet of the UDP Option framework.¶
There is generally no security concern as long as the XRM Metadata is encrypted and integrity-protected while it is transported in the UDP Option extension. An AUTH UDP Option can be added to allow the UPF to detect any modification of the Metadata.¶
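The modification check mentioned above can be illustrated with a keyed integrity tag. This is a generic sketch using HMAC-SHA-256 from the Python standard library; it is not the AUTH option format of [transportUDPoption], and the key handling, the 16-byte tag truncation and the function names are assumptions made for the example.

```python
import hashlib
import hmac


def protect(key: bytes, metadata: bytes, tag_len: int = 16) -> bytes:
    """Append a truncated HMAC-SHA-256 tag to the (already encrypted) metadata."""
    tag = hmac.new(key, metadata, hashlib.sha256).digest()[:tag_len]
    return metadata + tag


def verify(key: bytes, protected: bytes, tag_len: int = 16) -> bytes:
    """Return the metadata if its tag verifies, else raise ValueError."""
    if len(protected) < tag_len:
        raise ValueError("too short to carry a tag")
    metadata, tag = protected[:-tag_len], protected[-tag_len:]
    expected = hmac.new(key, metadata, hashlib.sha256).digest()[:tag_len]
    # Constant-time comparison avoids leaking how many tag bytes matched.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("metadata was modified in transit")
    return metadata


key = b"k" * 32  # stands in for a key negotiated between the UPF and the EAS
protected = protect(key, b"encrypted-xrm-metadata")
recovered = verify(key, protected)
```

Flipping any bit of the protected bytes makes `verify` raise, which is the behaviour a UPF would need in order to discard tampered Metadata.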
IANA is requested to assign a new Kind from the SAFE range [10...126] of the UDP option registry, as per [transportUDPoption]:¶
Kind    Length    Meaning
-----------------------------------------------------
TBD     Varied    XRM Metadata (XRMeta)
The 3GPP document [TS.26.522] has defined two formats of RTP Header Extension for the marking of PDU Sets and XRM Metadata, namely the one-byte RTP HE and the two-byte RTP HE. The following two figures correspond to the two formats, respectively. Note that, as of now, both header extensions in [TS.26.522] conform to Rel-18 of [TS.23.501].¶
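As background for the one-byte format, the generic element framing of [RFC8285] can be parsed as sketched below. The sketch covers only the RFC 8285 container (the 0xBEDE marker, the length in 32-bit words, and the 4-bit ID and length-minus-one fields of each element); it does not decode the PDU Set marking fields themselves, and the function name and the element IDs in the sample are hypothetical.

```python
import struct


def parse_one_byte_he(ext: bytes):
    """Parse an RFC 8285 one-byte-header extension block into {id: data}."""
    if len(ext) < 4:
        raise ValueError("truncated extension header")
    profile, words = struct.unpack("!HH", ext[:4])
    if profile != 0xBEDE:
        raise ValueError("not the one-byte header form")
    body = ext[4:4 + 4 * words]
    if len(body) != 4 * words:
        raise ValueError("extension shorter than its declared length")
    elements, i = {}, 0
    while i < len(body):
        ident = body[i] >> 4
        length = (body[i] & 0x0F) + 1  # the field stores the data length minus one
        if ident == 0:                 # ID 0 marks a padding byte
            i += 1
            continue
        if ident == 15:                # ID 15 is reserved: stop parsing
            break
        if i + 1 + length > len(body):
            raise ValueError("element overruns the extension")
        elements[ident] = body[i + 1:i + 1 + length]
        i += 1 + length
    return elements


# Two elements (hypothetical IDs 3 and 5) followed by two padding bytes.
sample = bytes.fromhex("bede0002") + bytes([0x32, 0xAA, 0xBB, 0xCC,
                                            0x50, 0x01, 0x00, 0x00])
elements = parse_one_byte_he(sample)
```

The local ID that identifies the PDU Set marking element is signalled out of band (for example in SDP), so a real receiver would look up that negotiated ID in the returned dictionary.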