Dynamic Network Adjustments for Cloud Service Scaling

Internet-Draft	Resource Abstraction	November 2024
Dunbar, et al.	Expires 8 May 2025

Abstract

This document specifies a framework for dynamically adjusting network configurations in response to cloud service scaling events. As cloud services grow, carry more traffic, or add resources, automatically adapting network configurations can improve performance and enable greater interoperability. Manual network adjustments are often slow, error-prone, and unable to keep pace with rapidly changing cloud services. The proposed framework, along with the associated YANG models, facilitates interoperability among network controllers and equipment from different vendors, which is an essential requirement for Telecom Cloud providers operating in multi-vendor environments.

1. Introduction

Cloud services have become increasingly dynamic, requiring real-time adjustments to meet fluctuating workloads and user demand. As these services scale, whether due to increased traffic, new resource allocations, or expanded service delivery, network configurations must adapt accordingly to maintain performance. For example, scaling a cloud service may require adjustments in bandwidth, modifications to load balancers, or updates to access control lists (ACLs).

Traditionally, coordinating these network adjustments with cloud service scaling has involved manual intervention or reliance on proprietary solutions, both of which are slow, error-prone, and inefficient. With the growing complexity of multi-vendor cloud orchestration systems and network controllers deployed in Telecom Cloud environments, there is a pressing need for a standardized, interoperable approach to managing dynamic network changes.

The primary objective of this document is to propose a framework for automating network adjustments in response to cloud service scaling. By integrating cloud orchestration systems with network management via standardized YANG models, this framework allows network resources to scale flexibly and adapt swiftly to the evolving demands of cloud services. Furthermore, it enables interoperability across controllers and equipment from different vendors, making it particularly valuable for Telecom Cloud providers operating in multi-vendor environments.

3. Problem Statement

As cloud services continue to scale dynamically, network infrastructure must adjust in real time to support the changing demands of these services. In many Telecom Cloud operations, this coordination between cloud service scaling and network reconfiguration is either manual or dependent on proprietary solutions, leading to several key challenges:

- Lack of coordination between cloud service orchestration and network management leads to inefficiencies and delays in adapting network configurations to cloud service changes.

- Inconsistent and proprietary solutions limit the ability to manage and automate network resources across different vendors and multi-cloud environments.

- Delayed network adaptation to cloud service scaling can result in performance issues, traffic congestion, and service disruptions.

- Operational complexity increases in multi-cloud and multi-vendor environments due to the lack of standardized, vendor-agnostic frameworks.

- No standardized framework exists for automating network adjustments in response to cloud service scaling, limiting the ability to implement seamless, real-time network changes.

4. Framework for Dynamic Network Adjustments

The Dynamic Network Adjustments Framework provides a vendor-agnostic, standardized method for automating network changes in response to cloud service scaling events. The framework integrates cloud orchestration systems with network controllers, enabling coordinated management of both cloud and network resources.

4.1. Core Components

The framework consists of three core components:

- Unified Resource Model (URM): The URM abstracts network and cloud resources, providing a unified interface for managing them across multi-vendor environments.

- Cloud Orchestration Systems: Platforms such as Kubernetes or OpenStack detect changes in cloud services (e.g., increased traffic or resource scaling) and communicate these triggers to the network controllers.

- Network Controllers: Software-Defined Networking (SDN) controllers or network orchestrators use YANG models to dynamically adjust network resources (e.g., bandwidth, load balancers, ACLs) based on cloud service needs.

4.2. Workflow

- Cloud Service Trigger: A cloud orchestration system detects a service scaling event, such as increased traffic or the addition of new compute resources.

- Trigger to Network Controller: The cloud orchestrator communicates with the network controller, requesting adjustments to network configurations (e.g., increasing bandwidth or reconfiguring the load balancer).

- Network Adjustment via YANG Models: The network controller uses YANG modules, such as the Dynamic-Bandwidth YANG module or the Dynamic-Load-Balancer YANG module, to modify the network infrastructure in real time.

- Feedback Loop: Monitoring systems provide feedback on the effectiveness of the adjustments, so that operators can confirm that performance objectives are met.
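The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, the `headroom` factor, and the `measure` callback are hypothetical and are not defined by this document, and the controller here merely records the request in memory instead of pushing YANG-modeled configuration to devices.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class ScalingEvent:
    """Step 1: a scaling event as seen by the orchestrator (hypothetical shape)."""
    service: str
    link_id: str
    observed_mbps: int


@dataclass
class NetworkController:
    """Stand-in for an SDN controller; it only stores the requested values."""
    requested_mbps: Dict[str, int] = field(default_factory=dict)

    def set_requested_bandwidth(self, link_id: str, mbps: int) -> None:
        # Step 3: a real controller would edit the requested-bandwidth leaf here.
        self.requested_mbps[link_id] = mbps


def handle_scaling_event(
    event: ScalingEvent,
    controller: NetworkController,
    measure: Callable[[str], int],
    headroom: float = 1.25,
) -> bool:
    """Run steps 2-4 and report whether the adjustment covers measured load."""
    # Step 2: turn the trigger into a concrete request (assumed sizing rule).
    requested = math.ceil(event.observed_mbps * headroom)
    controller.set_requested_bandwidth(event.link_id, requested)
    # Step 4: feedback from monitoring, compared against what was requested.
    return measure(event.link_id) <= requested
```

A monitoring system would supply `measure`; a `False` result is the signal to trigger a further adjustment.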

5. Network Changes Triggered by Cloud Services

This section demonstrates how a cloud service expansion trigger (e.g., scaling a service, increasing traffic, or adding new resources) can interact with a network YANG model to dynamically modify network configurations, such as increasing bandwidth, updating load balancer settings, or adjusting an access control list (ACL).

5.1. Dynamic-Bandwidth YANG module

The Dynamic-Bandwidth YANG module, which extends the ietf-network-topology YANG model [RFC8345], can be invoked by orchestration systems or controllers in response to events or triggers, such as a cloud service scaling up and generating increased traffic.

module dynamic-bandwidth {
  yang-version 1.1;
  namespace "urn:ietf:params:xml:ns:yang:dynamic-bandwidth";
  prefix dbw;

  import ietf-network {
    prefix nw;
  }

  import ietf-network-topology {
    prefix nt;
  }

  organization "IETF";
  contact "IETF Routing Area";
  description
    "YANG model for dynamically updating bandwidth.";

  revision "2024-10-18" {
    description "Initial version.";
  }

  // In RFC 8345 the networks/network nodes belong to ietf-network
  // (prefix nw); ietf-network-topology adds the link list to them.
  augment "/nw:networks/nw:network/nt:link" {
    description
      "Augment the network topology YANG model to update
       the bandwidth dynamically.";

    leaf requested-bandwidth {
      type uint64;
      units "Mbps";
      description "Requested bandwidth in Mbps.";
    }
  }
}

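For instance, when a cloud orchestration system detects increased traffic, it can request an increase in bandwidth to 1000 Mbps (1 Gbps) on network link link-123. The cloud orchestration system can use the following JSON input to initiate the requested change. The example follows the JSON encoding of YANG data in RFC 7951, so node names are qualified by module name, the link endpoints use the source and destination containers of RFC 8345, and the uint64 value is carried as a string.

{
  "ietf-network:networks": {
    "network": [
      {
        "network-id": "cloud-network-1",
        "ietf-network-topology:link": [
          {
            "link-id": "link-123",
            "source": {
              "source-node": "device-1"
            },
            "destination": {
              "dest-node": "device-2"
            },
            "dynamic-bandwidth:requested-bandwidth": "1000"
          }
        ]
      }
    ]
  }
}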
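As a minimal sketch of how an orchestrator might produce such a request body, the Python function below builds the payload for a single link. The function name and its parameters are hypothetical. The node names assume the RFC 7951 JSON encoding (module-qualified names, 64-bit integers as strings); how the body is sent to the controller is outside the sketch.

```python
import json


def bandwidth_request(network_id: str, link_id: str, mbps: int) -> dict:
    """Build a request body asking for `mbps` on one link (illustrative only)."""
    if mbps < 0:
        raise ValueError("bandwidth must be non-negative")
    return {
        "ietf-network:networks": {
            "network": [
                {
                    "network-id": network_id,
                    "ietf-network-topology:link": [
                        {
                            "link-id": link_id,
                            # RFC 7951 encodes uint64 values as JSON strings.
                            "dynamic-bandwidth:requested-bandwidth": str(mbps),
                        }
                    ],
                }
            ]
        }
    }


def to_json(body: dict) -> str:
    """Serialize the body; the indentation is only for readability."""
    return json.dumps(body, indent=2)
```

Calling `to_json(bandwidth_request("cloud-network-1", "link-123", 1000))` yields a document with the same shape as the example above, minus the link endpoints.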

5.2. Dynamic-Load-Balancer YANG Model

When a cloud service scales, adjustments to the load balancer configuration may be required, such as adding new backend servers or modifying the load distribution method. The Dynamic-Load-Balancer YANG module defined here can be invoked by automated orchestration systems or controllers in response to specific events or triggers, enabling the load balancer to adapt dynamically to changing service demands.

module: dynamic-load-balancer
  +--rw load-balancer
     +--rw balancer* [balancer-id]
        +--rw balancer-id         string
        +--rw algorithm           enumeration
        |     +-- round-robin           "Distributes traffic evenly across all servers in rotation."
        |     +-- least-connections     "Sends requests to the server with the fewest active connections."
        |     +-- ip-hash               "Distributes requests based on a hash of the client's IP address."
        |     +-- ml-optimized          "Routes long-lived ML flows through highest bandwidth paths."
        +--rw backend-servers* [server-id]
           +--rw server-id         string
           +--rw ip-address        inet:ipv4-address
           +--rw port              uint16

For example, a Telecom Cloud controller can use the following JSON to trigger an automatic load balancer change for a newly deployed ML service that requires the ml-optimized load balancing algorithm:

{
  "dynamic-load-balancer:load-balancer": {
    "balancer": [
      {
        "balancer-id": "ml-service-lb",
        "algorithm": "ml-optimized",
        "backend-servers": [
          {
            "server-id": "ml-backend-1",
            "ip-address": "192.168.10.1",
            "port": 8080
          },
          {
            "server-id": "ml-backend-2",
            "ip-address": "192.168.10.2",
            "port": 8080
          },
          {
            "server-id": "ml-backend-3",
            "ip-address": "192.168.10.3",
            "port": 8080
          }
        ]
      }
    ]
  }
}
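When the service scales out, the orchestrator has to add backend servers to an existing balancer entry. The sketch below shows one way to do that while checking the values against the types in the tree above (the algorithm enumeration, inet:ipv4-address, and uint16). The function name and the decision to reject duplicate server-id values are assumptions made for illustration.

```python
import ipaddress

# Enumeration values taken from the dynamic-load-balancer tree.
ALGORITHMS = {"round-robin", "least-connections", "ip-hash", "ml-optimized"}


def scale_out(balancer: dict, server_id: str, ip: str, port: int) -> dict:
    """Append one backend server to a balancer entry and return the entry."""
    if balancer.get("algorithm") not in ALGORITHMS:
        raise ValueError("unknown load balancing algorithm")
    # IPv4Address raises a ValueError subclass on a malformed address.
    ipaddress.IPv4Address(ip)
    if not 0 <= port <= 65535:
        raise ValueError("port does not fit in uint16")
    servers = balancer.setdefault("backend-servers", [])
    # server-id is the list key, so it must be unique within the balancer.
    if any(s["server-id"] == server_id for s in servers):
        raise ValueError("duplicate server-id: " + server_id)
    servers.append({"server-id": server_id, "ip-address": ip, "port": port})
    return balancer
```

The function works on a single entry of the balancer list, so it does not depend on how the top-level container name is qualified.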

5.3. Dynamic-ACL YANG Model

In response to cloud service expansion or evolving security requirements, an Access Control List (ACL) might need modifications, such as adding or removing IP addresses allowed to access particular services. This section specifies a YANG model augmenting the existing ACL model defined in [draft-ietf-netmod-acl-extensions].

module dynamic-acl {
  yang-version 1.1;
  namespace "urn:ietf:params:xml:ns:yang:dynamic-acl";
  prefix dacl;

  import ietf-inet-types {
    prefix inet;
  }

  // The acls/acl/aces/ace hierarchy augmented below is defined in the
  // base ACL module of RFC 8519, whose name is ietf-access-control-list.
  import ietf-access-control-list {
    prefix acl;
  }

  organization "IETF";
  contact "IETF Routing Area";
  description
    "YANG model for dynamically updating ACLs based on cloud
     service scaling requirements.";

  revision "2024-10-18" {
    description "Initial version.";
  }

  augment "/acl:acls/acl:acl/acl:aces/acl:ace" {
    description
      "Augments the ACL model to dynamically manage ACEs based
       on cloud scaling triggers.";

    leaf cloud-service-trigger {
      type string;
      description
        "Identifier for the cloud service trigger that
         necessitates this ACL change.";
    }

    leaf priority {
      type uint32;
      description
        "Priority level of this ACE in the context of dynamic
         updates.";
    }
  }
}

Key enhancements:

- The model supports dynamic updates triggered by cloud service scaling, allowing automated ACL modifications without manual intervention.

- Each Access Control Entry (ACE) can be associated with a specific cloud service event, aiding in tracking and auditing changes.

- The priority leaf allows for granular control over rule application order, which is especially useful in dynamic environments where rules may change frequently based on service demands.

For example, the JSON below dynamically adds a new rule in response to a cloud service scaling event. The cloud-service-trigger value "ml-service-scaling" indicates that this ACL rule is applied due to the scaling of a machine learning service, while the priority of 10 ensures that this ACE is processed before lower-priority rules.

{
  "ietf-access-control-list:acls": {
    "acl": [
      {
        "name": "dynamic-acl-001",
        "aces": {
          "ace": [
            {
              "name": "dynamic-rule-1",
              "matches": {
                "ipv4": {
                  "protocol": 6,
                  "source-ipv4-network": "192.168.1.0/24",
                  "destination-ipv4-network": "10.0.0.0/24"
                },
                "tcp": {
                  "source-port": {
                    "operator": "eq",
                    "port": 22
                  }
                }
              },
              "actions": {
                "forwarding": "accept"
              },
              "dynamic-acl:cloud-service-trigger": "ml-service-scaling",
              "dynamic-acl:priority": 10
            }
          ]
        }
      }
    ]
  }
}
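Because each ACE carries the trigger that created it, an auditor can list every rule introduced by a given scaling event. The following sketch walks an ACL document and collects the matching entries. The helper names are hypothetical, and the lookup deliberately accepts node names with or without a module prefix, since that depends on how the document was encoded.

```python
from typing import Any, List, Tuple


def leaf(node: dict, name: str, default: Any = None) -> Any:
    """Return the value stored under `name`, ignoring any module prefix."""
    for key, value in node.items():
        if key == name or key.endswith(":" + name):
            return value
    return default


def aces_for_trigger(document: dict, trigger: str) -> List[Tuple[str, str, Any]]:
    """List (acl name, ace name, priority) for ACEs tagged with `trigger`."""
    hits = []
    acls = leaf(document, "acls", {})
    for acl in leaf(acls, "acl", []):
        aces = leaf(acl, "aces", {})
        for ace in leaf(aces, "ace", []):
            if leaf(ace, "cloud-service-trigger") == trigger:
                hits.append((acl["name"], ace["name"], leaf(ace, "priority")))
    # Entries are returned in document order; no ordering by priority is implied.
    return hits
```

Applied to the example above with the trigger "ml-service-scaling", the function returns the single entry for dynamic-rule-1 in dynamic-acl-001.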

6. Security Considerations

Security is a critical aspect when automating network adjustments in response to cloud service scaling. Several key areas should be addressed:

- Authentication and Authorization:

Mutual Authentication: Use mutual authentication methods such as TLS certificates to verify the identities of both the cloud orchestrator and the network controller before any configuration commands are accepted.

OAuth or API Key-Based Access: For REST API-based communications, secure token-based authentication (e.g., OAuth 2.0) or unique API keys can be employed to validate requests from legitimate sources.

- Data Integrity:

Use TLS on communication channels so that transmitted data is both encrypted and integrity-protected.

Employ checksums or hash functions on critical configuration messages to detect unintended modifications during transit; detecting deliberate tampering requires a keyed hash or a digital signature.

- Monitoring and Auditing:

Maintain detailed logs of all configuration changes initiated by cloud scaling events, including timestamps, source entities, and specific parameters modified.

Conduct periodic audits of the authorization policies, access logs, and configuration adjustments to ensure compliance with security policies and to detect any anomalies.
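The integrity and auditing points above can be illustrated with a short sketch. It uses HMAC-SHA256, a keyed hash, in place of a plain checksum; that choice, the canonical JSON serialization, the shared key, and all function and field names are assumptions for illustration and are not mandated by this document.

```python
import hashlib
import hmac
import json
from datetime import datetime, timezone


def sign(payload: dict, key: bytes) -> str:
    """Return a hex HMAC-SHA256 tag over a canonical JSON serialization."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(key, canonical, hashlib.sha256).hexdigest()


def verify(payload: dict, key: bytes, tag: str) -> bool:
    """Check the tag; compare_digest avoids timing side channels."""
    return hmac.compare_digest(sign(payload, key), tag)


def audit_record(source: str, payload: dict, tag: str) -> dict:
    """Build a log entry with a timestamp, the source entity, and the parameters."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "source": source,
        "parameters": payload,
        "hmac-sha256": tag,
    }
```

The controller would call `verify` before applying a change and store the `audit_record` output whether or not the change is accepted, so that rejected requests are also visible in later audits.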
