DPI Design for High Availability in Small Sites
by Bojan Radulović, CCNP, CCDP, CCDA and CCSI at NIL Data Communications
Introduction
As Deep Packet Inspection (DPI) has become more entrenched in multiple areas of the network, it has also become a strategic investment for almost all Tier-1 and Tier-2 service providers (SPs). This article will focus primarily on the application of DPI in the mobile space, as the need to enforce policies there has the greatest impact, but the various design aspects discussed here may be applied in other environments as well.
With the explosion of data-based services and the amount of data growing beyond the physical capabilities of the network, understanding and controlling subscriber-based traffic has become increasingly important.
FIGURE 1:
Estimates of compound annual growth rate (CAGR) for mobile traffic over the five-year period 2009–2014.
Figure 1 displays traffic growth predictions in the mobile space, where the biggest increase will be seen in mobile data and mobile video. Service providers will be challenged to offer the network capacity required to accommodate this increasing traffic. Using a DPI device enables an SP to understand the subscriber traffic profile and offer differentiated services to gain more revenue and market share.
DPI devices are capable of providing several levels of intelligence, depending on the deployment approach:
1. Traffic analysis and business intelligence
   - Implement traffic monitoring, analysis and reporting
   - Determine subscriber and application usage patterns
2. Capacity control and fair-usage policies
   - Manage bandwidth-intensive applications through packet-flow optimization techniques
   - Implement fair-usage policies for fair allocation of network resources
3. Revenue-generating services
   - Implement tiered services using volume- and time-based quotas
   - Implement service self-care
   - Innovate other differentiated services such as parental controls, content filtering, turbo buttons, allowance-based services, prioritized application services and pay-as-you-go services
Service providers will be able to implement options 1 and 2 from the list above quickly, due to ease of implementation and the low level of required interoperability with back-end systems. Revenue-generating services (option 3) require integration with several other components to achieve the desired functionality. Following are the required components:
Policy and Charging Rules Function (PCRF)
RADIUS
Billing
Charging
Subscriber database
Position of DPI in Mobile Core
DPI deployments generally position the DPI device as close as possible to the subscriber, where analysis and traffic control of each subscriber will be easiest to implement. With this design, the DPI device will be able to provide subscriber awareness and full DPI functionality, avoiding any split-flow situation, to classify the traffic correctly.
Note :
The term split flow refers to a situation in which one part of the flow (initiated or returning) does not pass through the DPI, and thus the DPI does not have full visibility. This setup can result in incorrect classification, billing and charging.
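The split-flow condition described in the note can be illustrated with a minimal Python sketch. It assumes a hypothetical record format of (source IP, source port, destination IP, destination port, protocol) tuples and simply reports flows for which only one direction was observed. A real DPI classifier is far more involved, and some one-way flows are legitimate, so this is an illustration of the idea rather than a detection method used by any product.

```python
from collections import defaultdict

def canonical_key(src, sport, dst, dport, proto):
    """Build a direction-independent flow key by ordering the two endpoints."""
    a, b = (src, sport), (dst, dport)
    return (proto,) + (a + b if a <= b else b + a)

def find_split_flows(packets):
    """Return the keys of flows where only one direction was observed.

    `packets` is an iterable of (src, sport, dst, dport, proto) tuples,
    a hypothetical record format chosen for this example.
    """
    directions = defaultdict(set)
    for src, sport, dst, dport, proto in packets:
        key = canonical_key(src, sport, dst, dport, proto)
        # Remember which endpoint acted as the sender for this flow.
        directions[key].add((src, sport))
    return sorted(k for k, senders in directions.items() if len(senders) == 1)

observed = [
    ("10.0.0.5", 40000, "198.51.100.7", 80, "tcp"),   # subscriber to network
    ("198.51.100.7", 80, "10.0.0.5", 40000, "tcp"),   # returning direction seen
    ("10.0.0.6", 40001, "198.51.100.7", 443, "tcp"),  # returning direction never seen
]
split = find_split_flows(observed)
print(split)
```

Only the third flow is reported, because the DPI in this example never saw its returning half.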
FIGURE 2 :
DPI in the mobile core.
The DPI is commonly deployed after the access gateway, where the subscriber traffic becomes de-encapsulated and enters the IP cloud (Figure 2).
The remainder of this article focuses on different ways to position the DPI device after the access gateway, discussing the benefits and downsides of each design.
Inline and Receive-Only Configurations
One common practice for initial deployments is to use a receive-only configuration, in which the traffic is simply monitored. In this design, the DPI is usually deployed for only a short time, for the sake of statistical analysis of traffic.
Configuration of a switched-port analyzer (SPAN) is required, using either the interface or a virtual LAN (VLAN) as the basis for the source:
Configuration printout 1 :
Sample configuration for directly connected switches with either VLAN or interface as a source.
! Option A: use a physical interface as the SPAN source
switch(config)#monitor session 1 source interface fastethernet 0/1
switch(config)#monitor session 1 destination interface fastethernet 0/2
! Option B: use a VLAN as the SPAN source (alternative to option A)
switch(config)#monitor session 1 source vlan 5
switch(config)#monitor session 1 destination interface fastethernet 0/10
Note :
Limitations may exist, depending on the platform and Cisco IOS version where the SPAN functionality is applied.
FIGURE 3 :
DPI deployed in a receive-only (SPAN) setup, with visibility of only subscriber-side flows.
The benefit of using the DPI in a receive-only fashion is that no maintenance window is required for the installation and operation of the DPI. Statistics can be gathered through passive monitoring of subscriber flows, and the statistics are used for traffic trending and creation of new business models. The deployment has full or partial visibility of subscriber traffic (see Figure 3). This deployment offers no ability to control/steer subscriber traffic.
FIGURE 4 :
DPI deployed in a receive-only (optical splitter) setup, with full subscriber flow visibility. Only the receive (Rx) fiber is connected to the DPI.
Similar functionality can be achieved with the use of an external optical splitter (as depicted in Figure 4) for full visibility of subscriber-initiated and returning flows. The same limitations apply to the optical splitter design as to the SPAN deployment.
Since installing an optical splitter requires a maintenance window in any case, many customers choose to implement the DPI inline instead, gaining full visibility of subscriber traffic with the additional ability to control and influence it.
Note:
When a DPI device is positioned inline, it introduces a certain amount of delay into existing communications. A dedicated DPI device handles traffic classification in hardware, introducing only a minimal amount of delay even at gigabit or 10-gigabit speeds.
FIGURE 5 :
Inline deployment of the DPI.
Figure 5 shows a basic inline deployment, where the DPI is inserted inline with the subscriber traffic. This design offers full visibility and control over the subscriber traffic for subscriber-initiated and returning traffic. However, as depicted in Figure 5, this design offers no resilience to hardware- or software-based issues, so a DPI failure will create a network-down situation and disrupt business flow. The following sections explore several design options that offer more resilience, which can help to avoid business disruption.
Spanning Tree Configurations
For additional resilience, Spanning Tree Protocol (STP) may be used to maintain a loop-free topology while keeping a redundant link available for automatic failover if the active link fails (see Figure 6).
FIGURE 6:
Spanning tree design for automatic failover.
Convergence times vary with the flavor of spanning tree; a rapid spanning tree (RST) or multiple spanning tree (MST) design is recommended. In the spanning tree design, the backup link offers a path for subscriber traffic, avoiding a network-down situation in case of DPI failure. However, this design still causes service degradation: traffic bypassing the DPI is not controlled, meaning that subscriber policies (bandwidth caps, quota caps, redirections, filtering, etc.) do not take effect.
Note:
The spanning tree implementation should be tuned so that blocked ports are not on the link where the DPI is positioned.
Etherchannel Configurations
The use of a link-bonding technology, Etherchannel with Link Aggregation Control Protocol (LACP), avoids the use of a spanning tree for automatic failover. The benefit of LACP bundle negotiation is that it allows for dynamic failover to interfaces configured as "hot standby." The standby ports can fail over in less than 250 milliseconds, and the failover does not exceed 2 seconds in any case, which is a great improvement compared with spanning tree convergence times.
FIGURE 7 :
Etherchannel with LACP bundling design for automatic failover.
The LACP configuration requires definition of the number of maximum active links in the bundle, as well as definition of the priority for the hot standby ports. This requirement allows for defining which of the ports will be used as a standby link in case of active link failures:
Configuration printout 2 :
Sample configuration for directly connected link bonding with LACP.
Switch1(config)#interface GigabitEthernet0/1
Switch1(config-if)#channel-protocol lacp
Switch1(config-if)#channel-group 1 mode active
!
Switch1(config)#interface GigabitEthernet0/2
Switch1(config-if)#lacp port-priority 65535
Switch1(config-if)#channel-protocol lacp
Switch1(config-if)#channel-group 1 mode active
!
Switch1(config)#interface Port-channel1
Switch1(config-if)#load-interval 30
Switch1(config-if)#lacp max-bundle 1
!
Switch2(config)#interface GigabitEthernet0/1
Switch2(config-if)#channel-protocol lacp
Switch2(config-if)#channel-group 1 mode active
!
Switch2(config)#interface GigabitEthernet0/2
Switch2(config-if)#lacp port-priority 65535
Switch2(config-if)#channel-protocol lacp
Switch2(config-if)#channel-group 1 mode active
!
Switch2(config)#interface Port-channel1
Switch2(config-if)#load-interval 30
Switch2(config-if)#lacp max-bundle 1
The same configuration must be applied to both switches. Bundle negotiation can then be verified on each switch:
Configuration printout 3 :
Verification of correct bundle negotiation. Ports marked (H) are hot standby ports, which are brought into the bundle when the number of active links falls below the max-bundle value.
Switch1#sh etherchannel summary
Flags:  D - down        P - bundled in port-channel
        I - stand-alone s - suspended
        H - Hot-standby (LACP only)
        R - Layer3      S - Layer2
        U - in use      f - failed to allocate aggregator
        M - not in use, minimum links not met
        u - unsuitable for bundling
        w - waiting to be aggregated
        d - default port
Number of channel-groups in use: 1
Number of aggregators:           1
Group  Port-channel  Protocol    Ports
------+-------------+-----------+-----------------------------------------------
1      Po1(SU)       LACP        Gi0/1(P)    Gi0/2(H)
Switch2#sh etherchannel summary
Flags:  D - down        P - bundled in port-channel
        I - stand-alone s - suspended
        H - Hot-standby (LACP only)
        R - Layer3      S - Layer2
        U - in use      f - failed to allocate aggregator
        M - not in use, minimum links not met
        u - unsuitable for bundling
        w - waiting to be aggregated
        d - default port
Number of channel-groups in use: 1
Number of aggregators:           1
Group  Port-channel  Protocol    Ports
------+-------------+-----------+-----------------------------------------------
1      Po1(SU)       LACP        Gi0/1(P)    Gi0/2(H)
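The way the bundle ends up with one bundled (P) port and one hot standby (H) port can be modeled with a short Python sketch. It is a simplified illustration, not the LACP state machine: it assumes that a lower port-priority value is preferred, that the port number breaks ties, and that Gi0/1 keeps the default port priority of 32768 because printout 2 does not set it. The function name and data layout are hypothetical.

```python
def select_lacp_ports(ports, max_bundle):
    """Split candidate ports into (active, hot_standby) lists.

    `ports` maps a port name to (port_priority, port_number). A lower
    priority value is preferred; the port number breaks ties.
    """
    ranked = sorted(ports, key=lambda name: ports[name])
    return ranked[:max_bundle], ranked[max_bundle:]

ports = {
    "Gi0/1": (32768, 1),  # assumed default LACP port priority
    "Gi0/2": (65535, 2),  # least preferred, as set in printout 2
}

active, standby = select_lacp_ports(ports, max_bundle=1)
print("active:", active, "hot standby:", standby)

# Simulate a failure of the active link: remove it and select again.
remaining = {name: prio for name, prio in ports.items() if name not in active}
active_after, standby_after = select_lacp_ports(remaining, max_bundle=1)
print("after failure, active:", active_after)
```

With max-bundle set to 1, Gi0/2 stays in hot standby until Gi0/1 drops out of the bundle, at which point it becomes the active member.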
For LACP to recognize a link failure on the far side of the DPI, the DPI device needs to reflect the Layer 1 failure to the opposing switch:
Configuration printout 4 :
Configuration of the Service Control Engine (SCE) for link failure reflection.
SCE#>configure
SCE(config)#>interface LineCard 0
SCE(config if)#>link failure-reflection
Note:
In instances where LACP is used to load-balance traffic across multiple DPI devices, the connecting switches require the ability to load-balance based on source and destination ports. This design guarantees that each subscriber flow always passes through the same DPI device, for accurate classification and enforcement of policies.
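The requirement in this note can be sketched in Python as a direction-independent hash: both directions of a flow must map to the same DPI device, even though the two switches hash the packets independently. The function name is hypothetical, and the use of SHA-256 is an assumption made for readability; real switches use their own hardware hashing schemes, which must be checked per platform.

```python
import hashlib

def pick_dpi(src_ip, src_port, dst_ip, dst_port, num_devices):
    """Map a flow to a DPI index so that both directions agree."""
    # Sorting the endpoints removes the notion of direction from the key.
    endpoints = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    material = repr(endpoints).encode("utf-8")
    digest = hashlib.sha256(material).digest()
    return int.from_bytes(digest[:4], "big") % num_devices

forward = pick_dpi("10.0.0.5", 40000, "198.51.100.7", 80, num_devices=2)
reverse = pick_dpi("198.51.100.7", 80, "10.0.0.5", 40000, num_devices=2)
print(forward == reverse)
```

If the hash were computed on the ordered (source, destination) pair instead, the returning half of a flow could land on a different DPI device and create a split flow.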
Optical Bypass Configurations
In situations where multiple switch ports are not available, an optical bypass (Figure 8) is a viable solution that avoids network downtime in case of DPI hardware failure. The optical bypass provides protection by enabling automatic link preservation in two scenarios:
Complete loss of power of the DPI
Implementation of a maintenance window
FIGURE 8 :
Optical bypass design.
Configuration printout 5 :
Configuration of the SCE for bypass.
SCE#>configure
SCE(config)#>interface LineCard 0
SCE(config if)#>connection-mode inline on-failure bypass
Configuration printout 6 :
Verification of the failure-mode configuration in the SCE.
SCE#>show interface LineCard 0 connection-mode
slot 0 connection mode
Connection mode is inline
slot failure mode is bypass
Redundancy status is standalone
Optical bypass is usually a standard solution for high availability; most DPI vendors offer both external and internal modules. The optical bypass is aware of some of the DPI system-level functions, which allows it to decide when a failover is required. A failover can also be triggered manually.
Caution:
Optical bypass does not protect against every hardware failure. If a single component fails (for example, a small form-factor pluggable module) while the system as a whole still reports as fully functional, the DPI might not trigger the failover.
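The gap described in this caution can be made concrete with a small Python model. All field and function names are hypothetical, and the trigger conditions are a simplification of the behavior described above, not the logic of any particular bypass module: the bypass reacts to power loss, system-level health and manual activation, but has no view of an individual optic.

```python
from dataclasses import dataclass

@dataclass
class DpiStatus:
    powered: bool = True
    system_healthy: bool = True  # system-level health visible to the bypass
    manual_bypass: bool = False  # operator forces bypass, e.g. for maintenance
    sfp_ok: bool = True          # component state NOT visible to the bypass

def bypass_engaged(status):
    """The bypass reacts only to conditions it is aware of."""
    return (not status.powered) or (not status.system_healthy) or status.manual_bypass

def traffic_passes(status):
    """Traffic survives if the bypass is engaged or the inline path works."""
    if bypass_engaged(status):
        return True       # link preserved, but traffic is not inspected
    return status.sfp_ok  # inline path depends on a working optic

power_loss = DpiStatus(powered=False)
sfp_failure = DpiStatus(sfp_ok=False)
print(traffic_passes(power_loss), traffic_passes(sfp_failure))
```

In this model a power loss is covered by the bypass, while the failed optic leaves the DPI inline with a broken path, which is exactly the situation the caution warns about.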
Cascade Configurations
All of the design scenarios discussed previously support high availability for subscriber traffic only; in each case, a DPI failure still causes service downtime during which no subscriber policies are applied.
FIGURE 9 :
Cascaded design for stateful replication. Traffic can enter and leave through both paths. Cross-connections are used for state replication and forwarding of traffic toward the active node.
A cascade design provides stateful replication of all subscriber-based information between two DPI devices for high availability of services (Figure 9). The active/standby architecture guarantees that at least one of the cascaded devices is inspecting the traffic. The cascade model provides the same level of performance as a single device.
Note :
The configuration must be identical in the two cascaded devices. To ensure correct negotiation, establish the cascade relationship with empty configurations.
Configuration printout 7 :
Configuration of cascaded SCEs.
SCE01#>configure
SCE01(config)#>interface LineCard 0
SCE01(config if)#>shutdown
SCE01(config if)#>connection-mode inline-cascade physically-connected-links link-0 priority primary on-failure bypass
SCE01(config if)#>no shutdown
SCE02#>configure
SCE02(config)#>interface LineCard 0
SCE02(config if)#>shutdown
SCE02(config if)#>connection-mode inline-cascade physically-connected-links link-1 priority secondary on-failure bypass
SCE02(config if)#>no shutdown
Configuration printout 8 :
Verification of successful cascade negotiation in which one of the SCEs is active.
SCE01#>show interface LineCard 0 connection-mode
slot 0 connection mode
Connection mode is inline-cascade
slot 0 sce-id is 0
slot 0 is primary
slot 0 is connected to peer
slot failure mode is bypass
Redundancy status is active
SCE02#>show interface LineCard 0 connection-mode
slot 0 connection mode
Connection mode is inline-cascade
slot 0 sce-id is 1
slot 0 is secondary
slot 0 is connected to peer
slot failure mode is bypass
Redundancy status is standby
The traffic always flows toward the currently active DPI for inspection and implementation of policies. Both devices keep state information about each of the subscriber flows (state, quota, charging records, etc.), ensuring that none of the data is lost in case of failover to the standby device.
FIGURE 10 :
Traffic flow in a cascaded design.
Figure 10 depicts the path that the subscriber traffic takes. If the subscriber traffic hits the active SCE, the SCE inspects the traffic and forwards it toward the network side (blue numbers 1 and 2 in Figure 10). If the subscriber traffic hits the standby device first, the standby SCE forwards the traffic through the cross-connecting links to the active SCE for inspection (orange numbers 1 and 2). After the active SCE inspects the traffic and applies policies, the traffic is returned to the standby SCE and forwarded toward the network side (orange 3 and 4).
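The two forwarding cases can be summarized in a short Python sketch. The function name and the hop descriptions are hypothetical labels chosen for this illustration; the sketch only restates the path described above and does not model state replication or failover.

```python
def cascade_path(ingress, active):
    """Return the ordered hops for subscriber traffic entering at `ingress`.

    `ingress` is the SCE that receives the traffic first and `active`
    is the SCE that currently holds the active role.
    """
    if ingress == active:
        # Traffic arrives at the active SCE: inspect and forward.
        return [f"{active}: inspect and apply policies", "network side"]
    # Traffic arrives at the standby SCE: detour over the cross-connect.
    return [
        f"{ingress}: forward over cross-connect",
        f"{active}: inspect and apply policies",
        f"{ingress}: receive from cross-connect",
        "network side",
    ]

direct = cascade_path("SCE01", active="SCE01")
detour = cascade_path("SCE02", active="SCE01")
for hop in detour:
    print(hop)
```

In both cases the traffic is inspected exactly once, by the active device, and leaves toward the network side through the device on which it arrived.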
This design also solves any split-flow situations, since both of the SCEs perform stateful replication. All subscriber information is updated on both the active and standby devices, regardless of the traffic direction, entering or leaving the cascaded system.
Note:
The cascade design is not scalable: the capabilities of the paired DPIs do not exceed the capacity and performance of a single DPI. It should therefore be used only in situations where high availability of services is required.
Conclusion
This article has explored several options for a small-site DPI design. Of the configurations compared here, only the cascade model offers high availability of services; all the other designs preserve only network connectivity. Commonly deployed options include either optical bypass or an Etherchannel configuration with LACP for dynamic failover to the hot standby port (if multiple free ports are available on the connecting switches). Spanning tree designs are rarely used, due to the possibility of loop situations. A link-bonding solution with LACP offers superior failover times and is generally preferred.
