Additional contributing authors:
Rui Li, Vishakh Nair, Mrittika Ganguli
Executive Summary
The integration of Generative AI and Large Language Models (LLMs) into network security and operational management presents transformative opportunities for enhancing application identification and traffic classification. Traditional methods that rely on literal matching and regex-based software flows face limitations in handling complex network tasks, particularly in deep packet inspection (DPI) and encrypted traffic analysis. This research proposes a hybrid architecture that leverages multiple LLM backends, such as GPT-2 and ModernBERT, to facilitate dynamic, context-aware, and adaptive traffic classification systems. While earlier works like TrafficGPT demonstrated the potential of LLMs in encrypted traffic classification, our research advances these capabilities by integrating batch processing and optimized edge inference techniques—directly addressing the scalability and performance constraints of prior approaches. We conducted comprehensive evaluations, including workload characterization and hardware-specific optimizations on Intel® Xeon® processors and Intel® Arc™ A770 Graphics. These efforts establish a practical foundation for deploying LLM-based systems at scale, extending the frontier of traditional network security.
Introduction
The rapid advancement of generative AI has opened new possibilities for enhancing network security. However, developing accurate and cost-effective large language model (LLM) solutions in this domain remains a significant challenge. Network security tasks—particularly deep packet inspection (DPI)—require intelligent systems capable of handling encrypted and increasingly complex traffic. As of 2023, more than 90% of internet traffic was encrypted [1], severely limiting the effectiveness of traditional DPI techniques that rely on payload visibility when decryption is not feasible. Furthermore, a report by Zscaler found that approximately 87.2% of all cyberthreats blocked by enterprises between October 2023 and September 2024 were delivered over encrypted (TLS/SSL) channels, representing a nearly 10% year-over-year increase [2]. This underscores the growing importance of accurate classification of encrypted traffic.
While enterprise environments may deploy TLS inspection and decrypt-then-analyze methods, broader internet-scale traffic analysis typically relies on metadata-only approaches. Traditional techniques use literal matching and regular expressions across the packet capture, protocol parsing, and session reconstruction pipeline. While current data plane software and pattern matching engines support this process, these methods struggle to scale and adapt to evolving encrypted traffic patterns—especially in scenarios where payload decryption is not available.
Figure 1: Traditional Approach
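To make this limitation concrete, the short Python sketch below (our illustration, not production DPI code) mimics how a rule-based engine identifies applications by matching literal strings and regular expressions against packet payloads. The signatures shown are hypothetical; with TLS, the payload carries no such plaintext patterns to match, so everything falls through to "unknown."

```python
import re

# Hypothetical plaintext application signatures, similar in spirit to
# Snort/Suricata content rules. Each maps a compiled regex to a label.
SIGNATURES = {
    "http": re.compile(rb"^(GET|POST|HEAD|PUT|DELETE) \S+ HTTP/1\.[01]"),
    "smtp": re.compile(rb"^(EHLO|HELO|MAIL FROM:)", re.IGNORECASE),
    "dns":  re.compile(rb"\x00\x01\x00\x00\x00\x01"),  # toy pattern, not a real DNS parser
}

def classify_payload(payload: bytes) -> str:
    """Return the first matching application label, or 'unknown'."""
    for app, pattern in SIGNATURES.items():
        if pattern.search(payload):
            return app
    # Encrypted traffic (e.g., TLS application data) rarely matches any
    # literal signature, so it falls through to 'unknown'.
    return "unknown"

print(classify_payload(b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n"))  # -> http
print(classify_payload(b"\x17\x03\x03\x00\x4a..."))                            # -> unknown (TLS record)
```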
Recent breakthroughs in Deep Learning (DL)—particularly Convolutional Neural Networks (CNNs) and Transformer-based architectures—have demonstrated remarkable success in extracting sophisticated protocol and session features directly from raw network data and metadata. These models sharply reduce dependence on brittle static rules, delivering greater flexibility and stronger classification performance. Large Language Models (LLMs) extend these capabilities further. Their ability to understand natural language enables dynamic, context-aware interpretation of network data, supporting adaptive policy generation and anomaly detection. LLMs enhance traffic analytics and classification, delivering improvements in challenging open-set scenarios and in metadata-only analysis environments where traditional approaches fall short.
Although the LLM ultimately performs advanced pattern recognition, its true value lies in its ability to understand context — how various traffic features interact over time. For example, it can distinguish that a specific sequence of packet sizes, when paired with certain timing patterns, corresponds to one application, while the same sizes with different timing imply a different one. Traditional approaches often analyze features like size, timing, and frequency in isolation. In contrast, our LLM captures the full temporal and relational structure of traffic behavior, considering how current activity relates to prior and subsequent events within the same session. This context-aware pattern analysis, coupled with the model’s ability to continuously adapt through dynamic retraining, leads to significantly more accurate classification of encrypted traffic and more effective identification of novel or suspicious patterns that conventional rule-based systems would overlook.
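As one way to picture this, the sketch below serializes per-packet directions, sizes, and inter-arrival times into a token sequence that a Transformer can attend over in temporal order. The encoding scheme and field names are illustrative assumptions, not the exact representation used in our models.

```python
# A minimal sketch (our illustration) of turning per-packet flow features
# into a text sequence an LLM can attend over.
def flow_to_text(packet_sizes, inter_arrival_ms, direction_flags):
    """Serialize one session's packet-level features in temporal order.

    Each packet becomes a token group like 'C>1420@0.3', i.e. direction,
    size in bytes, and inter-arrival time in milliseconds, so attention
    can relate packet sizes to their timing context.
    """
    tokens = []
    for size, gap, d in zip(packet_sizes, inter_arrival_ms, direction_flags):
        prefix = "C>" if d == 0 else "S>"  # client->server vs server->client
        tokens.append(f"{prefix}{size}@{gap:.1f}")
    return " ".join(tokens)

# Two flows with identical packet sizes but different timing produce
# different sequences -- exactly the context the model exploits.
flow_a = flow_to_text([571, 1420, 1420], [0.0, 0.3, 0.2], [0, 1, 1])
flow_b = flow_to_text([571, 1420, 1420], [0.0, 45.0, 60.0], [0, 1, 1])
print(flow_a)  # C>571@0.0 S>1420@0.3 S>1420@0.2
print(flow_b)  # C>571@0.0 S>1420@45.0 S>1420@60.0
```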
Despite this potential, LLM deployment has historically faced significant cost and performance barriers. Real-time classification over high-speed links such as 100 Gbps demands low-latency inference, a challenge for existing models that require significantly more compute than traditional methods [3,4].
To address performance and latency limitations—and to unlock the full potential of LLMs for network security—we focused our research on several key areas: quantization, batch pipeline improvements, Intel® Arc™ A770 GPU integration, and support for multiple LLM backends (e.g., GPT-2, ModernBERT). These efforts aim to deliver leading performance in encrypted traffic analysis scenarios.
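The sketch below illustrates two of these ideas under stated assumptions: it loads either GPT-2 or ModernBERT through Hugging Face sequence-classification heads (ModernBERT requires a recent transformers release) and applies PyTorch dynamic int8 quantization as one simple quantization path. The checkpoint names, 20-class label space, and quantization choice are illustrative; the Intel® Arc™ GPU offload and batch pipeline are covered in the upcoming posts.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Hypothetical backend registry; checkpoint names and the 20-class label
# space (e.g., USTC-TFC2016 categories) are illustrative assumptions.
BACKENDS = {
    "gpt2": "gpt2",
    "modernbert": "answerdotai/ModernBERT-base",
}

def load_classifier(backend: str, num_labels: int = 20, quantize: bool = True):
    """Load a sequence-classification head for the chosen LLM backend.

    If quantize=True, apply PyTorch dynamic int8 quantization to the Linear
    layers -- one simple CPU-side option; other paths (OpenVINO, Intel GPU
    offload) would replace this step.
    """
    name = BACKENDS[backend]
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=num_labels)
    if backend == "gpt2":
        # GPT-2 has no padding token by default; reuse EOS so batching works.
        tokenizer.pad_token = tokenizer.eos_token
        model.config.pad_token_id = tokenizer.eos_token_id
    model.eval()
    if quantize:
        model = torch.ao.quantization.quantize_dynamic(
            model, {torch.nn.Linear}, dtype=torch.qint8)
    return tokenizer, model

tokenizer, model = load_classifier("modernbert")
batch = tokenizer(["C>571@0.0 S>1420@0.3 S>1420@0.2"], return_tensors="pt", padding=True)
with torch.no_grad():
    pred = model(**batch).logits.argmax(dim=-1)
```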
Transforming Application Identification in Network Security with Deep Learning
Integrating Deep Learning into application identification is reshaping traditional approaches in network security. While packet capture remains largely unchanged—still relying on high-performance tools such as DPDK, AF_XDP, or Linux kernel libraries like libpcap—the protocol parsing and session reconstruction phases can be significantly improved with Transformer-based architectures.
Traditionally, systems like Snort and Suricata have used rule-based parsers and regex engines for protocol detection and application identification. DL models can now replace these components by inferring session and protocol metadata directly from raw packet sequences, eliminating the need for handcrafted rules and static pattern matching. Deep learning embeddings can take the place of traditional string matching, while Graph Neural Networks (GNNs) and Recurrent Neural Networks (RNNs) dynamically learn application signatures, offering far greater adaptability.
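For intuition, the following sketch shows how a learned byte-level model could stand in for literal string matching: raw packet bytes are embedded and classified by a small Transformer encoder, and the same forward pass works on opaque TLS bytes where no plaintext signature exists. The architecture and sizes are illustrative assumptions, not the specific models evaluated in this work.

```python
import torch
import torch.nn as nn

class BytePacketClassifier(nn.Module):
    """Illustrative byte-level model: embeds raw packet bytes and classifies
    with a small Transformer encoder instead of matching literal signatures."""
    def __init__(self, num_classes: int, max_len: int = 256, d_model: int = 64):
        super().__init__()
        self.embed = nn.Embedding(257, d_model)   # 256 byte values + padding id 256
        self.pos = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, num_classes)

    def forward(self, byte_ids: torch.Tensor) -> torch.Tensor:
        pos = torch.arange(byte_ids.size(1), device=byte_ids.device)
        x = self.embed(byte_ids) + self.pos(pos)
        x = self.encoder(x)
        return self.head(x.mean(dim=1))           # mean-pool over the packet

def packet_to_tensor(payload: bytes, max_len: int = 256) -> torch.Tensor:
    ids = list(payload[:max_len]) + [256] * max(0, max_len - len(payload))
    return torch.tensor([ids])

model = BytePacketClassifier(num_classes=20)
logits = model(packet_to_tensor(b"\x17\x03\x03\x00\x4a..."))  # works on opaque TLS bytes too
```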
Application identification engines can be fundamentally rearchitected using CNNs, RNNs, and Transformers that classify applications from raw packet data or flow-level features. However, before fully embracing this transition, it is critical to validate the accuracy of deep learning models, particularly for encrypted traffic classification, which is notoriously challenging.
This is where our optimization techniques come into play. By retraining LLMs on a dataset of over 102,000 network flows, our models achieve classification accuracy exceeding 95% across all tested models:
ModernBERT: 96.49%
Qwen2.5: 95.64%
GPT-2: 95.41%
LLaMA3: 95.38%
These results demonstrate that a wide range of LLMs can be effectively adapted for network traffic analysis. Furthermore, through a combination of optimization techniques, we achieved substantial throughput improvements.
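A condensed fine-tuning sketch is shown below, assuming flows have already been serialized to text as illustrated earlier and using the Hugging Face Trainer. The tiny in-line dataset, checkpoint, and hyperparameters are placeholders, not the exact recipe behind the accuracies reported above.

```python
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          DataCollatorWithPadding, Trainer, TrainingArguments)

# Illustrative: flow_texts / labels would come from the serialized flows
# and ground-truth application labels described above.
flow_texts = ["C>571@0.0 S>1420@0.3 S>1420@0.2", "C>120@0.0 S>64@45.0"]
labels = [3, 7]

tokenizer = AutoTokenizer.from_pretrained("answerdotai/ModernBERT-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "answerdotai/ModernBERT-base", num_labels=20)

def encode(batch):
    return tokenizer(batch["text"], truncation=True, max_length=256)

ds = Dataset.from_dict({"text": flow_texts, "label": labels}).map(encode, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="flow-clf", per_device_train_batch_size=32,
                           num_train_epochs=3, learning_rate=2e-5),
    train_dataset=ds,
    data_collator=DataCollatorWithPadding(tokenizer),
)
trainer.train()
```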
To support our model training and evaluation, we used two diverse and complementary datasets. The first is the USTC-TFC2016 dataset, which includes 3.71 GB of traffic traces spanning 10 malware families and 10 categories of benign applications such as peer-to-peer, email, gaming, and social networking. This dataset combines real malware traffic with professionally simulated normal traffic, offering a balanced foundation for both malicious and benign classification. The second dataset, sourced from IEEE Dataport, contains encrypted traffic collected from six widely used instant messaging applications—Teams, Discord, Messenger, Signal, Telegram, and WhatsApp—on Android devices. It also includes non-IMA traffic such as web browsing and video streaming. All traffic was captured in pcap format, with corresponding flow-level feature sets extracted for model input.
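As a rough illustration of that last step, the sketch below groups packets from a pcap by 5-tuple and collects per-flow packet sizes and inter-arrival times using Scapy. The keying and feature names are our assumptions, not the datasets' published extraction pipelines.

```python
from collections import defaultdict
from scapy.all import rdpcap, IP, TCP, UDP

def extract_flows(pcap_path: str):
    """Group packets by 5-tuple and collect sizes and inter-arrival times (ms)."""
    flows = defaultdict(list)
    for pkt in rdpcap(pcap_path):
        if IP not in pkt:
            continue
        l4 = TCP if TCP in pkt else UDP if UDP in pkt else None
        if l4 is None:
            continue
        key = (pkt[IP].src, pkt[IP].dst, pkt[l4].sport, pkt[l4].dport, l4.__name__)
        flows[key].append((float(pkt.time), len(pkt)))
    features = {}
    for key, pkts in flows.items():
        pkts.sort()
        times = [t for t, _ in pkts]
        sizes = [s for _, s in pkts]
        gaps = [0.0] + [(t2 - t1) * 1000 for t1, t2 in zip(times, times[1:])]
        features[key] = {"sizes": sizes, "inter_arrival_ms": gaps}
    return features

# flows = extract_flows("capture.pcap")  # e.g., a trace from USTC-TFC2016
```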
This initial blog serves as an introduction to encrypted traffic analysis. In upcoming posts, we’ll dive deeper into implementation details and performance optimizations—covering topics such as quantization, batch processing techniques, and efficient utilization of Intel GPUs for encrypted traffic classification. We’ll also provide code samples and scripts to help you apply these techniques and accelerate your own encrypted traffic analysis workflows.
References
[1] Google. HTTPS encryption on the web. Google Transparency Report. Available at: https://transparencyreport.google.com/https/overview (Accessed June 26, 2025).
[2] Zscaler. Zscaler ThreatLabz 2024 Encrypted Attacks Report. Available at: https://www.zscaler.com/blogs/security-research/threatlabz-report-threats-delivered-over-encrypted-channels
[3] Cisco. Annual Security Report (2023). Available at: https://www.cisco.com/c/en/us/products/security/security-reports.html
[4] Cloudflare. Cloudflare Radar (2023). Available at: https://radar.cloudflare.com