Intel® Xeon® Processors Set the Standard for Vector Search Benchmark Performance

In real-world vector search performance tests, Intel® Xeon® server architectures outperform AMD EPYC processors when running two commonly used vector search frameworks.
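Though the two frameworks differ in indexing structures and distance kernels, the core operation they both benchmark is the same: score a query embedding against stored vectors and return the nearest neighbors. A toy, framework-agnostic sketch of that operation in pure Python (exhaustive scan with cosine similarity; production engines use approximate-nearest-neighbor indexes instead):

```python
import math

def cosine(a, b):
    # dot(a, b) / (|a| * |b|); assumes non-zero vectors
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def top_k(query, corpus, k=2):
    # Exhaustive scan: score every stored vector, keep the k best indices.
    scored = sorted(
        ((cosine(query, v), i) for i, v in enumerate(corpus)),
        reverse=True,
    )
    return [i for _, i in scored[:k]]

# Three illustrative 3-d embeddings.
corpus = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
]
print(top_k([1.0, 0.05, 0.0], corpus))  # → [0, 1]
```

The benchmarks in the post measure exactly this kind of scoring loop at scale, where per-core vector throughput and memory bandwidth dominate.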

From Gold Rush to Factory: How to Think About TCO for Enterprise AI

Less Gold Rush and more Boring Factory – The evolving AI mindset.

A Practical Guide to CPU-Optimized LLM Deployment on Intel® Xeon® 6 Processors on AWS

Deploying large language models no longer requires expensive GPUs or complex infrastructure. In this guide, we show how Intel® Xeon® 6 processors paired with vLLM deliver high‑throughput, production‑ready LLM inference entirely on CPUs. Learn how to launch a scalable, OpenAI‑compatible endpoint on AWS Marketplace – complete with NUMA‑aware parallelism, BF16 acceleration, chunked prefill, and optimized KV‑cache performance – so you can run enterprise‑grade LLM workloads at a fraction of traditional GPU costs.
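A launch along the lines described above can be sketched as a shell fragment. This is a deployment sketch, not the guide's exact commands: the model name, core range, and cache size are placeholders to tune for your instance, and the environment variables are vLLM CPU-backend knobs that may vary between versions.

```shell
# Reserve space (in GiB) for the KV cache on CPU (placeholder value).
export VLLM_CPU_KVCACHE_SPACE=40
# Bind OpenMP worker threads to the cores of one NUMA node (placeholder range).
export VLLM_CPU_OMP_THREADS_BIND=0-31

# Start an OpenAI-compatible endpoint with BF16 weights and chunked prefill.
# Model name is illustrative; any vLLM-supported model works.
vllm serve meta-llama/Llama-3.1-8B-Instruct \
    --dtype bfloat16 \
    --enable-chunked-prefill
```

Once up, the endpoint accepts standard OpenAI-style `/v1/chat/completions` requests, so existing client code can point at it unchanged.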

Bringing Polish AI to Life: Running Bielik LLMs Natively on Intel® Gaudi® 3 Accelerators

From community curiosity to real-world inference: how local language models run with day-zero Intel hardware support.

Optimizing SLMs on Intel® Xeon® Processors: A llama.cpp Performance Study

In this post, we’ll discuss how to run responsive, CPU-only applications using a quantized SLM in the GPT-Generated Unified Format (GGUF).
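As an illustration of the kind of run measured in such a study, llama.cpp's CLI loads a quantized GGUF file directly and generates on CPU. The model path and thread count below are hypothetical; `-t` should match the physical core count of your Xeon socket.

```shell
# Run a quantized SLM from a GGUF file entirely on CPU.
#   -m  path to the quantized model (placeholder path)
#   -t  number of CPU threads (placeholder; match physical cores)
#   -n  maximum number of tokens to generate
#   -p  the prompt
./llama-cli \
    -m models/slm-q4_k_m.gguf \
    -t 32 \
    -n 128 \
    -p "Summarize the following paragraph: ..."
```

Lower-bit quantizations (e.g. Q4_K_M) shrink the memory footprint and raise tokens-per-second on CPU at a modest quality cost, which is the trade-off the study explores.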

Intel® Xeon® 6 Processors: The Smart Total Cost of Ownership Choice

The latest Intel® Xeon® 6 processors deliver performance advantages across key enterprise workloads, enabling companies to deploy fewer servers while delivering aggregate performance comparable to AMD EPYC solutions.

Next-Gen AI Inference: Intel® Xeon® Processors Power Vision, NLP, and Recommender Workloads

Intel® Xeon® processors can deliver a CPU-first platform built for modern AI workloads without added complexity or overhead.

Document Summarization: Transforming Enterprise Content with Intel® AI for Enterprise RAG

Transform enterprise documents into insights with Document Summarization, optimized for Intel® Xeon® and Intel® Gaudi® with automated NUMA-aware scheduling.

AutoRound Meets SGLang: Enabling Quantized Model Inference with AutoRound

We are thrilled to announce an official collaboration between SGLang and AutoRound, enabling low-bit quantization for efficient LLM inference.
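AutoRound itself goes beyond naive rounding (it tunes weight rounding with signed gradient descent), but the baseline it improves on, group-wise round-to-nearest low-bit quantization, can be sketched in a few lines of plain Python. Function names and the group size are illustrative, not AutoRound's API.

```python
def quantize_group(weights, bits=4):
    # Asymmetric round-to-nearest quantization of one weight group:
    # map [min, max] onto the integer range [0, 2^bits - 1].
    qmax = (1 << bits) - 1
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / qmax or 1.0  # avoid zero scale for constant groups
    q = [round((w - lo) / scale) for w in weights]
    return q, scale, lo

def dequantize_group(q, scale, lo):
    # Reconstruct approximate float weights from integers + scale + offset.
    return [v * scale + lo for v in q]

group = [0.12, -0.40, 0.33, 0.05]       # one illustrative weight group
q, scale, lo = quantize_group(group)
print(dequantize_group(q, scale, lo))   # close to the original values
```

The integration means models quantized this way (with AutoRound's learned refinements) can be served directly by SGLang's inference engine.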

In-production AI Optimization Guide for Xeon: Search and Recommendation Use Case

In this guide, you’ll learn multiple aspects of optimizing a search-and-recommendation model deployed in production on Intel® Xeon® CPU servers.
