Reduce Downtime Up To 50% by Utilizing AI-Ready RAS Features of Intel® Xeon® Processors

As generative and agentic AI use cases proliferate across nearly every industry, improving the reliability, availability, and serviceability (RAS) of AI clusters is becoming increasingly important. Intel® Xeon® 6 processors offer an impressive set of RAS features that can help improve the stability and performance of AI computing clusters. Intel’s collaboration with the Internet technology company ByteDance demonstrated that using the RAS features of Intel Xeon CPUs reduced server downtime by up to 50%.

Posted in Uncategorized | Comments closed on Reduce Downtime Up To 50% by Utilizing AI-Ready RAS Features of Intel® Xeon® Processors

How to Fine-Tune an LLM on Intel® GPUs With Unsloth

Fine-tuning an LLM doesn’t have to require massive infrastructure. With Unsloth now supporting Intel® GPUs, developers can efficiently customize models like Llama 3 and Qwen across Intel Core Ultra–based AI PCs, Intel Arc graphics, and the Intel Data Center GPU Max Series.

This blog walks through key techniques like SFT, PEFT, and RLHF—and shows how Intel-optimized libraries such as oneDNN and Triton accelerate training while reducing memory use. Build faster, smarter, and more personalized AI—all within the Intel ecosystem.
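The PEFT technique mentioned above is most commonly realized as LoRA: the large base weight matrix stays frozen, and only two small low-rank matrices are trained. The sketch below is a minimal, dependency-free illustration of that idea (not Unsloth's actual implementation); all names and toy values are illustrative.

```python
# Minimal sketch of the LoRA idea behind PEFT: instead of updating a full
# weight matrix W (d_out x d_in), train two small matrices
# B (d_out x r) and A (r x d_in) and add their scaled product to W.

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    rows, inner, cols = len(X), len(Y), len(Y[0])
    return [[sum(X[i][k] * Y[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]

def lora_effective_weight(W, A, B, alpha, r):
    """Return W + (alpha / r) * (B @ A), the merged LoRA weight."""
    delta = matmul(B, A)
    scale = alpha / r
    return [[W[i][j] + scale * delta[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]

# Toy example: d_out = d_in = 2, rank r = 1.
W = [[1.0, 0.0], [0.0, 1.0]]   # frozen base weight
B = [[1.0], [0.0]]             # d_out x r, trainable
A = [[0.0, 2.0]]               # r x d_in, trainable
merged = lora_effective_weight(W, A, B, alpha=1.0, r=1)
# merged == [[1.0, 2.0], [0.0, 1.0]]
```

Because only B and A are trained, the number of trainable parameters drops from d_out × d_in to r × (d_out + d_in), which is what makes fine-tuning feasible on memory-constrained hardware.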

Intel® Xeon® Processors Set the Standard for Vector Search Benchmark Performance

In real-world vector search performance tests, Intel® Xeon® server architectures outperform AMD EPYC processors when running two commonly used vector search frameworks.
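At its core, the operation these frameworks accelerate is similarity search: ranking stored embedding vectors by their closeness to a query vector. The snippet below is an illustrative brute-force version in pure Python (real frameworks use SIMD-optimized kernels and approximate indexes); the document ids and vectors are made up.

```python
# Brute-force vector search: rank stored vectors by cosine similarity
# to a query and return the top-k ids. Written for clarity, not speed.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def search(index, query, k=2):
    """Return the ids of the k vectors most similar to the query."""
    scored = sorted(index.items(),
                    key=lambda kv: cosine(kv[1], query),
                    reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]

index = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.7, 0.7]}
top = search(index, [1.0, 0.1], k=2)
# top == ["a", "c"]
```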

From Gold Rush to Factory: How to Think About TCO for Enterprise AI

Less gold rush, more boring factory: how the enterprise AI mindset is evolving.

A Practical Guide to CPU-Optimized LLM Deployment on Intel® Xeon® 6 Processors on AWS

Deploying large language models no longer requires expensive GPUs or complex infrastructure. In this guide, we show how Intel® Xeon® 6 processors paired with vLLM deliver high‑throughput, production‑ready LLM inference entirely on CPUs. Learn how to launch a scalable, OpenAI‑compatible endpoint on AWS Marketplace – complete with NUMA‑aware parallelism, BF16 acceleration, chunked prefill, and optimized KV‑cache performance – so you can run enterprise‑grade LLM workloads at a fraction of traditional GPU costs.
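A launch along the lines described above can be sketched as follows. This is an illustrative command fragment, not the guide's exact recipe: the model name, thread range, and cache size are placeholders to tune for your instance, and the environment variables shown are vLLM's CPU-backend settings.

```shell
# Placeholder values -- adjust to your instance topology and model.
export VLLM_CPU_KVCACHE_SPACE=40        # GiB reserved for the KV cache
export VLLM_CPU_OMP_THREADS_BIND=0-31   # pin OpenMP threads for NUMA locality

# Serve the model in BF16 with chunked prefill enabled.
vllm serve meta-llama/Llama-3.1-8B-Instruct \
    --dtype bfloat16 \
    --enable-chunked-prefill

# Query the OpenAI-compatible endpoint once the server is up:
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "meta-llama/Llama-3.1-8B-Instruct",
       "messages": [{"role": "user", "content": "Hello"}]}'
```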
Bringing Polish AI to Life: Running Bielik LLMs Natively on Intel® Gaudi® 3 Accelerators

From community curiosity to real-world inference – showing how local language models run with day-zero Intel hardware support.

Optimizing SLMs on Intel® Xeon® Processors: A llama.cpp Performance Study

In this post, we’ll discuss how to run responsive, CPU-only applications using a quantized small language model (SLM) in the GPT-Generated Unified Format (GGUF).
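As a rough sketch of the workflow, llama.cpp ships tools to quantize a GGUF model and run it entirely on CPU; the model filename and thread count below are placeholders, not values from the study.

```shell
# Quantize an FP16 GGUF model down to 4-bit (Q4_K_M scheme).
./llama-quantize phi-3-mini-f16.gguf phi-3-mini-Q4_K_M.gguf Q4_K_M

# Run CPU-only inference on the quantized model with 16 threads,
# generating up to 128 tokens.
./llama-cli -m phi-3-mini-Q4_K_M.gguf \
    -p "Summarize the benefits of CPU-only inference." \
    -n 128 -t 16
```

Quantization shrinks both model size and memory bandwidth per token, which is typically the limiting factor for CPU inference latency.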

Intel® Xeon® 6 Processors: The Smart Total Cost of Ownership Choice

The latest Intel® Xeon® 6 processors deliver performance advantages across key enterprise workloads, enabling companies to deploy fewer servers and still deliver a similar aggregate performance level compared to AMD EPYC solutions.

Next-Gen AI Inference: Intel® Xeon® Processors Power Vision, NLP, and Recommender Workloads

Intel® Xeon® processors can deliver a CPU-first platform built for modern AI workloads without added complexity or overhead.

Document Summarization: Transforming Enterprise Content with Intel® AI for Enterprise RAG

Transform enterprise documents into insights with Document Summarization, optimized for Intel® Xeon® and Intel® Gaudi® with automated NUMA-aware scheduling.
