If you're looking to build AI agents into your workflows, don't waste the valuable compute power of large language models on these systems.
That's the opinion of a group of Nvidia researchers, who recently made the case for "small language models," or SLMs, noting that while LLMs have been the engines of generative AI up until now, they're probably overkill for supporting more focused AI agents. Instead, SLMs may present a smarter approach.
A surge in agentic AI systems will bring a host of applications that use language models to carry out a few specialized tasks over and over, without much variation, the Nvidia team, led by Peter Belcak, said in the report.
SLMs "are sufficiently powerful, inherently more suitable, and necessarily more economical for many invocations in agentic systems," the report said. They could play an important role in the future of agentic AI.
In situations "where general-purpose conversational abilities are essential, heterogeneous agentic systems -- agents invoking multiple different models -- are the natural choice," the researchers continued.
SLMs could also be instrumental in lowering AI costs. Using LLMs for AI agents can be expensive, and their broad, general-purpose capabilities often exceed what agentic use cases functionally require.
"Insisting on LLMs for all such tasks reflects a misallocation of computational resources -one that is economically inefficient and environmentally unsustainable at scale," the report said.
In many current deployments, AI agents communicate with chosen LLM API endpoints by making requests to centralized cloud infrastructure that hosts these models, the report said. Such LLM API endpoints "are specifically designed to serve a large volume of diverse requests using one generalist LLM."
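For illustration, here is a minimal sketch of that prevailing pattern, assuming an OpenAI-style chat-completions endpoint; the URL, API key, and model name are placeholders, not details from the report:

```python
import requests

# Hypothetical OpenAI-style chat-completions endpoint; URL, key, and
# model name below are placeholders, not taken from the Nvidia report.
API_URL = "https://api.example.com/v1/chat/completions"
API_KEY = "sk-..."  # placeholder credential

def call_llm(prompt: str) -> str:
    """Send a single agent request to a centralized, generalist LLM."""
    response = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={
            "model": "generalist-llm",  # one large model serves every request
            "messages": [{"role": "user", "content": prompt}],
        },
        timeout=30,
    )
    response.raise_for_status()
    return response.json()["choices"][0]["message"]["content"]
```

Every agent task, no matter how narrow or repetitive, flows through the same large hosted model in this setup.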
This LLM-based operational model is deeply ingrained. And there's a money angle at work, too. The report estimated a $63 billion market for LLM API and hosting cloud infrastructure.
"It is assumed that this operational model will remain the cornerstone of the industry without any substantial alterations, and that the large initial investment will deliver returns comparable to traditional software and internet solutions within three to four years," the report said.
As organizations roll out AI agents across a broad range of functions, they will recognize that LLMs are too much for these systems, said Virginia Dignum, professor of responsible AI at Umea University and chair of the ACM Technology Policy Council, in a separate recent discussion. In most cases, "the idea proposed as agentic AI consists of building an active interface on top of a large language model," she said.
There are issues with this view of agentic AI built on LLMs. First, such agents could be wasteful.
"LLMs are trained over huge amounts of data and computation to be able to deal with broad language issues. An agent ... is usually meant to deal with specific questions. You don't expect your realtor to discuss philosophy, or your travel agent to be able to produce art," she said. "I see a potential huge waste of data and compute to build such agents on top of LLMs."
Multi-agent collaboration, in Dignum's view, is the most effective route to getting results from agentic AI. "What is key is applications based on collaboration between many smaller agents that use less data and training, but can achieve more by combining with other agents," Dignum explained. "A distributed approach -- less computationally heavy, more inclusive, and more able to address the difference between contexts and cultures."
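As a rough illustration of that distributed pattern, the sketch below chains several small, specialized agents through a simple coordinator. The roles, models, and helper functions are hypothetical, not drawn from Dignum's remarks:

```python
from typing import Callable

def make_agent(role: str) -> Callable[[str], str]:
    """Each agent wraps a small model specialized for one narrow role."""
    def agent(task: str) -> str:
        # Stand-in for an SLM inference call scoped to this role.
        return f"[{role}] {task}"
    return agent

# Illustrative roles only; a real system would define its own.
extract = make_agent("extract-fields")
plan = make_agent("plan-steps")
execute = make_agent("execute-step")

def run_pipeline(task: str) -> str:
    # The coordinator combines narrow agents instead of relying on
    # one large generalist model for the whole workflow.
    return execute(plan(extract(task)))

print(run_pipeline("book a flight to Oslo next Tuesday"))
```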
The Nvidia team offered the following suggestions for deploying SLMs:
Consider costs: "Organizations should consider adopting small language models for agentic applications to reduce latency, energy consumption, and infrastructure costs, particularly in scenarios where real-time or on-device inference is required," they stated.
Consider modular design: "Leverage SLMs for routine, narrow tasks and reserve LLMs for more complex reasoning -- thereby improving efficiency and maintainability." (A minimal routing sketch of this idea follows the list.)
Consider specialization: "Take advantage of the agility of SLMs by fine-tuning them for specific tasks, enabling faster iteration cycles and easier adaptation to evolving use cases and requirements."
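Here is a minimal sketch of the modular-design suggestion above, assuming a hypothetical call_model() helper and made-up model names: routine tasks go to a fine-tuned SLM, and everything else falls back to a generalist LLM.

```python
# Routine, narrow tasks that a fine-tuned SLM handles well (illustrative).
ROUTINE_TASKS = {"extract_date", "classify_intent", "fill_form_field"}

def call_model(model: str, prompt: str) -> str:
    """Stand-in for a real inference call (local SLM or hosted LLM API)."""
    return f"[{model}] response to: {prompt}"

def route(task: str, prompt: str) -> str:
    if task in ROUTINE_TASKS:
        # Narrow, repetitive work goes to a cheap, fast, specialized SLM.
        return call_model(f"slm-finetuned-{task}", prompt)
    # Open-ended reasoning falls back to the generalist LLM.
    return call_model("generalist-llm", prompt)

print(route("extract_date", "Invoice due by March 3, 2026"))
print(route("draft_negotiation_email", "Ask the vendor for a 10% discount"))
```

The routing rule here is a simple lookup; a production system would decide based on task complexity, confidence, or cost budgets.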
SLMs offer a number of advantages over LLMs for agents, the Nvidia team explained, including lower latency and reduced memory and compute requirements. Plus, they can be cheaper while still getting the job done.