Multi-Agent Retail Inventory Optimization: A 10-Store Pilot That Cut Stockouts 24% and Costs 18%

By Sam Qikaka

Category: Agents & Architecture

Discover how a three-agent architecture on AWS Bedrock—using Qwen 3.8 Max for demand forecasting and Llama 5 for supplier negotiation simulation—delivered measurable ROI in a real retail pilot.

Introduction: Why Retail Needs Multi-Agent AI Now

Supply chains remain the Achilles' heel of retail operations. Despite advances in forecasting and ERP systems, stockouts and overstock continue to bleed margins. Stockouts cost sales and customer trust; overstock ties up capital in dead inventory. Traditional single-agent AI models often fall short because they tackle only one facet of the problem: a forecasting model may predict demand, but it cannot negotiate with suppliers or balance trade-offs in real time.

Enter multi-agent architectures. By decomposing inventory decisions into specialized agents, each with its own model, data, and objective, and orchestrating their outputs, enterprises can achieve a holistic optimization that no single AI can match. The pilot described here, run on AWS Bedrock, demonstrates that this approach is not just theoretical: it delivers double-digit
improvements in both stockout reduction and carrying cost savings.

Architecture Overview: The Three-Agent Model for Retail Inventory Optimization

At the heart of the pilot lies a three-agent architecture: three specialized agents working in concert, with one acting as orchestrator. The architecture is built on AWS Bedrock's AgentCore, which provides native multi-agent collaboration capabilities. Each agent is assigned a distinct role:

Demand Forecasting Agent (powered by Qwen 3.8 Max) – predicts near-term demand for each SKU at each store.
Supplier Negotiation Agent (powered by Llama 5) – simulates supplier offers based on cost, lead time, and minimum order quantities.
Orchestration Agent – receives forecasts and simulated offers, then decides replenishment orders and allocations.

Agents communicate via structured messages (JSON payloads) through Bedrock's AgentCore runtime. The orchestration agent acts as the decision-making hub, resolving conflicts between optimal inventory positions and supplier constraints. This three-agent architecture is designed to be modular: retailers can swap models or add agents (e.g., for logistics or pricing) without redesigning the core system.

Agent 1: Demand Forecasting with Qwen 3.8 Max

The demand forecasting agent is the first line of defense against stockouts. Qwen 3.8 Max, a large language model available on AWS Bedrock, is fine-tuned to process historical sales data, seasonality, promotions, and external factors like weather events. In the pilot, the agent ingested daily point-of-sale data from the 10 stores and generated 14-day rolling forecasts per SKU.

Why Qwen 3.8 Max? Its efficiency in handling multi-modal inputs (tabular data plus text) and its strong performance on time-series reasoning made
it ideal for this role. The model's low inference latency allowed the agent to re-forecast every 6 hours, keeping pace with real-time sales fluctuations. This contributed directly to the 24% reduction in stockouts: the forecasting agent could flag potential shortages before they happened, giving the orchestration agent time to adjust orders.

Agent 2: Supplier Negotiation Simulation with Llama 5

The supplier negotiation agent tackles the cost side. Llama 5, also available on AWS Bedrock, is used to simulate negotiation outcomes with multiple suppliers. Given forecast demand and current inventory, the agent generates a range of possible supplier offers, varying price breaks, lead times, and minimum order quantities, and ranks them by total cost (including carrying cost). This simulation does not replace actual negotiations; rather, it gives the orchestration agent a probabilistic view of what is possible.

For example, if the forecasting agent flags a potential shortage of 200 units of a fast-moving SKU, the negotiation agent might output: “Supplier A: 200 units, 2-day lead time, $0.50 per-unit discount if the order is 300 units. Supplier B: 150 units, 5-day lead time, no discount.” The orchestration agent can then weigh the trade-off between a larger order (lower per-unit cost but higher carrying cost) and a smaller, faster order. The pilot found that this simulation alone reduced procurement costs by an average of 6% beyond the baseline, contributing to the overall 18% reduction in inventory carrying costs.

Agent 3: Orchestration for Replenishment Decisions

The orchestration agent ties everything together. It receives the demand forecast from Agent 1 and the simulated supplier offers from Agent 2, along with current inventory levels and carrying cost parameters. Its objective is to minimize a combined function of stockout probability and total inventory cost. This agent does not use a separate model; it leverages Bedrock's built-in reasoning capabilities (using a lightweight model like Claude Haiku or Amazon Nova Micro) to ru