Multi-Agent Travel Itinerary System on AWS Bedrock: A 3-Agent Architecture with Llama 4, Qwen 3.7 Max, and a Fine-Tuned Scheduler

By Sam Qikaka

Category: Agents & Architecture

Learn how a mid-size tour operator deployed a three-agent travel itinerary system on AWS Bedrock, using Llama 4 for intent parsing, Qwen 3.7 Max for POI recommendations, and a fine-tuned scheduling agent for real-time availability checks. The pilot achieved 55% faster itinerary planning and 22% higher booking conversion.

Last updated: May 23, 2026 (UTC)

Travel companies are increasingly adopting multi-agent architectures to overcome the limitations of monolithic or rule-based itinerary builders. As of May 2026, the landscape is evolving rapidly: large language models (LLMs) such as Meta's Llama 4 and Alibaba Cloud's Qwen 3.7 Max are mature enough for production deployment on managed services like AWS Bedrock. Recent academic research, including HiMAP-Travel (arXiv:2603.04750) and Vaiage (arXiv:2505.10922), validates the effectiveness of multi-agent coordination for complex travel planning tasks. This article details a vendor-neutral three-agent architecture, derived from a pilot program with a mid-size tour operator, that cut itinerary planning time by 55% and raised booking conversion rates by 22%.

Why Travel Companies Are Moving to Multi-Agent Architectures

Single-agent or rule-based itinerary builders often falter on the combinatorial complexity of real-world travel planning. That complexity comes from the many constraints involved (budget, traveler preferences, timing), combined with the need to account for real-time availability and dynamic pricing. A 2025 IDC report on AI adoption in the travel sector indicated that 68% of tour operators found that existing solutions either required extensive manual input or produced generic itineraries that failed to convert bookings. Multi-agent systems address these challenges by breaking the problem into specialized sub-tasks, each handled by a dedicated agent, so that every role can use the model best suited to it.

Overview of the Three-Agent System on AWS Bedrock

The implemented architecture comprises three distinct agents, orchestrated through AWS Bedrock Agents:

- Agent 1: User Intent Parsing. Uses Llama 4 to extract structured constraints from natural language user input.
- Agent 2: Point-of-Interest (POI) Recommendation. Uses Qwen 3.7 Max to suggest diverse and contextually relevant places and activities.
- Agent 3: Fine-Tuned Scheduling. Uses a scheduling LLM, fine-tuned on historical booking data, to manage real-time availability checks via external APIs.

The system operates on an event-driven data flow: Agent 1 outputs a structured intent schema, Agent 2 enriches it with POI candidates, and Agent 3 optimizes the sequence of activities and confirms availability. All agents communicate through Amazon Bedrock's integrated orchestration layer, with results stored in a shared state container.

Agent 1 – User Intent Parsing with Llama 4

Meta's Llama 4, an open-weight model, demonstrates strong capabilities in natural language understanding and extraction. Within this system, it is prompted to parse user queries into a structured JSON schema whose fields include a budget (per person or total), a travel style (e.g., luxury, adventure, cultural), and must-have requests (e.g., "must visit the Louvre"), among others. The prompting strategy uses a few-shot template populated with examples from previous bookings. Both the 8B and 70B variants of Llama 4 were evaluated; the 70B version achieved an intent extraction accuracy of 94% on a held-out test set of 500 real customer queries. The model card published by Meta in 2026 confirms its strong performance on structured output tasks.

Agent 2 – Point-of-Interest Recommendation with Qwen 3.7 Max

Alibaba Cloud's Qwen 3.7 Max, its latest flagship model, excels in recommendation tasks thanks to its extensive knowledge base and its proficiency in instruction-following for structured list generation. This agent processes the parsed intent and queries a combined knowledge graph (integrating Wikidata, OpenStreetMap, and the tour operator's curated inventory) to produce a ranked list of POIs. Key features leveraged include:

- Diversity-aware ranking: Ensures recommendations are not overly concentrated in a single area or category.
- Seasonal and event awareness: Qwen 3.7 Max's training data incorporates temporal patterns and event information.
- Budget compliance: Each POI is tagged with an estimated cost range so that recommendations align with the user's budget.

In offline evaluations, Qwen 3.7 Max achieved a relevance precision@10 of 91%, surpassing the previous baseline model (GPT-4o), which scored 84%. The model release notes from Alibaba Cloud (April 2026) highlight its strengths in retrieval-augmented generation and structured output generation.

Agent 3 – Fine-Tuned Scheduling with Real-Time Availability

The scheduling agent is an LLM fine-tuned from a smaller open-weight model (Mistral-7B) and is designed to integrate real-time availability checks. The fine-tuning process used 50,000 historical booking records from the tour operator, employing LoRA (Low-Rank Adaptation) to enh