Deploying Multi-Agent AI for ITSM: A Practical Framework with LUMOS

By Sam Qikaka

Category: Models & Releases

Learn a step-by-step framework for deploying a multi-agent AI system using LUMOS to automate IT service management (ITSM) operations, including agent role design, SLA-based escalation, and integration with ServiceNow or Jira Service Management.

Introduction

IT service management (ITSM) teams face mounting pressure to resolve incidents faster, fulfill requests accurately, and keep knowledge bases current, all while maintaining compliance and audit trails. Traditional ITIL processes rely heavily on manual handoffs, which slow down response times and increase operational costs. Multi-agent AI systems offer a way to automate these workflows by assigning specialized AI agents to distinct tasks, coordinating their actions, and escalating to humans when necessary.

LUMOS provides a multi-agent platform purpose-built for enterprise AI adoption, supporting Retrieval-Augmented Generation (RAG) and autonomous agents. This article presents a step-by-step framework for deploying a multi-agent AI system using LUMOS to automate ITSM operations, designed for B2B operations leaders evaluating AI for IT.

Step 1: Define Agent Roles for ITSM Workflows

Start by mapping ITSM processes to discrete agent roles. LUMOS allows you to create and orchestrate specialized agents that share context but have distinct responsibilities. For a typical mid-market enterprise (500–1,000 employees, handling 200–500 tickets per week), consider these core roles:

Incident Triage Agent

- Purpose: Automatically categorize, prioritize, and assign incoming incidents based on impact, urgency, and historical patterns.
- Inputs: Ticket title, description, user details, system logs (via API).
- Outputs: Priority label (P1–P5), category (e.g., network, application, hardware), suggested assignee or team.
- Technique: Uses a fine-tuned classifier or a RAG query against past incident data.

Request Fulfillment Agent

- Purpose: Handle standard service requests (password resets, access provisioning, software installation) without human intervention.
- Inputs: Request form fields, user identity, service catalog.
- Outputs: Fulfillment steps (e.g., run a script, update directory, send confirmation).
- Technique: Executes predefined workflows or calls external APIs (Active Directory, cloud consoles).

Knowledge Base Retrieval Agent

- Purpose: Find relevant articles, runbooks, or known errors to assist agents and end users.
- Inputs: Natural language query from a ticket or chat.
- Outputs: Ranked list of knowledge base (KB) articles with relevance scores.
- Technique: RAG with a vector index of KB content, updated in near real time.

These agents operate under a coordinating agent that routes tasks and manages conversation context. Each LUMOS agent can be configured with its own LLM, temperature, and system prompt.

Step 2: Design Human-in-the-Loop Escalation Policies Tied to SLA Thresholds

Automation must respect service-level agreements (SLAs). In ITSM, certain actions require human approval, especially for P1 incidents, security breaches, or non-standard requests. LUMOS supports escalation policies that trigger when an agent cannot resolve a ticket within a predefined time window or when confidence is low.

Define SLA Tiers

Priority   Target Resolution   Escalation Trigger
---------- ------------------- ------------------------------------------------
P1         1 hour              No resolution in 15 minutes: notify on-call team
P2         4 hours             Agent confidence < 0.8: request human review
P3         8 hours             Agent unable to categorize: route to L2 team
P4         24 hours            User requests human contact: escalate

(Adjust based on your organization’s SLAs.)

Implement Escalation in LUMOS

- Configure each agent to emit a “request escalation” event when thresholds are breached.
- Create a human escalation queue (e.g., a Slack channel or ServiceNow assignment group) that receives ticket summaries and agent reasoning.
- Set fallback actions: If the escalation token expires (no human response within 10 minutes), the agent can try alternative resolution paths or reassign the ticket.

Human-in-the-loop review supports compliance and builds trust in the AI system while still allowing full automation for low-risk tasks.

Step 3: Integrate with ServiceNow or Jira Service Management

LUMOS connects to existing ITSM tools via REST APIs and webhooks. For a mid-market enterprise using ServiceNow or Jira Service Management, follow this integration pattern:

ServiceNow Integration

- Webhook Inbound: ServiceNow sends new ticket events (incident, request, problem) to a LUMOS webhook endpoint.
- Agent Processing: The orchestrator agent assigns the ticket to the appropriate LUMOS agent.
- Actions Back to ServiceNow: The agent updates the ticket (status, assignment, priority) and adds work notes.
- Knowledge Sync: LUMOS periodically pulls updated KB articles from ServiceNow to refresh its RAG index.

Jira Service Management Integration

- Same pattern using Jira’s REST API and project automation rules.
- Example: Create a Jira automation rule that calls a LUMOS API on issue creation, then updates the issue with the agent
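As a rough illustration of the SLA tiers and escalation triggers from Step 2, the decision logic can be sketched as a small, platform-independent Python function. This is not the LUMOS API: the `TicketState` class, the `escalation_action` function, and the action names are hypothetical, and the thresholds simply mirror the example table, so adjust them to your own SLAs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TicketState:
    """Hypothetical snapshot of a ticket as seen by an agent."""
    priority: str                        # "P1" .. "P4", as in the example SLA table
    minutes_open: float                  # time since the ticket was created
    resolved: bool = False
    confidence: Optional[float] = None   # agent's self-reported confidence, if any
    category: Optional[str] = None       # None means the agent could not categorize
    user_requested_human: bool = False


def escalation_action(ticket: TicketState) -> Optional[str]:
    """Return a (hypothetical) escalation action name, or None to let the agent continue."""
    if ticket.resolved:
        return None
    # P1: no resolution within 15 minutes -> notify the on-call team
    if ticket.priority == "P1" and ticket.minutes_open >= 15:
        return "notify_on_call"
    # P2: agent confidence below 0.8 -> request human review
    if (ticket.priority == "P2"
            and ticket.confidence is not None
            and ticket.confidence < 0.8):
        return "request_human_review"
    # P3: agent unable to categorize -> route to the L2 team
    if ticket.priority == "P3" and ticket.category is None:
        return "route_to_l2"
    # P4: user explicitly asks for a human -> escalate
    if ticket.priority == "P4" and ticket.user_requested_human:
        return "escalate_to_human"
    return None
```

In a real deployment, the returned action would map to whatever escalation event and queue the platform provides; keeping the policy in one pure function makes the thresholds easy to review and unit-test before wiring them into agents.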