AI-Powered RCA Tool for Juspay
At Juspay, I designed an innovative AI-powered tool that streamlines Root Cause Analysis (RCA) for orders across merchants, making problem-solving faster and smarter.
product design
Problem Statement
At Juspay, the merchant success team often handles tickets from merchants seeking clarity on issues with certain orders. As a payment orchestrator, Juspay facilitates transactions between merchants and Payment Gateways (PGs), and an order can run into trouble for several reasons:
Errors at the Payment Gateway's end.
Issues on the merchant's side.
Problems within Juspay's orchestration process.
Resolving these issues requires the team to:
Extract order IDs from merchant tickets.
Search logs across multiple systems (e.g., Kibana, internal dashboards).
Analyze API payloads to identify the root cause.
With information scattered across disparate systems, the process is tedious and time-intensive, impacting the productivity of the merchant success team and causing delays for merchants. PMs seeking insights into specific orders face similar challenges, making swift problem resolution difficult.
Ideation
To address inefficiencies in Root Cause Analysis (RCA) and extend the tool’s usability, we considered several approaches:
Unified Dashboard:
A dashboard consolidating logs, API payloads, and other relevant data. While helpful, it required manual effort from the merchant success team to interpret the data and pinpoint issues.
AI-Powered RCA Tool:
Leveraging a Large Language Model (LLM) to automate RCA processes by aggregating and analyzing scattered data while enabling intuitive, question-driven insights.
Final Implementation
The Chosen Approach
We chose the AI-powered solution for its potential to automate and extend RCA workflows, while also offering flexibility for other diagnostic use cases.
Key Features and Implementation
Data Aggregation:
Users enter an order ID and a merchant ID to retrieve the audit trail for the order.
The system displays API success/failure details, fetches relevant logs, and provides a consolidated view of order processing.
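The aggregation step can be illustrated with a minimal Python sketch. It is not the tool's actual implementation: the ApiCall and OrderView types, the build_order_view function, and the rule that a status code of 400 or above counts as a failure are all assumptions made for illustration, and the API calls and log lines are assumed to have been fetched already from their source systems.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ApiCall:
    """One entry in an order's audit trail (hypothetical shape)."""
    name: str         # hypothetical API name, e.g. "create_order"
    status_code: int  # HTTP-style status; >= 400 is treated as a failure here
    timestamp: str    # ISO-8601 string in one timezone, so string order is time order


@dataclass
class OrderView:
    """Consolidated view of how a single order was processed."""
    order_id: str
    merchant_id: str
    audit_trail: List[ApiCall] = field(default_factory=list)
    logs: List[str] = field(default_factory=list)

    @property
    def failed_calls(self) -> List[ApiCall]:
        # Assumption: any status code of 400 or above marks a failed call.
        return [call for call in self.audit_trail if call.status_code >= 400]


def build_order_view(order_id: str, merchant_id: str,
                     api_calls: List[ApiCall],
                     log_lines: List[str]) -> OrderView:
    """Merge already-fetched API calls and log lines into one view."""
    # Order the audit trail chronologically.
    trail = sorted(api_calls, key=lambda call: call.timestamp)
    # Naive substring match; a real system would query logs by an indexed field.
    logs = [line for line in log_lines if order_id in line]
    return OrderView(order_id, merchant_id, trail, logs)
```

Keeping the success/failure classification in plain code rather than in the model means the consolidated view stays the same on every run for the same inputs.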
AI Analysis with Determinism:
The LLM analyzes logs and the audit trail to identify potential causes.
Deterministic layers reduce hallucination by grounding the LLM in contextual data, including:
Order details from internal systems.
Recent system changes or code deployments.
Error trends and historical data.
Users can cross-reference AI responses with past tickets and email threads for enhanced decision-making.
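One way to picture the deterministic layer is as a function that assembles the model's context from retrieved data before any question is asked. The sketch below is only an illustration under stated assumptions: build_rca_prompt, its parameter names, the section labels, and the instruction wording are hypothetical, and the call to the LLM itself is deliberately left out.

```python
import json


def build_rca_prompt(order_details, audit_trail, deployments,
                     error_trends, question):
    """Assemble a grounded prompt from deterministic context (hypothetical layout)."""
    # Each context source is fetched by ordinary code, not generated by the model.
    context = {
        "order_details": order_details,
        "audit_trail": audit_trail,
        "recent_deployments": deployments,
        "error_trends": error_trends,
    }
    # Serialize every source the same way so the prompt is reproducible.
    sections = [
        f"## {name}\n{json.dumps(value, indent=2, sort_keys=True)}"
        for name, value in context.items()
    ]
    # Assumed instruction text; it asks the model to stay within the context.
    instructions = (
        "Answer using only the context below. "
        "If the context is insufficient, say so instead of guessing."
    )
    return "\n\n".join([instructions, *sections, f"## question\n{question}"])
```

Because the same inputs always produce the same prompt text, a surprising answer can be traced back to the exact context the model was given.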
Preset and Prompt Library:
To expand functionality beyond RCA, we introduced a library of presets, with “RCA from Order ID” as one of the predefined options.
Other presets include use cases like calculating the impact of a specific error across the system.
Users can also create custom presets tailored to their unique diagnostic needs, making the tool adaptable and scalable for a variety of operational challenges.
This extended feature set ensured the tool wasn’t just limited to RCA but became a versatile assistant for multiple diagnostic and investigative workflows.
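A preset library of this kind can be sketched as a small registry of named prompt templates. The following is a hedged illustration, not the tool's real design: the PresetLibrary class, its methods, and the template wording are assumptions, and only the preset name "RCA from Order ID" comes from the tool itself.

```python
from string import Template


class PresetLibrary:
    """Registry of named prompt templates (hypothetical design)."""

    def __init__(self):
        self._presets = {}

    def register(self, name, template):
        # Templates use $placeholders, filled in when the preset is run.
        self._presets[name] = Template(template)

    def render(self, name, **params):
        # substitute() raises KeyError if a placeholder has no value,
        # so an incomplete preset fails loudly instead of sending a broken prompt.
        return self._presets[name].substitute(**params)

    def names(self):
        return sorted(self._presets)


library = PresetLibrary()
library.register(
    "RCA from Order ID",
    "Find the root cause of the issue with order $order_id "
    "for merchant $merchant_id.",
)
# A custom preset a user might add (wording is illustrative only).
library.register(
    "Error impact",
    "Estimate how many orders were affected by error $error_code.",
)
```

Calling library.render("RCA from Order ID", order_id="ord_123", merchant_id="merchant_a") fills in both placeholders and returns the finished prompt text.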





