Imagine a claims application at an insurance company. How awesome would it be if you could upload relevant documents, like an insurance claim and client details, and the application would autonomously, yet securely and with proper safeguards, perform all the necessary checks? It could analyze damage photos, determine which insurance type applies, review historical data, look up the drive type of the registered vehicle, and more. Let’s explore how AI agents can help in such scenarios.
What are AI agents?
Although the word “agents” can currently be found in many products and announcements, AI agents are more of a concept than a product. We are now at the point where we want more than simple prompt-in-prompt-out interactions with LLMs and applications. We want our AI application to carry out tasks and actually do something for us. This is when the concept of agents comes into play.
In essence, AI agents are software components that perform tasks autonomously or semi-autonomously, either on behalf of users or as part of a broader process. An agent can be as simple as a small script or as sophisticated as logic that spans several APIs and other systems. An AI application should be capable of determining the appropriate actions and triggering the relevant agent based on input prompts, often utilizing a large language model (LLM) to do so.
Examples of agent tasks:
- Interacting with APIs and databases, e.g. retrieving stock information from an inventory database
- Searching through a company’s knowledge base
- Drafting text with specific instructions
- Grounding answers using a web search API
- Retrieving current weather or stock information
- Extracting information from an image or document uploaded by an end user
- Parsing information and constructing a specific schema or format
- And much more
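To make the idea of an agent as "a small script" more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `Agent` class, the agent names, and the in-memory inventory are hypothetical and not taken from any particular framework. Each agent is just a function plus a description that an LLM could read to decide when to use it.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Agent:
    """A named capability the application can trigger (hypothetical structure)."""
    name: str
    description: str  # text an LLM could read to decide when this agent fits
    run: Callable[[str], str]


def lookup_stock(item: str) -> str:
    # Hypothetical stand-in for a query against an inventory database.
    inventory = {"laptop": 12, "monitor": 0}
    count = inventory.get(item.lower())
    if count is None:
        return f"Unknown item: {item}"
    return f"{item}: {count} in stock"


def draft_text(instructions: str) -> str:
    # Hypothetical stand-in for an LLM call that drafts text.
    return f"[draft written according to: {instructions}]"


# A simple registry the application can use to find an agent by name.
AGENTS = {
    agent.name: agent
    for agent in [
        Agent("stock_lookup", "Retrieve stock levels for an inventory item", lookup_stock),
        Agent("text_drafter", "Draft text following specific instructions", draft_text),
    ]
}

print(AGENTS["stock_lookup"].run("Laptop"))
```

Keeping the description next to the function is one possible design choice: it lets the application hand the list of descriptions to an LLM and ask which agent fits the current request.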
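As previously noted, we can utilize an LLM to identify the user’s intent or other relevant inputs and then trigger an appropriate agent. Let’s assume we want to build a simple bot to interact with an HR application. The bot should be able to retrieve the number of overtime hours from an API. A simple workflow for such an application could look as follows: the user asks a question, the LLM identifies the intent, the application triggers the matching agent, the agent calls the HR API, and the result is returned to the user as an answer.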
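The HR bot workflow can be sketched in a few lines of Python. This is only a sketch under stated assumptions: `detect_intent` uses a keyword check as a stand-in for a real LLM call, and `get_overtime_hours` reads from a hard-coded dictionary instead of a real HR API. The function names and employee IDs are made up for the example.

```python
def detect_intent(prompt: str) -> str:
    # Stand-in for an LLM call that classifies the user's intent.
    # A keyword check keeps the sketch runnable without a model.
    if "overtime" in prompt.lower():
        return "get_overtime"
    return "unknown"


def get_overtime_hours(employee_id: str) -> float:
    # Stand-in for a request to a hypothetical HR API.
    fake_hr_api = {"e-1001": 12.5, "e-1002": 0.0}
    return fake_hr_api[employee_id]


def handle(prompt: str, employee_id: str) -> str:
    # 1. Identify the intent, 2. trigger the matching agent, 3. build the answer.
    intent = detect_intent(prompt)
    if intent == "get_overtime":
        hours = get_overtime_hours(employee_id)
        return f"You have {hours} overtime hours."
    return "Sorry, I can't help with that yet."


print(handle("How many overtime hours do I have?", "e-1001"))
```

In a real application, the intent detection and the final answer would typically both come from the LLM, and the API call would go over the network with proper authentication.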
Multi-agent orchestration
We could extend the above example with additional capabilities such as requesting vacation days, asking questions about HR policies, generating a summary of absences, and more. We would then have created a first bot that also does something for us. Now, let’s take it a step further and let our agents talk to each other and figure out how to solve a particular challenge themselves.
Take the example of an insurance claim application from the beginning of the post. We want to task our agent with analyzing insurance claims and creating a report based on the client’s policies and the other information and calculations you typically need to work on such cases. Here, each claim has different characteristics and probably different procedures. Maybe we first need to find out what type of case it is: a car accident, property damage, illness, and so on. Some cases might require analysis of damage photos, while others might require data-driven probabilities or checks of the client’s policies against a database. Our goal is to feed information about the case into our AI application and let the application and its agents figure out the following steps:
- What is the goal of the request?
- What information do I need to get to the goal?
- Plan: Which agents need to be chained together to get the information
- Run the plan
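The four steps above can be sketched as a tiny planner. Note the assumptions: in a real system an LLM would usually produce the plan, while this sketch swaps in plain dependency resolution so it runs without a model. The agents, the `PROVIDES` and `REQUIRES` tables, and all field names are hypothetical.

```python
from typing import Callable, Dict, List


# Hypothetical agents: each reads the shared context and returns new facts.
def classify_case(ctx: dict) -> dict:
    # Stand-in for an LLM classifying the claim; a keyword check keeps it runnable.
    text = ctx["claim_text"].lower()
    return {"case_type": "car_accident" if "car" in text else "other"}


def analyze_photos(ctx: dict) -> dict:
    # Stand-in for an image model; it only reports how many photos it received.
    return {"damage_estimate": f"{len(ctx['photos'])} photo(s) reviewed"}


def check_policy(ctx: dict) -> dict:
    return {"covered": ctx["case_type"] in ctx["client_policies"]}


AGENTS: Dict[str, Callable[[dict], dict]] = {
    "classify_case": classify_case,
    "analyze_photos": analyze_photos,
    "check_policy": check_policy,
}
# Which agent provides which fact, and which facts each agent needs first.
PROVIDES = {"case_type": "classify_case", "damage_estimate": "analyze_photos",
            "covered": "check_policy"}
REQUIRES = {"classify_case": ["claim_text"], "analyze_photos": ["photos"],
            "check_policy": ["case_type", "client_policies"]}


def make_plan(needed: List[str], ctx: dict) -> List[str]:
    """Chain agents so that every needed fact is produced in a valid order."""
    plan: List[str] = []

    def resolve(fact: str) -> None:
        if fact in ctx:
            return  # already known, no agent required
        if fact not in PROVIDES:
            raise ValueError(f"No agent can provide {fact!r}")
        agent = PROVIDES[fact]
        if agent in plan:
            return  # already scheduled
        for dependency in REQUIRES[agent]:
            resolve(dependency)
        plan.append(agent)

    for fact in needed:
        resolve(fact)
    return plan


def run_plan(plan: List[str], ctx: dict) -> dict:
    result = dict(ctx)  # leave the caller's context untouched
    for name in plan:
        result.update(AGENTS[name](result))
    return result


case = {"claim_text": "Client hit the garage wall with his car",
        "photos": ["wall.jpg"], "client_policies": ["car_accident"]}
plan = make_plan(["covered", "damage_estimate"], case)  # the goal, as needed facts
print(plan)
print(run_plan(plan, case))
```

The goal is expressed as the list of facts we want at the end, and the planner works backwards from there to decide which agents have to run and in which order.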
Example
Let’s assume our client drove up the garage driveway a little too quickly and damaged his own garage wall with his car. We first have to check which insurance policies cover which part of the damage (car and property) and whether the client is even insured accordingly. Other factors also play a role, for example the vehicle model and the time of year (weather conditions) when the damage happened. Various things have to be checked and compared with each other. This is where an agent orchestration layer can automate these checks efficiently.
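As a rough illustration of the first check in this example, the snippet below matches each damaged item against the client's policies. The data model is purely hypothetical: the `Damage` and `Policy` classes, the category names, and the sample policy are invented for the sketch and say nothing about how real insurance products define coverage.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Damage:
    item: str      # what was damaged
    category: str  # hypothetical categories: "vehicle" or "property"


@dataclass
class Policy:
    name: str
    covers: Set[str]  # damage categories this policy covers


def match_coverage(damages: List[Damage], policies: List[Policy]) -> Dict[str, List[str]]:
    """For each damaged item, list the names of the policies that cover it."""
    return {
        damage.item: [p.name for p in policies if damage.category in p.covers]
        for damage in damages
    }


# Sample data for the garage scenario (all values are made up).
damages = [Damage("car bumper", "vehicle"), Damage("garage wall", "property")]
policies = [Policy("car policy", {"vehicle"})]

for item, names in match_coverage(damages, policies).items():
    status = ", ".join(names) if names else "not covered"
    print(f"{item}: {status}")
```

In an orchestrated setup, a check like this would be just one agent among several, next to agents that look up the vehicle model or the weather conditions on the day of the incident.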
Conclusion
AI agents are a concept that enables a deeper integration of LLMs into business processes and automation. By giving agents capabilities and making them aware of those capabilities, a multi-agent architecture brings another piece to the puzzle: autonomy. See it like a small office where each employee has their own capabilities and strengths, and they have to work together efficiently to reach the goal. The same principle applies to multi-agent architectures.
I plan to write more about how such orchestration works using frameworks like Semantic Kernel, and how it works behind the scenes. Until then, cheers!