So you want to build an Agent?
I, like many others lately, have found myself pulled into the Generative AI vortex. I kept some distance from this space for a while, only trying out the big chatbots as they came out, but a few months ago I was pulled into a project involving LLMs and needed to learn many things very quickly.
Fast forward, and I found myself getting frustrated at the esoteric nature of Generative AI app development, and to a lesser extent, the Python-centric tooling ecosystem. Python does make sense for many things as there’s a very robust data science and ML ecosystem, but I do not call myself a Pythonista. I’ve been almost exclusively a Go programmer for the better part of a decade, so I wanted to try my hand at building some Generative AI tooling in my lingua franca.
I’ve spent a few days exploring the patterns of “Agentic RAG”, which is the latest term for “LLMs using data and tools you provide to perform tasks”. I spent a free weekend building a tool to create an Agent and/or RAG workflow without requiring any code. I think any developer will agree that after a few years of supporting code you wrote yourself more than a few months ago, writing no code is generally preferable 🙂
So with the goal of making experimentation (and iteration) with Agents and RAG as quick as possible, I wrote a tool called Ragoo.
With Ragoo, you can write a YAML file which defines the various parts of an Agent or RAG workflow, compose those parts together, and run the workflow without touching a programming language.
Pulling data into a Vector DB
importers:
  - name: k8s-files
    type: file
    config:
      directory: /Users/cohix-lab/workspaces/cohix/kubernetes-the-hard-way/docs/
    steps:
      - type: embedder
        ref: ollama/arctic
        action: generate
        params:
          input: $_chunk
        var: embedding
      - type: storage
        ref: duckdb/main
        action: insert.embedding
        params:
          embedding: $embedding
          ref: $_ref
          batch: $_batch
          collection: k8s
This describes an Importer, which pulls documents from a source, chunks them, and then runs a workflow on each chunk. This example runs an Embedder (which generates embeddings, or the vector representation of the text), and then inserts the result into a database. I’m currently loving DuckDB, and it has built-in vector abilities, so let’s use it.
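To make the "chunks them" step a little more concrete, here is a minimal Go sketch of fixed-size chunking with overlap. This is an illustration only: the chunkText function and its size and overlap parameters are hypothetical, and Ragoo's actual chunking strategy may be quite different.

```go
package main

import "fmt"

// chunkText splits text into chunks of at most size runes, where each chunk
// shares overlap runes with the previous one. The function name and both
// parameters are assumptions made for this sketch, not Ragoo's API.
func chunkText(text string, size, overlap int) []string {
	if size <= 0 || overlap < 0 || overlap >= size {
		return nil
	}
	runes := []rune(text) // work on runes so multi-byte characters stay intact
	step := size - overlap
	var chunks []string
	for start := 0; start < len(runes); start += step {
		end := start + size
		if end > len(runes) {
			end = len(runes)
		}
		chunks = append(chunks, string(runes[start:end]))
		if end == len(runes) {
			break // the final chunk reached the end of the text
		}
	}
	return chunks
}

func main() {
	for i, c := range chunkText("Kubernetes the hard way", 10, 2) {
		fmt.Printf("%d: %q\n", i, c)
	}
}
```

Each chunk would then be handed to the workflow steps as $_chunk. Overlap is a common trick to avoid cutting a sentence's context in half at a chunk boundary.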
Next up is a RAG workflow:
- name: k8s-docs
  stages:
    - name: k8s-docs-rag
      steps:
        - type: embedder
          ref: ollama/arctic
          action: generate
          params:
            input: $_input
          var: embedding
        - type: storage
          ref: duckdb/main
          action: lookup.cosine
          params:
            embedding: $embedding
            collection: k8s
            threshold: 0.65
            limit: 2
          var: refs
        - type: importer
          ref: k8s-files
          action: resolve.refs
          params:
            refs: $refs
            seperator: \n
          var: context
        - type: service
          ref: ollama/llama
          action: completion
          params:
            prompt: |
              $context
              ----
              Using the information above, answer the question below in 100 words or less.
              If the answer is not contained entirely within the information provided, reply 'I do not know' without any additional text.
              Only provide an answer to the question, do not summarize all of the information.
              ----
              Question: $_input
          var: _response
This workflow runs the Embedder again (this time on a prompt), then uses that vector to look up Document References from the DuckDB database based on cosine similarity. The Importer is used to resolve the Document References into the files themselves, and then a template is used to send an “augmented prompt” to an LLM (Ollama running Llama3, in this case).
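For anyone new to the lookup step, here is a rough in-memory Go sketch of what a cosine similarity search with a threshold and a limit does. In Ragoo the search is delegated to DuckDB, so none of this is the real implementation: the doc and match types, the lookup function, and the file names are all hypothetical, and treating threshold as a minimum similarity score is an assumption.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// doc pairs a document reference with its embedding (hypothetical type).
type doc struct {
	Ref       string
	Embedding []float64
}

// match is a document reference and its similarity to the query.
type match struct {
	Ref   string
	Score float64
}

// cosine returns the cosine similarity of a and b, or 0 when the vectors
// have different lengths, are empty, or have zero magnitude.
func cosine(a, b []float64) float64 {
	if len(a) != len(b) || len(a) == 0 {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// lookup keeps the docs whose similarity to query is at least threshold,
// sorts them best-first, and returns at most limit of them.
func lookup(query []float64, docs []doc, threshold float64, limit int) []match {
	var out []match
	for _, d := range docs {
		if s := cosine(query, d.Embedding); s >= threshold {
			out = append(out, match{Ref: d.Ref, Score: s})
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Score > out[j].Score })
	if limit >= 0 && len(out) > limit {
		out = out[:limit]
	}
	return out
}

func main() {
	// Toy two-dimensional embeddings; real ones have hundreds of dimensions.
	docs := []doc{
		{Ref: "a.md", Embedding: []float64{1, 0}},
		{Ref: "b.md", Embedding: []float64{0.8, 0.6}},
		{Ref: "c.md", Embedding: []float64{0, 1}},
	}
	for _, m := range lookup([]float64{1, 0}, docs, 0.65, 2) {
		fmt.Printf("%s %.2f\n", m.Ref, m.Score)
	}
}
```

With the values from the YAML above (threshold 0.65, limit 2), a.md and b.md are returned and c.md is filtered out. Raising the threshold trades recall for precision, which is exactly the kind of knob that is nice to tweak from a config file.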
The full example (find it here) imports the documents from Kubernetes the Hard Way and lets you ask questions about them. It works pretty well! One interesting thing I noticed is how much better Llama3 is at answering questions compared to Phi3.
I think this is useful as it allows you to easily run experiments with all the different parameters (like the cosine similarity threshold, various prompt styles, etc.) without needing to invest a ton of time or effort into learning every nuance of Agentic workflows and the myriad of libraries therein.
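Since prompt styles are one of those knobs, it may help to see how variables like $context and $_input could be substituted into a prompt template. The sketch below uses os.Expand from the Go standard library, which replaces $name and ${name} references via a lookup function. The render helper and the sample values are hypothetical, and this is not necessarily how Ragoo resolves its variables.

```go
package main

import (
	"fmt"
	"os"
)

// render replaces $name references in tmpl with values from vars.
// Unknown names expand to an empty string. The helper is an assumption
// made for illustration only.
func render(tmpl string, vars map[string]string) string {
	return os.Expand(tmpl, func(name string) string {
		return vars[name]
	})
}

func main() {
	tmpl := "$context\n----\nQuestion: $_input"
	prompt := render(tmpl, map[string]string{
		"context": "etcd stores cluster state.",              // made-up retrieved text
		"_input":  "Where is cluster state stored?", // made-up user question
	})
	fmt.Println(prompt)
}
```

Swapping in a different tmpl string is all it takes to try another prompt style against the same retrieved context.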
You can then run the whole thing using the ragoo binary:
go install ./cmd/ragoo/
ragoo ./ragoo.yaml
The application starts up and provides an API endpoint which will run the workflow.
I think Agents are a cool idea, somewhat like a large state machine with super indeterminate state. It’s a fun problem, and can result in some really cool capabilities.
Let me know what you think (connor [at] cohix.network or LinkedIn). I’m going to continue weekend hacking on Ragoo, specifically adding more plugins and looking at adding Tools (and experimenting with tool_use). I also want Ragoo to be useful as a Go package, which would allow you to experiment with an idea using YAML, and then “drop down” into code if you want to take the idea further.