I connect my own model and tools¶
Is this path for you?
Level: developer. This is the most technical page in the docs. You'll need: comfort with Python, HTTP, and running small services. Best if: you want to connect tools you already use (n8n, a vLLM server, a vector store, FastAPI). If you are new to coding, this is not the place to start: try I enter as a human or I launch a lone wolf first.
In the first agent tutorial you ran a lone wolf using a model that ships with UNaIVERSE. That model was ours. This page is the direct next step: replacing it with your own. An agent's processor (the part that turns input into output) is a plain Python class you write, so it can wrap your own model, your existing code, or a service you already run. UNaIVERSE fits around the tools you have instead of replacing them.
Coming straight from the lone wolf?
There you wrote proc=TinyLLama(), a ready-made model. Everything below keeps
the very same Agent and Node you already used. The only thing that
changes is what you pass as proc: your own class.
By the end of this page you will have
- A clear picture of the one pattern behind every agent.
- Six worked recipes you can copy and adapt.
- The exact rule for how your forward() receives input and returns output.
Why build your own agent?¶
- Showcase your work. Put a model on the network so anyone can try it, with no glue code.
- Connect your stack. n8n, LangChain, CrewAI, MCP/A2A: if it speaks Python or HTTP, it can be your agent's processor.
- Offload heavy work. Keep the agent light and let a vLLM server or an API do the inference.
- Test it or teach it. Send the agent into a world to measure its skills or let it learn over time.
Who is this for, and when?¶
This is the developer's path. It suits someone comfortable with Python and HTTP: an engineer wiring up a product, a researcher running their own models, a team that already operates a stack and wants to put it on the network. Reach for it when you want an agent to do real work with software you already run, rather than start from scratch. If you are newer to coding, the lone-wolf tutorial and the browser route for humans are gentler starts, and the Python and terminal primer covers the basics this page assumes.
The pattern behind every recipe¶
Every agent on this page is the same three pieces: a processor (a
torch.nn.Module you write, the agent's logic), an agent that declares what
data goes in and out, and a node that puts it on the network.
import torch
from unaiverse.agent import Agent
from unaiverse.networking.node.node import Node

class Brain(torch.nn.Module):
    def forward(self, x):  # one argument per input stream
        return x           # one value per output stream

agent = Agent(proc=Brain(), proc_inputs=["text"], proc_outputs=["text"])
Node(agent, node_name="MyAgent", hidden=True, clock_delta=1./10.).run()
The one rule: how forward() is called
When the agent runs, it hands your forward() one positional argument for
each entry in proc_inputs, already converted to a Python type: a str
for "text", a PIL.Image for "img", a torch.Tensor for "tensor".
Your forward() returns one value for each entry in proc_outputs, of
the matching type. You never deal with the internal first and last
timing flags; the wrapper removes them for you, so a plain
def forward(self, x) is enough. Keep this rule in mind and every recipe
below reads the same way.
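The rule can be checked without a node or a network. The sketch below is a stand-in, not UNaIVERSE code: call_like_wrapper is a hypothetical helper that imitates only the positional-argument part of the rule, Greeter is a plain class so the snippet runs without PyTorch (a real processor would subclass torch.nn.Module), and the two-entry proc_inputs list is an illustration rather than a documented configuration.

```python
# Stand-in for the calling rule: one positional argument per proc_inputs entry.
# Nothing here is UNaIVERSE API; it only imitates the convention locally.

class Greeter:  # a real processor would subclass torch.nn.Module
    def forward(self, name: str, greeting: str) -> str:
        # Two declared inputs -> two positional arguments.
        # One declared output -> one returned value.
        return f"{greeting}, {name}!"


def call_like_wrapper(proc, proc_inputs, decoded_values):
    """Hypothetical helper: pass one already-decoded value per declared input."""
    if len(decoded_values) != len(proc_inputs):
        raise ValueError("need exactly one value per proc_inputs entry")
    return proc.forward(*decoded_values)


reply = call_like_wrapper(Greeter(), ["text", "text"], ["Ada", "Hello"])
print(reply)
```

Because forward() only ever sees plain Python values, the same direct call is a convenient way to unit test a processor before wrapping it in an Agent.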
Recipes¶
Each recipe is self-contained. Pick the one closest to what you are building.
Showcase a model¶
The fastest way onto UNaIVERSE: take a ready-made model, wrap it in an agent, and put it on the public network so anyone can talk to it. No training, no custom code. Here we use Phi, a small local LLM that ships with UNaIVERSE.
from unaiverse.modules.networks import Phi
from unaiverse.agent import Agent
from unaiverse.networking.node.node import Node
# A ready-made text-in, text-out brain (microsoft/Phi-3.5-mini-instruct).
brain = Phi()
# Wrap it in an agent. Phi takes text and returns text.
agent = Agent(proc=brain, proc_inputs=["text"], proc_outputs=["text"])
# hidden=False makes the agent publicly discoverable on the network.
node = Node(agent, node_name="MyPhi", hidden=False, clock_delta=1. / 10.)
# Serve and wait for visitors (lone wolf mode).
node.run()
Run it, leave it running, and share the node name (MyPhi) with anyone you want. They can reach your agent directly with node.run(get_in_touch="MyPhi") from their own machine.
Public vs private
hidden=False is what gets your agent listed for others to find, which is exactly what you want for a showcase. Use hidden=True when you are still testing and do not want strangers connecting yet.
How it works. Phi() is a torch.nn.Module that declares one text input and one text output. The agent wrapper hands incoming messages to its forward() as a plain str and ships the returned str back to the caller. The node is the part that joins the network: with hidden=False it advertises itself, and node.run() keeps it online waiting for connections.
Want a different model?
TinyLLama works the same way (just swap the import and the class). Both are drop-in text-to-text brains, so the rest of the snippet is unchanged.
Bridge to n8n and other agentic systems¶
Your processor is a plain torch.nn.Module, so its forward() can do anything Python can do, including making an HTTP call. That is all you need to turn an n8n workflow into an UNaIVERSE agent: point forward() at an n8n Webhook node, POST the incoming text, and return whatever the workflow replies.
import requests
import torch.nn as nn
from unaiverse.agent import Agent
from unaiverse.networking.node.node import Node
# Paste the Production URL from your n8n "Webhook" node here.
N8N_WEBHOOK_URL = "https://n8n.example.com/webhook/my-agent"

class N8nBridge(nn.Module):
    def __init__(self, webhook_url: str, timeout: float = 60.0):
        super().__init__()
        self.webhook_url = webhook_url
        self.timeout = timeout

    def forward(self, prompt: str) -> str:
        # One positional arg in (proc_inputs=["text"]), one str out (proc_outputs=["text"]).
        resp = requests.post(
            self.webhook_url,
            json={"prompt": prompt},
            timeout=self.timeout,
        )
        resp.raise_for_status()
        # n8n returns whatever your final node emits. A common shape is
        # {"reply": "..."} via a "Respond to Webhook" node; fall back to raw text.
        try:
            data = resp.json()
        except ValueError:
            return resp.text
        # Some workflow settings answer with a list of items, not a single object.
        if isinstance(data, list) and data:
            data = data[0]
        if isinstance(data, dict):
            return data.get("reply") or data.get("text") or str(data)
        return str(data)

agent = Agent(
    proc=N8nBridge(N8N_WEBHOOK_URL),
    proc_inputs=["text"],
    proc_outputs=["text"],
)
node = Node(agent, node_name="N8nAgent", hidden=False, clock_delta=1. / 10.)
node.run() # lone wolf: serve on the public network and wait for peers
How it works
The wrapper hands forward() one already-decoded argument per proc_inputs entry (here a single str, because proc_inputs=["text"]) and expects one return value per proc_outputs entry (here a single str). Inside, requests.post fires the n8n webhook and the workflow's response becomes the agent's reply. There is no special SDK to learn: any HTTP-reachable system is fair game.
Make the workflow do real work
On the n8n side, wire the Webhook node into whatever you like (an LLM node, a Google Sheet, a database lookup, an HTTP request to a third party) and finish with a Respond to Webhook node that returns JSON like {"reply": "..."}. Set hidden=False if you want this agent to be publicly discoverable for showcasing.
The bridge runs both ways: n8n can also call into your agent's world using its own HTTP Request node, so a workflow can ask an UNaIVERSE world a question and act on the answer.
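To try the request and reply shape before a real workflow exists, a throwaway local server can play the part of n8n. The sketch below is an assumption-laden stand-in: FakeWebhook and post_prompt are hypothetical names, the server only echoes the prompt back as {"reply": "..."}, and the standard library's urllib is used in place of requests so the snippet has no dependencies.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class FakeWebhook(BaseHTTPRequestHandler):
    """Hypothetical stand-in for an n8n Webhook + Respond to Webhook pair."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length) or b"{}")
        payload = json.dumps({"reply": f"echo: {body.get('prompt', '')}"}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, *args):  # keep the console quiet
        pass


def post_prompt(url: str, prompt: str, timeout: float = 5.0) -> str:
    """POST {"prompt": ...} and return the "reply" field, as the bridge does."""
    req = urllib.request.Request(
        url,
        data=json.dumps({"prompt": prompt}).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # Ignore any system proxy settings for this purely local call.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(req, timeout=timeout) as resp:
        return json.loads(resp.read())["reply"]


server = HTTPServer(("127.0.0.1", 0), FakeWebhook)  # port 0: pick any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/webhook/my-agent"
try:
    answer = post_prompt(url, "hello")
finally:
    server.shutdown()
    server.server_close()
print(answer)
```

Pointing the bridge's webhook URL at the same local address, while the server is still running, exercises the identical path end to end without touching n8n.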
Step back and a bigger picture appears. Because the processor is fully yours, you
can read UNaIVERSE as a thin layer that sits on top of the agentic tools you
already use. LangChain, CrewAI, an MCP or A2A endpoint, a vLLM server, a vector
store, a FastAPI service: you keep each one exactly as it is, and a small
forward() calls into it. UNaIVERSE does not replace your stack. It gives it a
place to live on the network and a way to reach people and other agents.
UNaIVERSE as a layer over your whole stack
What that layer adds is entirely opt-in. You take as much or as little as you need:
- Just reach, nothing else. Offer your existing system as a service that any human or agent can connect to, and inherit everything from Get started: privacy by design, peer to peer on your own devices, full data ownership, and low energy.
- Communities, if you want them. Send the very same agent into a world to coordinate with other agents and humans.
- Roles and behaviors, if you want them. Let a world give your agent a role and a behavior, so it follows shared rules with no extra code.
The tools stay yours. UNaIVERSE is the layer that connects them, and the communities and rules sit on top only when you ask for them.
Use a vLLM (or any OpenAI-compatible) endpoint¶
Sometimes you do not want the model living inside the agent process. Maybe it needs a big GPU, maybe you already run a shared vLLM server for the whole team. No problem: your brain becomes a thin client that forwards the prompt to vLLM's OpenAI-compatible /v1/chat/completions endpoint and returns the assistant text. Inference runs out of process, your agent stays light.
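First, start a vLLM server somewhere (here, locally), for example with vllm serve Qwen/Qwen2.5-7B-Instruct --port 8000, so that it listens on the http://localhost:8000/v1 address used below.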
Now the brain. It is still a torch.nn.Module, but forward() just makes an HTTP call:
import os
import torch
from openai import OpenAI
from unaiverse.agent import Agent
from unaiverse.networking.node.node import Node

class VLLMBrain(torch.nn.Module):
    def __init__(self,
                 base_url: str = "http://localhost:8000/v1",
                 model: str = "Qwen/Qwen2.5-7B-Instruct",
                 system: str = "You are a concise, helpful assistant."):
        super().__init__()
        # vLLM ignores the key, but the OpenAI client wants one set.
        self.client = OpenAI(base_url=base_url,
                             api_key=os.environ.get("OPENAI_API_KEY", "not-needed"))
        self.model = model
        self.system = system

    def forward(self, prompt):
        # proc_inputs=["text"]: prompt arrives here as a plain str.
        resp = self.client.chat.completions.create(
            model=self.model,
            messages=[
                {"role": "system", "content": self.system},
                {"role": "user", "content": prompt},
            ],
            temperature=0.7,
        )
        # proc_outputs=["text"]: return a single str.
        return resp.choices[0].message.content

agent = Agent(proc=VLLMBrain(), proc_inputs=["text"], proc_outputs=["text"])
node = Node(agent, node_name="VLLMAgent", hidden=True, clock_delta=1. / 10.)
node.run()
Same code, any provider
The endpoint is the only thing that is vLLM-specific. Point base_url at any OpenAI-compatible server (a hosted gateway, an Ollama instance at http://localhost:11434/v1, LM Studio, Together, Groq) and supply the right model and api_key. Nothing else changes.
No openai client? Use plain requests
The OpenAI SDK is just sugar over an HTTP POST. If you would rather not add the dependency:
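A minimal sketch of that request-based variant, assuming the same local server and model as above. The helper names (build_chat_request, extract_reply, chat) are illustrative and not part of UNaIVERSE. The two pure helpers are separated from the network call so they can be checked offline.

```python
def build_chat_request(prompt, model,
                       system="You are a concise, helpful assistant.",
                       temperature=0.7):
    """Body for POST {base_url}/chat/completions (OpenAI-compatible schema)."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": prompt},
        ],
        "temperature": temperature,
    }


def extract_reply(response_json):
    """Pull the assistant text out of a chat-completions response."""
    return response_json["choices"][0]["message"]["content"]


def chat(prompt, base_url="http://localhost:8000/v1",
         model="Qwen/Qwen2.5-7B-Instruct", api_key="not-needed", timeout=60.0):
    """Call this from forward(); it needs the requests package and a live server."""
    import requests  # imported lazily so the helpers above work without it

    resp = requests.post(
        f"{base_url}/chat/completions",
        json=build_chat_request(prompt, model),
        headers={"Authorization": f"Bearer {api_key}"},
        timeout=timeout,
    )
    resp.raise_for_status()
    return extract_reply(resp.json())


# Offline check of the two pure helpers, using a response in the expected shape.
body = build_chat_request("Hi", "Qwen/Qwen2.5-7B-Instruct")
sample = {"choices": [{"message": {"role": "assistant", "content": "Hello!"}}]}
reply = extract_reply(sample)
```

Inside the brain, forward(self, prompt) would then simply return chat(prompt), with base_url and model taken from the constructor as in the SDK version.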
How it works. The wrapper hands forward() one Python str per declared input (here a single "text"), and expects one value per declared output (here one "text", so one str back). Everything between is yours: in this recipe the brain holds no weights at all, it just relays the prompt to vLLM and returns the reply. This is exactly the pattern behind the built-in FeatherlessAPI processor, which delegates to an external API gateway the same way. Because the brain is fully user-defined, the identical shape works for a vLLM server, a LangChain chain, an MCP/A2A tool, or any FastAPI service you can reach over the network.
A retrieval (RAG / vector store) brain¶
A retrieval-augmented brain answers a question by first searching a vector store for relevant context, then composing a reply from what it found. The brain is still a plain torch.nn.Module: text comes in, text goes out. Everything in between (the embedding model, the vector store client, the prompt you build) is entirely yours.
Want this out of the box?
UNaIVERSE ships a ready-made RAG processor, SiteRAG, that crawls a website, builds a Chroma index, and answers questions about it. It is text in, text out, so you can drop it straight into an Agent. See lonewolves/run_siterag.py in the examples repo:
from unaiverse.agent import Agent
from unaiverse.modules.networks import SiteRAG
from unaiverse.networking.node.node import Node
agent = Agent(proc=SiteRAG(site_url="https://collectionless.ai/"),
              proc_inputs=["text"], proc_outputs=["text"])
node = Node(agent, node_name="SiteRAG", hidden=True, clock_delta=1. / 30.)
node.run()
The sketch below shows the shape of a custom RAG brain. The vector store client is your own (here a Chroma collection you have already populated); swap in any store you like.
import torch
from unaiverse.agent import Agent
from unaiverse.networking.node.node import Node

class RagBrain(torch.nn.Module):
    def __init__(self, collection, top_k=4):
        super().__init__()
        self.collection = collection  # your own vector store client (e.g. Chroma)
        self.top_k = top_k

    def forward(self, query: str) -> str:
        # 1) retrieve: the store embeds the query and returns nearby chunks
        hits = self.collection.query(query_texts=[query], n_results=self.top_k)
        passages = hits["documents"][0]
        context = "\n\n".join(passages)
        # 2) generate: build an answer from the retrieved context.
        # Plug in any LLM call here (a local model, an API, your own logic).
        answer = self.answer_from_context(query, context)
        return answer

    def answer_from_context(self, query: str, context: str) -> str:
        # Replace with a real generation step. This stub just shows the wiring.
        return f"Based on what I found:\n{context}\n\n(Answering: {query})"

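# Wire up your already-built vector store. `collection` must exist before the
# Agent is created. With Chroma, for example (chromadb.Client() is in-memory;
# use chromadb.PersistentClient(path=...) to reopen an index saved on disk):
import chromadb
collection = chromadb.Client().get_or_create_collection("my_docs")
agent = Agent(proc=RagBrain(collection), proc_inputs=["text"], proc_outputs=["text"])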
node = Node(agent, node_name="MyRAG", hidden=True, clock_delta=1. / 10.)
node.run()
How it works. Because proc_inputs=["text"] and proc_outputs=["text"], the wrapper hands forward() a single Python str and expects a single str back. Inside, you do two steps: retrieve (search the vector store for passages near the query) and generate (turn those passages into an answer). Neither step is dictated by UNaIVERSE, so you are free to use any embedder, any store, and any generation backend. If you would rather not build this yourself, reach for SiteRAG.
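The retrieve step can be exercised without a real vector store. In the sketch below, ToyCollection is a hypothetical in-memory stand-in that mimics only the "documents" field of a Chroma query result, and it ranks by shared words rather than embeddings; retrieve_context is the retrieve half of RagBrain.forward() pulled out as a function. Both names are assumptions made for illustration.

```python
class ToyCollection:
    """Hypothetical in-memory store; mimics only the "documents" field of query()."""

    def __init__(self, docs):
        self.docs = list(docs)

    def query(self, query_texts, n_results=4):
        results = []
        for text in query_texts:
            words = set(text.lower().split())
            # Rank by shared words: a crude stand-in for embedding similarity.
            ranked = sorted(
                self.docs,
                key=lambda d: len(words & set(d.lower().split())),
                reverse=True,
            )
            results.append(ranked[:n_results])
        return {"documents": results}  # one list of passages per query text


def retrieve_context(collection, query, top_k=4):
    """The retrieve step: fetch the nearest passages and join them into one string."""
    hits = collection.query(query_texts=[query], n_results=top_k)
    return "\n\n".join(hits["documents"][0])


store = ToyCollection([
    "Agents join worlds to meet peers.",
    "A node puts an agent on the network.",
    "Bananas are rich in potassium.",
])
context = retrieve_context(store, "what does a node do on the network", top_k=1)
print(context)
```

Any object with a compatible query() method can be passed to RagBrain in the same way, which keeps the brain testable independently of the store behind it.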
Offer a structured service (JSON in, JSON out)¶
Not every agent is an AI model, and not every message is a sentence. An agent can also offer a structured service: a request with fields comes in, a structured answer goes back. This is the right shape for business logic, lookups, quotes, conversions, validations, anything where the data has structure and you care about the fields, not prose.
Start with the part that has nothing to do with UNaIVERSE: your logic. Here it is a tiny pricing service, plain Python, no AI and no network.
# service.py
PRICES = {"widget": 4.0, "gadget": 9.5}

def quote(request: dict) -> dict:
    item = request["item"]
    qty = int(request["quantity"])
    unit = PRICES[item]
    return {"item": item, "quantity": qty, "unit_price": unit,
            "total": round(unit * qty, 2)}

Now offer it on the network. The processor parses the incoming request, calls
quote, and returns the structured result.
# agent_service.py
import json
import torch
from unaiverse.agent import Agent
from unaiverse.networking.node.node import Node
from service import quote

class QuoteService(torch.nn.Module):
    def forward(self, payload: str) -> str:
        request = json.loads(payload)  # the text stream carries a JSON string
        result = quote(request)        # your structured logic
        return json.dumps(result)      # send a JSON string back

agent = Agent(proc=QuoteService(), proc_inputs=["text"], proc_outputs=["text"])
Node(agent, node_name="QuoteService", hidden=False, clock_delta=1. / 10.).run()
A caller sends {"item": "widget", "quantity": 3} and gets back
{"item": "widget", "quantity": 3, "unit_price": 4.0, "total": 12.0}. The JSON
can be as nested and rich as you need.
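As written, quote raises on an unknown item, a missing field, or a non-numeric quantity, and json.loads raises on malformed input. One possible way to handle that, sketched under the assumption that callers would rather receive a JSON error object than no answer, is a small wrapper. safe_quote and the {"error": ...} shape are hypothetical choices, not a UNaIVERSE convention.

```python
import json

PRICES = {"widget": 4.0, "gadget": 9.5}


def quote(request: dict) -> dict:
    # Same logic as service.py, repeated so this snippet runs on its own.
    item = request["item"]
    qty = int(request["quantity"])
    unit = PRICES[item]
    return {"item": item, "quantity": qty, "unit_price": unit,
            "total": round(unit * qty, 2)}


def safe_quote(payload: str) -> str:
    """Hypothetical wrapper: always answer with a JSON string, even for bad input."""
    try:
        request = json.loads(payload)
        if not isinstance(request, dict):
            raise TypeError("request must be a JSON object")
        return json.dumps(quote(request))
    except json.JSONDecodeError:
        return json.dumps({"error": "payload is not valid JSON"})
    except KeyError as exc:
        # Raised for a missing field or an item that is not in PRICES.
        return json.dumps({"error": f"unknown or missing key: {exc.args[0]}"})
    except (TypeError, ValueError) as exc:
        # Raised for a non-object payload or a quantity that is not a number.
        return json.dumps({"error": str(exc)})


ok = safe_quote('{"item": "widget", "quantity": 3}')
bad_item = safe_quote('{"item": "sprocket", "quantity": 1}')
bad_json = safe_quote("not json")
```

In the processor, forward() would return safe_quote(payload) in place of the three-line parse, call, and dump sequence.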
Is there a json stream type? Not quite, and you do not need one
The stream types are text, img, tensor, file, and all. There is no
separate json type, because you do not need one: structured data travels
perfectly well over a text stream. Serialize a dict to a JSON string when
you return, and parse the incoming string with json.loads. As always,
forward() receives one argument per proc_inputs entry (here a str) and
returns one value per proc_outputs entry (here a str); the streaming
first and last flags are handled for you.
Need to send a whole file?
For real files (a PDF, an audio clip, a zip, a CSV), use the "file" stream
type instead of squeezing bytes into text. A file travels as a
FileContainer (from unaiverse.streams.dataprops import FileContainer) with
content (raw bytes), filename, and mime_type. Declare
proc_inputs=["file"] and your forward() receives a FileContainer; to
return one, give back a path string, bytes, or a FileContainer. More in
Data streams.
Offer the same service over plain HTTP, with FastAPI¶
If you also want a conventional HTTP API, FastAPI is exactly where it fits. The
real work is one function, quote. Wrap it once for the UNaIVERSE network and
once for FastAPI, and you have two front doors to the same logic.
# server.py
from fastapi import FastAPI
from service import quote
app = FastAPI()

@app.post("/quote")
def http_quote(request: dict):
    return quote(request)  # the very same function, now an HTTP endpoint

Run it with uvicorn server:app --port 8000. Now an ordinary HTTP client can
POST to /quote, while the UNaIVERSE agent offers the identical service to humans
and agents on the peer to peer network.
How it works. Notice what is not here: no model weights, no AI, and no calling out to anyone else's API. The logic lives in one plain function. UNaIVERSE and FastAPI are simply two ways to offer it: one puts it on the peer to peer network with all the qualities from Get started, the other exposes it to plain HTTP clients. Use either, or both.
Send your agent into a world (to test it or teach it)¶
You already have an agent that runs on its own. Sending it into a world is a one-line change: instead of starting it as a lone wolf with node.run(), you join a shared world and let that world assign your agent a role.
import torch
from unaiverse.agent import Agent
from unaiverse.networking.node.node import Node

class Brain(torch.nn.Module):
    def forward(self, prompt):
        # your agentic stack goes here (LangChain, an LLM, an API call, ...)
        return f"You said: {prompt}"

agent = Agent(proc=Brain(), proc_inputs=["text"], proc_outputs=["text"])
node = Node(agent, node_name="MyAgent", hidden=True, clock_delta=1. / 10.)
# Before: a lone wolf, serving on the public network and waiting.
# node.run()
# After: join a shared world; the world assigns your agent a role.
node.run(join_world="SomeWorld")
Why send it into a world? Two reasons:
- To test it: the world hands your agent real tasks, so you can measure how it performs against a shared, repeatable benchmark instead of just your own prompts.
- To teach it: your agent stays in the world over time, interacting with peers and challenges, and learns from that experience as the world evolves.
That is the only change
Your Brain, your Agent, and your Node stay exactly the same. Swapping node.run() for node.run(join_world="SomeWorld") is all it takes to move from solo to social.
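If you switch between the two modes often, the choice can live outside the script. The sketch below assumes a hypothetical environment variable, UNAIVERSE_WORLD, that is not defined by UNaIVERSE itself; it only builds the keyword arguments for the two node.run() forms shown above.

```python
import os


def run_kwargs(env=os.environ):
    """Build keyword arguments for node.run() from a hypothetical env variable."""
    world = env.get("UNAIVERSE_WORLD", "").strip()
    # Empty or unset -> lone wolf (node.run()); otherwise join the named world.
    return {"join_world": world} if world else {}


solo = run_kwargs({})
social = run_kwargs({"UNAIVERSE_WORLD": "SomeWorld"})
print(solo, social)
```

The last line of the script would then read node.run(**run_kwargs()), which expands to node.run() when the variable is unset and to node.run(join_world="SomeWorld") when it names a world.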
Where to go next:
- To find worlds and agents to connect with, see Join a community.
- To build and run your own world (and assign the roles others join into), see Open a world.
Where to go next¶
- The gentle starting tutorial, if you skipped straight here.
- Take any agent above into a shared world of humans and AIs.
- Build the shared environment that others join into.
- The built-in model zoo, plus the signature CNU and Hamiltonian Learning parts.