Episode 3: Two agents, fully automatic
Quickstart · Episode 3 of 4
Now the real magic: two AI agents that find each other and exchange data with no human involved, and a new kind of brain that works with images instead of text.
The idea
We'll host an image classifier (it looks at a picture and guesses what's in it). Then we'll build a second agent that generates pictures and sends them over for classification, printing the answers. Two programs, talking automatically.
What's a ResNet / image classifier?
ResNet is a famous, ready-made image-recognition model. Show it a photo and it guesses what's in it, choosing among 1,000 everyday categories (Egyptian cat, sports car, banana, …). It's the vision-world equivalent of the language model from Episode 1: same idea, different senses.
What's a tensor, and what do those numbers in the code mean?
A tensor is just a grid of numbers. An image is turned into a tensor so a
model can read it: a batch of images has the shape
(how_many, 3, height, width), where 3 is the number of colour channels
(red, green, blue). When you see (None, 3, None, None), None means "any size
is fine on this dimension". Don't overthink it: it's just a label describing
the data's shape.
Full story in Data streams.
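To make the None wildcard concrete, here is a minimal sketch in plain Python that uses nested lists in place of real tensors. The helpers shape_of and shape_matches are hypothetical names invented for this illustration; they are not part of UNaIVERSE or PyTorch, and the real matching logic may differ.

```python
from typing import Optional, Sequence, Tuple

Shape = Tuple[Optional[int], ...]


def shape_of(nested) -> Tuple[int, ...]:
    """Return the shape of a regular nested list, e.g. [[1, 2], [3, 4]] gives (2, 2)."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0] if nested else None
    return tuple(shape)


def shape_matches(spec: Shape, actual: Sequence[int]) -> bool:
    """True if `actual` fits `spec`; a None entry in `spec` accepts any size."""
    if len(spec) != len(actual):
        return False
    return all(want is None or want == got for want, got in zip(spec, actual))


# A toy batch: 1 image, 3 colour channels, 2 pixels high, 2 pixels wide.
tiny_batch = [[[[0.1, 0.2], [0.3, 0.4]],   # red channel
               [[0.5, 0.6], [0.7, 0.8]],   # green channel
               [[0.9, 1.0], [0.0, 0.5]]]]  # blue channel

image_spec = (None, 3, None, None)  # any batch size, RGB, any height and width

print(shape_of(tiny_batch))
print(shape_matches(image_spec, shape_of(tiny_batch)))  # RGB batch: accepted
print(shape_matches(image_spec, (1, 1, 224, 224)))      # one channel only: rejected
```

Only the second dimension is pinned down, so batches of any size and resolution fit the description as long as they carry three colour channels.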
Step 3a: the classifier
In a terminal, create classifier.py:
import torch
import torchvision
from unaiverse.agent import Agent
from unaiverse.streams.dataprops import StreamType
from unaiverse.networking.node.node import Node
# A ready-made image classifier (downloads pretrained weights the first time).
brain = torchvision.models.resnet50(weights="IMAGENET1K_V1").eval()
# This agent accepts images-as-tensors and returns 1000 scores (one per category).
agent = Agent(
    proc=brain,
    proc_inputs=[StreamType(data_type="tensor",
                            tensor_shape=(None, 3, None, None),  # any batch, RGB, any size
                            tensor_dtype=torch.float32)],
    proc_outputs=[StreamType(data_type="tensor",
                             tensor_shape=(None, 1000),  # 1000 category scores
                             tensor_dtype=torch.float32)],
)
node = Node(agent, node_name="MyClassifier", hidden=True, clock_delta=1./5.)
node.run() # lone wolf: sit and wait for images
Run it with python classifier.py and leave it running.
Why so much more detail than Episode 1?
Text was simple: ["text"] said it all. Images need a precise shape and
number type, so we use the full StreamType descriptor instead of the
shorthand. It's the same idea as ["text"], just spelled out.
Step 3b: the generator
In a second terminal, create generator.py:
import torch
from unaiverse.agent import Agent
from unaiverse.streams.dataprops import StreamType
from unaiverse.networking.node.node import Node
# A tiny "brain" that invents a random picture each time it's asked.
class PictureMaker(torch.nn.Module):
    def forward(self, x=None):
        return torch.rand((1, 3, 224, 224), dtype=torch.float32)  # one random 224×224 image

agent = Agent(
    proc=PictureMaker(),
    proc_inputs=[StreamType(data_type="all")],  # it ignores any input
    proc_outputs=[StreamType(data_type="tensor",
                             tensor_shape=(1, 3, 224, 224),
                             tensor_dtype=torch.float32)],
)
# After each round, peek at what the classifier sent back and print the winner.
def on_cycle(node: Node):
    result = node.agent.get_last_streamed_data("MyClassifier")
    if result and result[0] is not None:
        top_category = int(result[0].argmax(dim=1)[0])
        print(f"The classifier's top guess: category #{top_category}")

node = Node(agent, node_name="Generator", hidden=True,
            clock_delta=1./5., run_hook=on_cycle)
# Connect to the classifier and run for 10 seconds, then stop.
node.run(get_in_touch="MyClassifier", max_time=10.0)
Run it with python generator.py.
For about ten seconds you'll see guesses scroll by. Generator is inventing
images, sending them to MyClassifier, and reading back its answers, entirely
on its own.
Why are the guesses random nonsense?
Because we're sending random pixels, not real photos! The point of this episode is the plumbing (two agents exchanging real data over the network), not the accuracy. Feed it real images and you'd get real predictions.
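The generator prints only the index of the largest of the 1000 scores. A common way to judge how much a guess is worth is to turn the scores into probabilities with softmax. The sketch below does this in plain Python on made-up score lists; they are not real ResNet outputs, and softmax and top_guess are illustrative helpers, not UNaIVERSE functions.

```python
import math
import random


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_guess(scores):
    """Return (index, probability) of the highest-scoring category."""
    probs = softmax(scores)
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, probs[best]


random.seed(0)
# Made-up stand-ins for two 1000-score outputs (assumption: illustrative only).
unsure = [random.gauss(0.0, 1.0) for _ in range(1000)]  # no category stands out
confident = list(unsure)
confident[285] = 20.0                                   # one category clearly ahead

for name, scores in [("unsure", unsure), ("confident", confident)]:
    index, prob = top_guess(scores)
    print(f"{name}: category #{index}, probability {prob:.3f}")
```

An argmax always returns some category, even when no score stands out, which is why looking at the probability alongside the index helps tell a real prediction from noise.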
What just happened
sequenceDiagram
    participant G as Generator
    participant Net as P2P network
    participant C as MyClassifier
    G->>Net: get_in_touch("MyClassifier")
    Net-->>G: connected
    loop every tick, for 10s
        G->>C: an image (tensor 1×3×224×224)
        C->>C: ResNet looks at it
        C-->>G: 1000 scores (tensor 1×1000)
        G->>G: print the top category
    end
- Two independent programs, possibly on two machines, found each other and exchanged data with no human and no central server.
- run_hook=on_cycle ran your function every tick; that's how you add custom automation around an agent.
- get_last_streamed_data("MyClassifier") read what the classifier sent back.
- The stream types matched (MyClassifier wants (…,3,…,…) images and returns (…,1000) scores), so the connection just worked.
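The tick-and-hook pattern summarised above can be imitated in a few lines of plain Python. This is a toy, single-process simulation under stated assumptions: ToyClassifier, ToyGenerator and run are hypothetical stand-ins, simulated time replaces the real clock, and nothing here reflects how UNaIVERSE actually schedules ticks or moves data over the network.

```python
import random
from typing import Callable, List, Optional


class ToyClassifier:
    """Stand-in for MyClassifier: returns one score per category."""

    def __init__(self, categories: int = 5):
        self.categories = categories

    def classify(self, image: List[float]) -> List[float]:
        # Not a real model: each score is just a sum over a slice of the pixels.
        return [sum(image[i::self.categories]) for i in range(self.categories)]


class ToyGenerator:
    """Stand-in for Generator: makes an image and remembers the last reply."""

    def __init__(self, peer: ToyClassifier, rng: random.Random):
        self.peer = peer
        self.rng = rng
        self.last_reply: Optional[List[float]] = None

    def tick(self) -> None:
        image = [self.rng.random() for _ in range(20)]  # 20 random "pixels"
        self.last_reply = self.peer.classify(image)     # send, then store the reply


def run(generator: ToyGenerator, clock_delta: float, max_time: float,
        run_hook: Callable[[ToyGenerator], None]) -> int:
    """Tick until simulated time reaches max_time, calling run_hook after each tick."""
    ticks = 0
    while ticks * clock_delta < max_time:
        generator.tick()
        run_hook(generator)
        ticks += 1
    return ticks


guesses: List[int] = []


def on_cycle(generator: ToyGenerator) -> None:
    """Hook: record the index of the top score in the most recent reply."""
    reply = generator.last_reply
    if reply is not None:
        guesses.append(max(range(len(reply)), key=reply.__getitem__))


total = run(ToyGenerator(ToyClassifier(), random.Random(0)),
            clock_delta=0.5, max_time=2.0, run_hook=on_cycle)
print(f"{total} ticks, top categories: {guesses}")
```

Keeping the hook separate from the tick logic mirrors the episode's design: the agent does its work on each tick, and your own function only observes the latest result.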
Episode 3 recap
You connected two AI agents, exchanged typed data automatically, met a second kind of model (vision), and added your own logic with a run hook. This is the core of UNaIVERSE in miniature.