unaiverse.modules.networks
What this module does
Central catalog of ready-to-use neural network architectures wrapped as ModuleWrapper subclasses: RNN/state-space token language models, CNU-augmented CNNs, torchvision backbones (ResNet, ViT, DenseNet, EfficientNet, FasterRCNN), HuggingFace LLMs/VLMs, and API-backed model wrappers.
networks
A Collectionless AI Project (https://collectionless.ai)
Registration/Login: https://unaiverse.io
Code Repositories: https://github.com/collectionlessai/
Main Developers: Stefano Melacci (Project Leader), Christian Di Maio, Tommaso Guidi
RNNTokenLM
RNNTokenLM(num_emb: int, emb_dim: int, y_dim: int, h_dim: int, batch_size: int = 1, *args, **kwargs)
Bases: ModuleWrapper
Token-level language model backed by a single-layer Elman RNN.
At each time step the network embeds the previously predicted token, applies the Elman
recurrence h = tanh(A h + B u), and projects the hidden state to logit space via
C. The recurrence, input, and projection matrices (A, B, C) are
plain torch.nn.Linear layers with no bias. The initial hidden state h_init is
drawn from a standard Normal distribution at construction time and stored as a plain
tensor attribute (not a registered buffer), while u_init is initialised to zeros.
On the first step (first=True) the network uses h_init and u_init; on
subsequent steps it detaches both the previous hidden state and the argmax of the
previous output to avoid backpropagating through time across calls.
This class wraps the inner Net as a ModuleWrapper, so all ModuleWrapper
machinery (device handling, stream-based I/O descriptors, optional learning support) is
available. The processor input is a single scalar torch.long token index; the
processor output is a y_dim-dimensional torch.float32 logit vector.
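The autoregressive step described above can be sketched without any dependencies. This is a minimal illustration of the formula h = tanh(A h + B u) followed by y = C h, not the library's implementation: the helper names, the toy sizes, and the use of token index 0 as a stand-in for u_init are all assumptions.

```python
import math
import random

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def elman_token_step(prev_token, h, emb, A, B, C):
    """Embed the previous token, update the state, return (logits, h_new)."""
    u = emb[prev_token]                                   # embedding lookup
    pre = [a + b for a, b in zip(matvec(A, h), matvec(B, u))]
    h_new = [math.tanh(p) for p in pre]                   # h = tanh(A h + B u)
    logits = matvec(C, h_new)                             # y = C h (no bias)
    return logits, h_new

def rand_matrix(rng, rows, cols):
    return [[rng.gauss(0.0, 0.5) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
num_emb, emb_dim, h_dim = 5, 3, 4                         # toy sizes (assumed)
emb = rand_matrix(rng, num_emb, emb_dim)
A = rand_matrix(rng, h_dim, h_dim)
B = rand_matrix(rng, h_dim, emb_dim)
C = rand_matrix(rng, num_emb, h_dim)

h = [rng.gauss(0.0, 1.0) for _ in range(h_dim)]           # h_init ~ N(0, 1)
token = 0                                                 # stand-in for u_init (assumed)
generated = []
for _ in range(6):
    logits, h = elman_token_step(token, h, emb, A, B, C)
    token = max(range(num_emb), key=lambda i: logits[i])  # argmax is fed back
    generated.append(token)
```

In the real module the fed-back token and hidden state are detached tensors; plain Python numbers carry no gradient, so that detail has no counterpart in this sketch.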
Examples:
>>> import torch
>>> from unaiverse.modules.networks import RNNTokenLM
>>> lm = RNNTokenLM(num_emb=256, emb_dim=32, y_dim=256, h_dim=128)
>>> # Single autoregressive step (first call):
>>> logits = lm.module(first=True) # returns tensor of shape (1, 256)
>>> # Continue from the previous state:
>>> logits = lm.module(first=False)
Initialize an RNNTokenLM with the given vocabulary and architecture sizes.
Builds the inner Net (embedding layer, three weight matrices, and initial
state tensors), then calls ModuleWrapper.__init__ with stream descriptors
derived from the architecture: one scalar torch.long input stream and one
y_dim-dimensional torch.float32 output stream.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| num_emb | int | Vocabulary size; the number of rows in the embedding table. | required |
| emb_dim | int | Dimensionality of each token embedding vector. | required |
| y_dim | int | Dimensionality of the output logit vector (equals | required |
| h_dim | int | Dimensionality of the hidden state vector. | required |
| batch_size | int | Number of sequences processed in parallel. Defaults to 1. | 1 |
| *args | | Additional positional arguments forwarded to | () |
| **kwargs | | Additional keyword arguments forwarded to | {} |
Source code in unaiverse/modules/networks.py
RNN
Bases: ModuleWrapper
Single-layer Elman RNN with a combined main-input and descriptor-input port.
Implements the recurrence h = tanh(A h + B [du; u]) where [du; u] is the
concatenation of a descriptor (delta-u) vector and the flattened main input, and y =
C h. All three weight matrices are bias-free torch.nn.Linear layers. The initial
hidden state h_init is registered as a buffer (so it moves to the correct device
automatically) and is used only when first=True; subsequent steps detach the
previous hidden state and reuse it.
The processor input signature follows the UNaIVERSE RNN convention produced by
get_proc_inputs_and_proc_outputs_for_rnn: two input streams (main tensor of shape
u_shape and descriptor tensor of size d_dim) and one output stream of size
y_dim.
Examples:
>>> import torch
>>> from unaiverse.modules.networks import RNN
>>> rnn = RNN(u_shape=(16,), d_dim=8, y_dim=4, h_dim=64)
>>> u = torch.randn(1, 16)
>>> du = torch.randn(1, 8)
>>> y = rnn.module(u, du, first=True) # returns tensor of shape (1, 4)
>>> y = rnn.module(u, du, first=False) # continues from detached hidden state
Initialize an RNN module with the given architecture sizes.
Computes the flat input dimensionality from u_shape, builds the inner Net
(matrices A, B, C and registered buffer h_init), then delegates to
ModuleWrapper.__init__ using stream descriptors generated by
get_proc_inputs_and_proc_outputs_for_rnn.
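As a rough illustration of the flattening step, the width of the concatenated vector [du; u] is the descriptor size plus the product of the entries of u_shape. The helpers below are hypothetical and only mirror that arithmetic; they are not part of the module.

```python
import math

def rnn_input_dim(u_shape, d_dim):
    """Width of [du; u]: descriptor size plus the flattened main-input size."""
    return d_dim + math.prod(u_shape)

def concat_inputs(du, u_flat):
    """Build [du; u]: descriptor first, then the flattened main input."""
    return list(du) + list(u_flat)

u_shape, d_dim = (3, 4, 4), 8
in_dim = rnn_input_dim(u_shape, d_dim)        # 8 + 3 * 4 * 4
x = concat_inputs([0.0] * d_dim, [1.0] * math.prod(u_shape))
```

For the documented example (u_shape=(16,), d_dim=8) the same arithmetic gives a width of 24 for the input of B.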
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| u_shape | tuple[int] | Shape of the main input tensor, excluding the batch dimension (e.g. | required |
| d_dim | int | Dimensionality of the secondary descriptor (delta-u) input stream. | required |
| y_dim | int | Dimensionality of the output tensor. | required |
| h_dim | int | Dimensionality of the hidden state. | required |
| batch_size | int | Number of sequences processed in parallel. Defaults to 1. | 1 |
| *args | | Additional positional arguments forwarded to | () |
| **kwargs | | Additional keyword arguments forwarded to | {} |
CSSM
CSSM(u_shape: tuple[int], d_dim: int, y_dim: int, h_dim: int, sigma: Callable = tanh, project_every: int = 0, local: bool = False, batch_size: int = 1, *args, **kwargs)
Bases: ModuleWrapper
Continuous-State Space Model: a linear recurrence with a configurable activation.
Implements the update h_new = A h + B [du; u] followed by the output projection
y = C sigma(h), where A and B are dense torch.nn.Linear layers. The
hidden state is stored across steps in the registered buffer h_next; on the first
step h_init (also a registered buffer) is used instead.
Two operating modes are supported via the local flag:
- local=False (default): the hidden state exposed externally is h = h_new (post-update). The gradient dh is the discrete difference (h - h_prev) / delta.
- local=True: the hidden state exposed externally is h = h_prev (pre-update). The gradient dh is the discrete difference (h_new - h) / delta.
When project_every > 0 the adjust_eigs hook is called every
project_every forward steps. In this base class adjust_eigs is a no-op;
subclasses override it to constrain the spectrum of A.
The processor I/O streams follow the RNN convention: two inputs (main tensor and descriptor) and one output tensor.
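The difference between the two local modes can be sketched as follows. This is only an illustration of the definitions given above: the function name is hypothetical, and delta is passed explicitly here because the CSSM constructor shown on this page does not list it.

```python
def expose_state(h_prev, h_new, delta, local):
    """Return (h, dh) as exposed externally under the two `local` modes."""
    if local:
        h = h_prev                                        # pre-update state
        dh = [(n - x) / delta for n, x in zip(h_new, h)]  # (h_new - h) / delta
    else:
        h = h_new                                         # post-update state
        dh = [(x - p) / delta for x, p in zip(h, h_prev)] # (h - h_prev) / delta
    return h, dh

h_prev, h_new = [0.0, 1.0], [0.5, 0.5]
h_global, dh_global = expose_state(h_prev, h_new, delta=0.5, local=False)
h_local, dh_local = expose_state(h_prev, h_new, delta=0.5, local=True)
```

Both modes yield the same finite difference; what changes is which of the two states is reported as h.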
Examples:
>>> import torch
>>> from unaiverse.modules.networks import CSSM
>>> cssm = CSSM(u_shape=(8,), d_dim=4, y_dim=3, h_dim=32, batch_size=2)
>>> u = torch.randn(2, 8)
>>> du = torch.randn(2, 4)
>>> y = cssm.module(u, du, first=True) # tensor of shape (2, 3)
>>> y = cssm.module(u, du, first=False) # continues from stored h_next
Initialize a CSSM module with the given architecture and dynamics options.
Builds the inner Net (matrices A, B, C; registered buffers
h_init and h_next; control attributes), then delegates to
ModuleWrapper.__init__ using stream descriptors from
get_proc_inputs_and_proc_outputs_for_rnn.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| u_shape | tuple[int] | Shape of the main input tensor, excluding the batch dimension. | required |
| d_dim | int | Dimensionality of the secondary descriptor (delta-u) input. | required |
| y_dim | int | Dimensionality of the output tensor. | required |
| h_dim | int | Dimensionality of the hidden state. | required |
| sigma | Callable | Element-wise activation applied to the hidden state before the output projection. Defaults to | tanh |
| project_every | int | If positive, call | 0 |
| local | bool | If | False |
| batch_size | int | Number of sequences processed in parallel. Defaults to 1. | 1 |
| *args | | Additional positional arguments forwarded to | () |
| **kwargs | | Additional keyword arguments forwarded to | {} |
CDiagR
CDiagR(u_shape: tuple[int], d_dim: int, y_dim: int, h_dim: int, sigma: Callable = lambda x: x, project_every: int = 0, local: bool = False, batch_size: int = 1, *args, **kwargs)
Bases: ModuleWrapper
State-space model with a real-valued diagonal recurrence matrix.
Replaces the dense square matrix A used in CSSM with a diagonal one
parameterized as a single torch.nn.Linear layer (diag) mapping a constant
scalar 1 to h_dim values. The recurrence is therefore
``h_new = diag_weights * h + B [du; u]``
where the element-wise multiplication uses the learned diagonal coefficients stored in
diag.weight. The output projection and activation follow the same pattern as
CSSM: y = C sigma(h).
When project_every > 0, adjust_eigs projects each diagonal entry to its sign
(i.e. clips weights to {-1, +1}), enforcing unit-modulus eigenvalues on the
diagonal recurrence.
All weight matrices (diag, B, C) use torch.float32. The hidden-state
buffers h_init and h_next are registered buffers and move with the module to
the correct device.
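A small sketch of the sign projection and of the element-wise recurrence may help. It uses plain Python lists rather than tensors, the helper names are hypothetical, and a zero weight is left at zero here, which is an assumption about an edge case the text above does not cover.

```python
def sign(w):
    """Sign of w as a float: -1.0, 0.0 or +1.0."""
    return float((w > 0) - (w < 0))

def project_to_sign(diag):
    """Snap every diagonal coefficient to its sign."""
    return [sign(w) for w in diag]

def diag_step(diag, h, bx):
    """h_new = diag * h + B [du; u], with the B term precomputed as bx."""
    return [d * x + b for d, x, b in zip(diag, h, bx)]

diag = [0.5, -2.0, 1.3]
projected = project_to_sign(diag)
h = [1.0, 2.0, -3.0]
h_next = diag_step(projected, h, [0.0, 0.0, 0.0])   # zero-input step
```

After projection each coefficient has modulus 1, so a zero-input step can flip signs but leaves the magnitude of every state component unchanged.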
Examples:
>>> import torch
>>> from unaiverse.modules.networks import CDiagR
>>> cdr = CDiagR(u_shape=(10,), d_dim=5, y_dim=4, h_dim=64)
>>> u = torch.randn(1, 10)
>>> du = torch.randn(1, 5)
>>> y = cdr.module(u, du, first=True) # tensor of shape (1, 4)
Initialize a CDiagR module with a real diagonal recurrence.
Builds the inner Net (diagonal linear layer diag, input matrix B,
output matrix C, and registered buffers h_init / h_next), then
delegates to ModuleWrapper.__init__ using stream descriptors from
get_proc_inputs_and_proc_outputs_for_rnn.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| u_shape | tuple[int] | Shape of the main input tensor, excluding the batch dimension. | required |
| d_dim | int | Dimensionality of the secondary descriptor (delta-u) input. | required |
| y_dim | int | Dimensionality of the output tensor. | required |
| h_dim | int | Dimensionality of the hidden state (and the diagonal recurrence). | required |
| sigma | Callable | Element-wise activation applied to the hidden state before the output projection. Defaults to the identity function. | lambda x: x |
| project_every | int | If positive, snap the diagonal weights to their sign every this many forward steps to enforce unit-modulus eigenvalues. A value of | 0 |
| local | bool | If | False |
| batch_size | int | Number of sequences processed in parallel. Defaults to 1. | 1 |
| *args | | Additional positional arguments forwarded to | () |
| **kwargs | | Additional keyword arguments forwarded to | {} |
CDiagC
CDiagC(u_shape: tuple[int], d_dim: int, y_dim: int, h_dim: int, sigma: Callable = lambda x: x, project_every: int = 0, local: bool = False, batch_size: int = 1, *args, **kwargs)
Bases: ModuleWrapper
State-space model with a complex-valued diagonal recurrence matrix.
Identical in structure to CDiagR but promotes all weight matrices (diag,
B, C) to torch.cfloat (complex float). The recurrence is
``h_new = diag_weights * h + B [du; u]``
with complex arithmetic throughout. The output y is the real part of C
sigma(h), ensuring the output stream remains real-valued.
When project_every > 0, adjust_eigs normalizes each complex diagonal entry to
unit modulus (diag.weight /= |diag.weight|), keeping all eigenvalues on the unit
circle in the complex plane.
The hidden-state buffers h_init and h_next are real-valued registered buffers
(they are cast to complex inside the forward pass as needed).
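The unit-modulus normalization can be illustrated with Python's built-in complex numbers. This is a sketch under stated assumptions: the helper names are hypothetical, every diagonal entry is assumed non-zero, and the output matrix C is omitted so that only the real-part step is shown.

```python
def normalize_unit_modulus(diag):
    """Divide each complex entry by its modulus (entries assumed non-zero)."""
    return [w / abs(w) for w in diag]

def complex_diag_step(diag, h, bx):
    """h_new = diag * h + B [du; u], with the B term precomputed as bx."""
    return [d * x + b for d, x, b in zip(diag, h, bx)]

diag = [3 + 4j, 0.5j, -2 + 0j]
unit = normalize_unit_modulus(diag)                 # every entry now has |w| = 1
h = [1 + 0j, 2j, -1 + 1j]
h_next = complex_diag_step(unit, h, [0j, 0j, 0j])   # zero-input step
real_part = [z.real for z in h_next]                # real-valued, like the output stream
```

With unit-modulus coefficients a zero-input step rotates each state component in the complex plane without changing its modulus.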
Examples:
>>> import torch
>>> from unaiverse.modules.networks import CDiagC
>>> cdc = CDiagC(u_shape=(10,), d_dim=5, y_dim=4, h_dim=64)
>>> u = torch.randn(1, 10)
>>> du = torch.randn(1, 5)
>>> y = cdc.module(u, du, first=True) # real tensor of shape (1, 4)
Initialize a CDiagC module with a complex diagonal recurrence.
Builds the inner Net (complex-typed diag, B, C layers, and
real-typed registered buffers h_init / h_next), then delegates to
ModuleWrapper.__init__ using stream descriptors from
get_proc_inputs_and_proc_outputs_for_rnn.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| u_shape | tuple[int] | Shape of the main input tensor, excluding the batch dimension. | required |
| d_dim | int | Dimensionality of the secondary descriptor (delta-u) input. | required |
| y_dim | int | Dimensionality of the output tensor (real-valued). | required |
| h_dim | int | Dimensionality of the hidden state (complex-valued). | required |
| sigma | Callable | Element-wise activation applied to the complex hidden state before the output projection. Defaults to the identity function. | lambda x: x |
| project_every | int | If positive, normalize each complex diagonal weight to unit modulus every this many forward steps. A value of | 0 |
| local | bool | If | False |
| batch_size | int | Number of sequences processed in parallel. Defaults to 1. | 1 |
| *args | | Additional positional arguments forwarded to | () |
| **kwargs | | Additional keyword arguments forwarded to | {} |
CTE
CTE(u_shape: tuple[int], d_dim: int, y_dim: int, h_dim: int, delta: float, sigma: Callable = lambda x: x, project_every: int = 0, local: bool = False, cnu_memories: int = 0, batch_size: int = 1, *args, **kwargs)
Bases: ModuleWrapper
Antisymmetric matrix-exponential state-space model (CTE).
Implements the continuous-time exact (CTE) discretization of a linear state-space
model whose recurrence matrix is constrained to be antisymmetric. The skew-symmetric
matrix A = 0.5 * (W - W^T) is exponentiated with the matrix exponential, and the
input is integrated via the zero-order-hold formula
``h_new = exp(A * delta) * h + A^{-1} * (exp(A * delta) - I) * B [du; u]``
This ensures that all eigenvalues of the recurrence lie on the unit circle in the complex plane, providing inherently stable hidden-state dynamics.
The output is computed as y = C sigma(h) where C is either a standard
torch.nn.Linear layer (when cnu_memories <= 0) or a LinearCNU layer (when
cnu_memories > 0) for memory-augmented readout. The inner module is _CTENet;
see its docstring for the full forward-pass specification.
The local and project_every flags share the same semantics as in CSSM.
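The stability claim above can be checked numerically on a toy example: the exponential of an antisymmetric matrix is orthogonal, so a zero-input step preserves the norm of the hidden state. The sketch below uses a truncated Taylor series in pure Python instead of a library matrix exponential, omits the input term, and uses hypothetical helper names and an arbitrary W.

```python
def skew(W):
    """A = 0.5 * (W - W^T), antisymmetric by construction."""
    n = len(W)
    return [[0.5 * (W[i][j] - W[j][i]) for j in range(n)] for i in range(n)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def expm(M, terms=30):
    """Matrix exponential via a truncated Taylor series (fine for small norms)."""
    n = len(M)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, M)]   # M^k / k!
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result

def norm(v):
    return sum(x * x for x in v) ** 0.5

W = [[0.3, 1.0, -0.5], [0.2, -0.1, 0.7], [0.4, 0.0, 0.9]]          # arbitrary weights
delta = 0.1
A = skew(W)
E = expm([[a * delta for a in row] for row in A])                  # exp(A * delta)

h = [1.0, -2.0, 0.5]
h_new = [sum(E[i][j] * h[j] for j in range(3)) for i in range(3)]  # zero-input step
```

Comparing norm(h_new) with norm(h) shows agreement up to floating-point error, which is the practical meaning of all eigenvalues lying on the unit circle.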
Examples:
>>> import torch
>>> from unaiverse.modules.networks import CTE
>>> cte = CTE(u_shape=(8,), d_dim=4, y_dim=3, h_dim=32, delta=0.1)
>>> u = torch.randn(1, 8)
>>> du = torch.randn(1, 4)
>>> y = cte.module(u, du, first=True) # tensor of shape (1, 3)
>>> y = cte.module(u, du, first=False) # continues from stored h_next
Initialize a CTE module with antisymmetric matrix-exponential dynamics.
Computes the flat input dimension from u_shape, builds a _CTENet inner
module, then delegates to ModuleWrapper.__init__ using stream descriptors from
get_proc_inputs_and_proc_outputs_for_rnn.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| u_shape | tuple[int] | Shape of the main input tensor, excluding the batch dimension. | required |
| d_dim | int | Dimensionality of the secondary descriptor (delta-u) input. | required |
| y_dim | int | Dimensionality of the output tensor. | required |
| h_dim | int | Dimensionality of the hidden state. | required |
| delta | float | Discrete time step used in the matrix-exponential update. Larger values correspond to coarser time discretization. | required |
| sigma | Callable | Element-wise activation applied to the hidden state before the output projection. Defaults to the identity function. | lambda x: x |
| project_every | int | If positive, call | 0 |
| local | bool | If | False |
| cnu_memories | int | If positive, replace the linear output projection with a | 0 |
| batch_size | int | Number of sequences processed in parallel. Defaults to 1. | 1 |
| *args | | Additional positional arguments forwarded to | () |
| **kwargs | | Additional keyword arguments forwarded to | {} |
CTEInitStateBZeroInput
CTEInitStateBZeroInput(u_shape: tuple[int], d_dim: int, y_dim: int, h_dim: int, delta: float, sigma: Callable = lambda x: x, project_every: int = 0, local: bool = False, cnu_memories: int = 0, batch_size: int = 1, *args, **kwargs)
Bases: ModuleWrapper
CTE variant that initializes the hidden state from the input and drives the recurrence with zero inputs.
Specializes _CTENet in two ways:
- init_h: the initial hidden state is set to B(udu) / sum(udu) rather than the random h_init buffer, so the first hidden state is derived directly from the concatenated input [du; u].
- handle_inputs: on every forward step (including the first) both du and u are replaced with zero tensors of matching shape, so the recurrence after initialization is input-free (driven purely by the antisymmetric dynamics).
All other aspects (matrix-exponential update, output projection, local mode,
project_every projection, cnu_memories readout) are identical to CTE.
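The two overrides can be sketched on plain lists. This is one reading of the description above, not the library code: the function names are hypothetical, udu is taken to be the concatenation [du; u], and the sum of its entries is assumed to be non-zero.

```python
def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def init_h_from_input(B, du, u):
    """h0 = B [du; u] / sum([du; u]); assumes the sum is non-zero."""
    udu = list(du) + list(u)
    total = sum(udu)
    return [v / total for v in matvec(B, udu)]

def zero_inputs(du, u):
    """Replace both inputs with zeros of matching length."""
    return [0.0] * len(du), [0.0] * len(u)

B = [[1.0, 0.0, 2.0], [0.0, 1.0, -1.0]]     # toy sizes: h_dim=2, d_dim=1, flat u size=2
du, u = [1.0], [2.0, 1.0]
h0 = init_h_from_input(B, du, u)            # B [du; u] = [3, 1], sum = 4
du0, u0 = zero_inputs(du, u)
drive = matvec(B, du0 + u0)                 # input term seen by the recurrence
```

The input therefore influences the trajectory only through h0; every later update sees a zero input term.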
Initialize a CTEInitStateBZeroInput module.
Builds a specialized _CTENet subclass (Net) that overrides init_h and
handle_inputs, then delegates to ModuleWrapper.__init__ using stream
descriptors from get_proc_inputs_and_proc_outputs_for_rnn.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| u_shape | tuple[int] | Shape of the main input tensor, excluding the batch dimension. | required |
| d_dim | int | Dimensionality of the secondary descriptor (delta-u) input. | required |
| y_dim | int | Dimensionality of the output tensor. | required |
| h_dim | int | Dimensionality of the hidden state. | required |
| delta | float | Discrete time step used in the matrix-exponential update. | required |
| sigma | Callable | Element-wise activation applied to the hidden state before the output projection. Defaults to the identity function. | lambda x: x |
| project_every | int | If positive, call | 0 |
| local | bool | If | False |
| cnu_memories | int | If positive, use a | 0 |
| batch_size | int | Number of sequences processed in parallel. Defaults to 1. | 1 |
| *args | | Additional positional arguments forwarded to | () |
| **kwargs | | Additional keyword arguments forwarded to | {} |
CTEToken
Bases: ModuleWrapper
Token-level variant of CTE with a learned embedding lookup before the recurrence.
Specializes _CTENet by prepending a torch.nn.Embedding layer: when the main
input u is not None, it is first passed through self.embeddings before
entering the standard _CTENet.forward logic. This makes CTEToken suitable for
sequence-to-sequence or language-modelling tasks where inputs are integer token indices
rather than continuous vectors.
Architecture parameters are fixed at construction: delta=1.0, sigma=identity,
project_every=0, local=False, cnu_memories=0, batch_size=1. These
cannot be overridden via constructor arguments; use CTE directly for full control.
The processor input signature matches the standard RNN convention for embedding size
emb_dim (two input streams - an emb_dim-dimensional float tensor and a
d_dim-dimensional descriptor - and one output stream of size y_dim).
Examples:
>>> import torch
>>> from unaiverse.modules.networks import CTEToken
>>> cte_tok = CTEToken(num_emb=128, emb_dim=16, d_dim=8, y_dim=128, h_dim=64)
>>> token_ids = torch.tensor([[42]]) # shape (1, 1)
>>> du = torch.randn(1, 8)
>>> logits = cte_tok.module(token_ids, du, first=True) # shape (1, 128)
Initialize a CTEToken module with an embedding table and CTE dynamics.
Builds a _CTENet subclass (Net) that adds a torch.nn.Embedding layer
and overrides forward to embed integer token inputs before the recurrence.
The inner _CTENet is constructed with fixed hyperparameters (delta=1.0,
identity activation, no projection, global mode, no CNU memories, batch size 1).
Stream descriptors are generated by get_proc_inputs_and_proc_outputs_for_rnn
for an input shape of (emb_dim,).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| num_emb | int | Vocabulary size; the number of rows in the embedding table. | required |
| emb_dim | int | Dimensionality of each token embedding vector. This determines the effective | required |
| d_dim | int | Dimensionality of the secondary descriptor (delta-u) input stream. | required |
| y_dim | int | Dimensionality of the output tensor. | required |
| h_dim | int | Dimensionality of the hidden state. | required |
| *args | | Additional positional arguments forwarded to | () |
| **kwargs | | Additional keyword arguments forwarded to | {} |
CTB
CTB(u_shape: tuple[int], d_dim: int, y_dim: int, h_dim: int, delta: float = 0.1, alpha: float = 0.0, sigma: Callable = lambda x: x, project_every: int = 0, local: bool = False, batch_size: int = 1, *args, **kwargs)
Bases: ModuleWrapper
Block-structured state-space model with 2x2 antisymmetric rotation blocks.
Implements a structured linear recurrence whose recurrence matrix is block-diagonal with 2x2 blocks of the form
``[[1 - delta*alpha, delta*omega], [-delta*omega, 1 - delta*alpha]]``
where omega is a per-block learnable frequency parameter and alpha is a
dissipation coefficient. This parameterization is a first-order Euler approximation to
exact block-rotation dynamics and is significantly cheaper to compute than the full
matrix exponential used by CTBE.
Three eigenvalue projection modes are selected via the value of alpha at construction
time:
- alpha > 0: constant dissipation mode (project_method = 'const'). The alpha buffer is fixed to the given value.
- alpha == 0: unit-modulus mode (project_method = 'modulus'). When project_every > 0, adjust_eigs normalizes each block's [ones, omega] pair to unit modulus.
- alpha == -1: adaptive alpha mode (project_method = 'alpha'). When project_every > 0, adjust_eigs computes alpha from the current omega and delta to keep eigenvalues on the unit circle.
The hidden dimension h_dim must be even because the state is partitioned into
h_dim // 2 2x2 blocks. Raises AssertionError at construction if h_dim is
odd.
The processor I/O convention follows the standard RNN pattern: two input streams (main
tensor of shape u_shape and descriptor of size d_dim) and one output stream of
size y_dim.
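To see why a projection is needed at all, note that a block of the form shown above has eigenvalues (1 - delta*alpha) ± i*delta*omega, so with alpha = 0 their modulus exceeds 1 whenever omega is non-zero. The sketch below computes that modulus and one value of alpha that restores unit modulus. The closed form is derived from the block itself; whether adjust_eigs uses this exact expression is an assumption, and the function names are hypothetical.

```python
import math

def block_eig_modulus(omega, delta, alpha):
    """|lambda| for [[1-d*a, d*w], [-d*w, 1-d*a]], whose eigenvalues are (1-d*a) +/- i*d*w."""
    return math.hypot(1.0 - delta * alpha, delta * omega)

def unit_circle_alpha(omega, delta):
    """Alpha solving (1 - d*a)^2 + (d*w)^2 = 1; requires |d*w| <= 1."""
    return (1.0 - math.sqrt(1.0 - (delta * omega) ** 2)) / delta

omega, delta = 3.0, 0.1
drift = block_eig_modulus(omega, delta, alpha=0.0)    # Euler step without dissipation
alpha = unit_circle_alpha(omega, delta)
fixed = block_eig_modulus(omega, delta, alpha)        # back on the unit circle
```

Here drift is slightly above 1, so the undamped first-order update slowly inflates the state, while the compensating alpha brings the modulus back to 1 up to rounding.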
Examples:
>>> import torch
>>> from unaiverse.modules.networks import CTB
>>> ctb = CTB(u_shape=(8,), d_dim=4, y_dim=3, h_dim=32, delta=0.1)
>>> u = torch.randn(1, 8)
>>> du = torch.randn(1, 4)
>>> y = ctb.module(u, du, first=True) # tensor of shape (1, 3)
>>> y = ctb.module(u, du, first=False) # continues from stored h_next
Initialize a CTB module with block-rotation dynamics.
Validates that h_dim is even, then builds the inner Net (learnable
omega frequency vector, buffers alpha and ones, input matrix B,
output matrix C, and registered hidden-state buffers h_init / h_next).
Delegates to ModuleWrapper.__init__ using stream descriptors from
get_proc_inputs_and_proc_outputs_for_rnn.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| u_shape | tuple[int] | Shape of the main input tensor, excluding the batch dimension. | required |
| d_dim | int | Dimensionality of the secondary descriptor (delta-u) input stream. | required |
| y_dim | int | Dimensionality of the output tensor. | required |
| h_dim | int | Dimensionality of the hidden state. Must be even (2x2 block structure). | required |
| delta | float | Discrete time step for the first-order rotation update. Defaults to 0.1. | 0.1 |
| alpha | float | Dissipation coefficient and projection mode selector. Positive values set constant dissipation; zero selects unit-modulus projection; | 0.0 |
| sigma | Callable | Element-wise activation applied to the hidden state before the output projection. Defaults to the identity function. | lambda x: x |
| project_every | int | If positive, call | 0 |
| local | bool | If | False |
| batch_size | int | Number of sequences processed in parallel. Defaults to 1. | 1 |
| *args | | Additional positional arguments forwarded to | () |
| **kwargs | | Additional keyword arguments forwarded to | {} |
Raises:
| Type | Description |
|---|---|
| AssertionError | If |
CTBE
CTBE(u_shape: tuple[int], d_dim: int, y_dim: int, h_dim: int, delta: float, sigma: Callable = lambda x: x, project_every: int = 0, local: bool = False, cnu_memories: int = 0, batch_size: int = 1, *args, **kwargs)
Bases: ModuleWrapper
Block-structured state-space model with exact trigonometric rotation.
Implements a 2x2-block antisymmetric recurrence using the exact matrix-exponential
solution for each block. For each block k, the recurrence is
``h1_new = cos(omega_k * delta) * h1 + sin(omega_k * delta) * h2 + inp1``
``h2_new = -sin(omega_k * delta) * h1 + cos(omega_k * delta) * h2 + inp2``
where the input terms inp1 and inp2 are derived from the zero-order-hold
integral of the input matrix B, ensuring that all eigenvalues of the recurrence
lie exactly on the unit circle. This is the exact-discretization counterpart of the
first-order approximation used by CTB.
The hidden dimension h_dim must be even. Optional memory-augmented output is
supported via a LinearCNU readout layer when cnu_memories > 0. The inner
module is _CTBENet; see its docstring for the full forward-pass specification.
The local and project_every flags share the same semantics as in CSSM.
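For a single block, the rotation above is the matrix exponential of A = [[0, omega], [-omega, 0]] scaled by delta, and the zero-order-hold term A^{-1} (exp(A*delta) - I) b has a closed form as well. The sketch below works that case out under stated assumptions: b1 and b2 stand for this block's slice of B [du; u], omega is non-zero, and the function name is hypothetical. It is not confirmed that _CTBENet computes its input terms exactly this way.

```python
import math

def ctbe_block_step(h1, h2, b1, b2, omega, delta):
    """Exact rotation of one 2x2 block plus a zero-order-hold input term."""
    c, s = math.cos(omega * delta), math.sin(omega * delta)
    # A^{-1} (exp(A*delta) - I) applied to (b1, b2), for A = [[0, omega], [-omega, 0]]
    inp1 = (s * b1 + (1.0 - c) * b2) / omega
    inp2 = ((c - 1.0) * b1 + s * b2) / omega
    h1_new = c * h1 + s * h2 + inp1
    h2_new = -s * h1 + c * h2 + inp2
    return h1_new, h2_new

# Zero-input rollout: the state rotates on a circle of constant radius.
h1, h2 = 1.0, 0.0
for _ in range(100):
    h1, h2 = ctbe_block_step(h1, h2, 0.0, 0.0, omega=2.0, delta=0.1)
radius = math.hypot(h1, h2)

# For very small omega the input term approaches delta * b, as expected for a held input.
g1, g2 = ctbe_block_step(0.0, 0.0, 1.0, 0.0, omega=1e-3, delta=0.1)
```

Unlike the first-order update, the radius does not drift over the rollout, which is the benefit of the exact discretization.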
Examples:
>>> import torch
>>> from unaiverse.modules.networks import CTBE
>>> ctbe = CTBE(u_shape=(8,), d_dim=4, y_dim=3, h_dim=32, delta=0.1)
>>> u = torch.randn(1, 8)
>>> du = torch.randn(1, 4)
>>> y = ctbe.module(u, du, first=True) # tensor of shape (1, 3)
>>> y = ctbe.module(u, du, first=False) # continues from stored h_next
Initialize a CTBE module with exact trigonometric block-rotation dynamics.
Computes the flat input dimension from u_shape, builds a _CTBENet inner
module (which asserts that h_dim is even), then delegates to
ModuleWrapper.__init__ using stream descriptors from
get_proc_inputs_and_proc_outputs_for_rnn.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| u_shape | tuple[int] | Shape of the main input tensor, excluding the batch dimension. | required |
| d_dim | int | Dimensionality of the secondary descriptor (delta-u) input stream. | required |
| y_dim | int | Dimensionality of the output tensor. | required |
| h_dim | int | Dimensionality of the hidden state. Must be even (2x2 block structure). | required |
| delta | float | Discrete time step passed to the trigonometric rotation formula. | required |
| sigma | Callable | Element-wise activation applied to the hidden state before the output projection. Defaults to the identity function. | lambda x: x |
| project_every | int | If positive, call | 0 |
| local | bool | If | False |
| cnu_memories | int | If positive, replace the linear output projection with a | 0 |
| batch_size | int | Number of sequences processed in parallel. Defaults to 1. | 1 |
| *args | | Additional positional arguments forwarded to | () |
| **kwargs | | Additional keyword arguments forwarded to | {} |
Raises:
| Type | Description |
|---|---|
| AssertionError | If |
CTBEInitStateBZeroInput
CTBEInitStateBZeroInput(u_shape: tuple[int], d_dim: int, y_dim: int, h_dim: int, delta: float, sigma: Callable = lambda x: x, project_every: int = 0, local: bool = False, cnu_memories: int = 0, batch_size: int = 1, *args, **kwargs)
Bases: ModuleWrapper
CTBE variant that initializes the hidden state from the input and drives the recurrence with zero inputs.
Specializes _CTBENet in two ways:
- init_h: the initial hidden state is set to B(udu) / sum(udu) rather than the random h_init buffer, so the first hidden state is derived directly from the concatenated input [du; u].
- handle_inputs: on every forward step (including the first) both du and u are replaced with zero tensors of matching shape, so the recurrence after initialization is input-free (driven purely by the trigonometric rotation).
All other aspects (exact cosine/sine block rotation, output projection, local mode,
project_every projection, cnu_memories readout) are identical to CTBE.
Initialize a CTBEInitStateBZeroInput module.
Builds a specialized _CTBENet subclass (Net) that overrides init_h
and handle_inputs, then delegates to ModuleWrapper.__init__ using stream
descriptors from get_proc_inputs_and_proc_outputs_for_rnn.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| u_shape | tuple[int] | Shape of the main input tensor, excluding the batch dimension. | required |
| d_dim | int | Dimensionality of the secondary descriptor (delta-u) input stream. | required |
| y_dim | int | Dimensionality of the output tensor. | required |
| h_dim | int | Dimensionality of the hidden state. Must be even (2x2 block structure). | required |
| delta | float | Discrete time step passed to the trigonometric rotation formula. | required |
| sigma | Callable | Element-wise activation applied to the hidden state before the output projection. Defaults to the identity function. | lambda x: x |
| project_every | int | If positive, call | 0 |
| local | bool | If | False |
| cnu_memories | int | If positive, use a | 0 |
| batch_size | int | Number of sequences processed in parallel. Defaults to 1. | 1 |
| *args | | Additional positional arguments forwarded to | () |
| **kwargs | | Additional keyword arguments forwarded to | {} |
Raises:
| Type | Description |
|---|---|
| AssertionError | If |
CNN
Bases: ModuleWrapper
Convolutional image-feature extractor with a sigmoid-activated output.
Implements a three-block convolutional backbone followed by a fully-connected head.
Each convolutional block consists of a Conv2d layer, a ReLU activation, and
an AvgPool2d downsampling step. The final feature vector is produced by a lazy
linear layer (2048 units, ReLU) followed by a Linear(2048, d_dim) projection and
a Sigmoid activation, so every output element lies in (0, 1).
Input transforms (resize, crop, normalization) are selected automatically via
transforms_factory based on in_channels and in_res:
- in_channels == 3: "rgb<in_res>" transform (e.g. "rgb32").
- otherwise: "gray<in_res>" transform (e.g. "gray32").
The processor I/O streams are configured by
get_proc_inputs_and_proc_outputs_for_image_classification: one image input stream
and one d_dim-dimensional float output stream.
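The preset-selection rule can be written as a one-line helper. The function below is hypothetical and only reproduces the naming rule stated above; how transforms_factory looks up the preset internally is not shown on this page.

```python
def transform_key(in_channels, in_res):
    """Preset name: 'rgb<in_res>' for 3-channel input, 'gray<in_res>' otherwise."""
    prefix = "rgb" if in_channels == 3 else "gray"
    return f"{prefix}{in_res}"

rgb_key = transform_key(in_channels=3, in_res=32)
gray_key = transform_key(in_channels=1, in_res=28)
```

Note that any channel count other than 3 falls into the grayscale branch under this rule, not only in_channels == 1.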
Examples:
>>> import torch
>>> from unaiverse.modules.networks import CNN
>>> cnn = CNN(d_dim=64, in_channels=3, in_res=32)
>>> # A random RGB image tensor of shape (3, 32, 32), without a batch dimension:
>>> img = torch.randn(3, 32, 32)
>>> # (Actual inference is done through the processor pipeline, not directly here.)
Initialize a CNN feature extractor.
Builds the convolutional backbone and fully-connected head, generates input
transforms from transforms_factory, and delegates to
ModuleWrapper.__init__ with stream descriptors from
get_proc_inputs_and_proc_outputs_for_image_classification.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| d_dim | int | Dimensionality of the output feature vector (number of sigmoid units). | required |
| in_channels | int | Number of input image channels (3 for RGB, 1 for grayscale). Defaults to 3. | 3 |
| in_res | int | Spatial resolution (height and width) of the input image in pixels. Determines which transform preset is loaded. Defaults to 32. | 32 |
| *args | | Additional positional arguments forwarded to | () |
| **kwargs | | Additional keyword arguments forwarded to | {} |
Source code in unaiverse/modules/networks.py
CNNCNU
¶
CNNCNU(d_dim: int, cnu_memories: int, in_channels: int = 3, in_res: int = 32, delta: int = 1, scramble: bool = False, *args, **kwargs)
Bases: ModuleWrapper
Convolutional image-feature extractor with a LinearCNU memory-augmented head.
Shares the same three-block convolutional backbone as CNN (Conv2d -> ReLU ->
AvgPool2d, repeated three times, followed by a lazy linear layer with 2048 units and
ReLU), but replaces the final torch.nn.Linear projection with a LinearCNU
layer. The LinearCNU head uses a content-addressable key-value memory of
cnu_memories slots to produce contextually adapted feature vectors, and its output
is passed through a Sigmoid activation.
Input transforms are selected by transforms_factory in the same way as CNN.
The processor I/O is configured by
get_proc_inputs_and_proc_outputs_for_image_classification.
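The idea of a content-addressable key-value memory can be illustrated with a minimal pure-Python sketch. It assumes dot-product similarity between the input and each key, a softmax over the slots, and a blend of per-slot weight matrices; these choices and the names `softmax`, `memory_linear`, and `sigmoid` are illustrative assumptions, not the actual LinearCNU implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def memory_linear(x, keys, slot_weights):
    """Blend per-slot weight matrices by key similarity, then apply them to x.

    x:            input vector of length n
    keys:         one key vector of length n per memory slot
    slot_weights: one (d_dim x n) weight matrix per memory slot
    """
    # Content-based addressing: dot-product similarity between x and each key.
    attention = softmax([sum(k * v for k, v in zip(key, x)) for key in keys])
    output = []
    for row in range(len(slot_weights[0])):
        # Each weight row is the attention-weighted mix of the slots' rows.
        mixed = [sum(a * w[row][col] for a, w in zip(attention, slot_weights))
                 for col in range(len(x))]
        output.append(sum(m * v for m, v in zip(mixed, x)))
    return output


def sigmoid(values):
    """Element-wise logistic function, keeping every output in (0, 1)."""
    return [1.0 / (1.0 + math.exp(-v)) for v in values]


keys = [[50.0, 0.0], [-50.0, 0.0]]          # two memory slots
slot_weights = [[[2.0, 0.0]], [[-2.0, 0.0]]]  # d_dim = 1 for brevity
features = sigmoid(memory_linear([1.0, 0.0], keys, slot_weights))
```

In this toy setting the input matches the first key almost exclusively, so the output is driven by the first slot's weights; a different input would select a different mix.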
Initialize a CNNCNU feature extractor with a memory-augmented head.
Builds the convolutional backbone with a LinearCNU output layer, generates
input transforms from transforms_factory, and delegates to
ModuleWrapper.__init__ with stream descriptors from
get_proc_inputs_and_proc_outputs_for_image_classification.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| d_dim | int | Dimensionality of the output feature vector (number of sigmoid units). | required |
| cnu_memories | int | Number of key-value memory slots in the | required |
| in_channels | int | Number of input image channels (3 for RGB, 1 for grayscale). Defaults to 3. | 3 |
| in_res | int | Spatial resolution (height and width) of the input image in pixels. Defaults to 32. | 32 |
| delta | int | Delta hyperparameter passed to | 1 |
| scramble | bool | If | False |
| *args | | Additional positional arguments forwarded to | () |
| **kwargs | | Additional keyword arguments forwarded to | {} |
Source code in unaiverse/modules/networks.py
SingleLayerCNU
¶
SingleLayerCNU(d_dim: int, cnu_memories: int, in_channels: int = 3, in_res: int = 32, delta: int = 1, scramble: bool = False, *args, **kwargs)
Bases: ModuleWrapper
Single-layer image classifier built entirely from a LinearCNU memory layer.
Flattens the input image to a one-dimensional vector and passes it through a single
LinearCNU layer followed by a Sigmoid activation. No convolutional backbone is
used; the entire spatial structure of the image is handled by the memory-augmented
linear layer. This makes the model fast and lightweight at the cost of spatial
invariance.
The flat input size is computed as in_res * in_res * in_channels. Input transforms
are selected by transforms_factory in the same way as CNN. The processor I/O
is configured by get_proc_inputs_and_proc_outputs_for_image_classification.
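The flat input size formula can be checked with a short sketch. The helper names `flat_input_size` and `flatten_image` are hypothetical, and the nested-list image stands in for a tensor.

```python
def flat_input_size(in_res: int, in_channels: int = 3) -> int:
    """Number of scalar inputs after flattening a square image."""
    return in_res * in_res * in_channels


def flatten_image(image):
    """Flatten a nested [channel][row][col] list into one flat list."""
    return [value for channel in image for row in channel for value in row]


# A tiny 3-channel, 2x2 image used only to exercise the helpers.
tiny = [[[0.0, 0.1], [0.2, 0.3]] for _ in range(3)]
flat = flatten_image(tiny)
rgb_size = flat_input_size(32, 3)    # 32 * 32 * 3
mnist_size = flat_input_size(28, 1)  # 28 * 28 * 1
```

The flattened vector carries no notion of pixel neighbourhood, which is the loss of spatial invariance mentioned above.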
Initialize a SingleLayerCNU classifier with a flat LinearCNU head.
Builds a sequential model (Flatten -> LinearCNU -> Sigmoid), generates input
transforms from transforms_factory, and delegates to
ModuleWrapper.__init__ with stream descriptors from
get_proc_inputs_and_proc_outputs_for_image_classification.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| d_dim | int | Dimensionality of the output feature vector (number of sigmoid units). | required |
| cnu_memories | int | Number of key-value memory slots in the | required |
| in_channels | int | Number of input image channels (3 for RGB, 1 for grayscale). Defaults to 3. | 3 |
| in_res | int | Spatial resolution (height and width) of the input image in pixels. The flat input dimension is | 32 |
| delta | int | Delta hyperparameter passed to | 1 |
| scramble | bool | If | False |
| *args | | Additional positional arguments forwarded to | () |
| **kwargs | | Additional keyword arguments forwarded to | {} |
Source code in unaiverse/modules/networks.py
CNNMNIST
¶
Bases: CNN
CNN pre-configured for grayscale MNIST images (28x28 pixels, 1 channel).
A thin convenience subclass that calls CNN.__init__ with in_channels=1 and
in_res=28 forced in kwargs. After the parent is initialized, each input
stream's per-property transform is overridden with the "gray_mnist" preset from
transforms_factory, which applies the standard MNIST normalization.
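The "force the keyword arguments, then override the transform" pattern can be sketched with stand-in classes. `BaseExtractor` and `MNISTExtractor` are hypothetical and only record their configuration; they do not build any network.

```python
class BaseExtractor:
    """Hypothetical stand-in for CNN that only stores its configuration."""

    def __init__(self, d_dim, in_channels=3, in_res=32, **kwargs):
        self.d_dim = d_dim
        self.in_channels = in_channels
        self.in_res = in_res
        # Same preset naming rule as described for CNN.
        self.transform = ("rgb" if in_channels == 3 else "gray") + str(in_res)


class MNISTExtractor(BaseExtractor):
    """Hypothetical stand-in for CNNMNIST."""

    def __init__(self, *args, **kwargs):
        # Force the MNIST geometry, overriding whatever the caller passed by keyword.
        kwargs["in_channels"] = 1
        kwargs["in_res"] = 28
        super().__init__(*args, **kwargs)
        # Replace the automatically selected preset after the parent is initialized.
        self.transform = "gray_mnist"


model = MNISTExtractor(d_dim=10, in_channels=3, in_res=64)
```

Overwriting the entries in `kwargs` before calling the parent means a caller cannot accidentally configure the subclass for a non-MNIST geometry through keyword arguments.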
Source code in unaiverse/modules/networks.py
SingleLayerCNUMNIST
¶
ResNet
¶
Bases: ModuleWrapper
Source code in unaiverse/modules/networks.py
ResNetCNU
¶
ResNetCNU(d_dim: int, cnu_memories: int, delta: int = 1, scramble: bool = False, freeze_backbone: bool = True, *args, **kwargs)
Bases: ModuleWrapper
Source code in unaiverse/modules/networks.py
ViT
¶
Bases: ModuleWrapper
Source code in unaiverse/modules/networks.py
DenseNet
¶
Bases: ModuleWrapper
Source code in unaiverse/modules/networks.py
EfficientNet
¶
Bases: ModuleWrapper
Source code in unaiverse/modules/networks.py
FasterRCNN
¶
Bases: ModuleWrapper
Source code in unaiverse/modules/networks.py
labels
instance-attribute
¶
labels: list[str] = ['__background__', 'person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light', 'fire hydrant', 'N/A', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra', 'giraffe', 'N/A', 'backpack', 'umbrella', 'N/A', 'N/A', 'handbag', 'tie', 'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard', 'tennis racket', 'bottle', 'N/A', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch', 'potted plant', 'bed', 'N/A', 'dining table', 'N/A', 'N/A', 'toilet', 'N/A', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone', 'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'N/A', 'book', 'clock', 'vase', 'scissors', 'teddy bear', 'hair drier', 'toothbrush']
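A detector's integer class ids are typically mapped to names by indexing into a list like this one. The sketch below shows one way to do that while dropping the 'N/A' placeholders and low-confidence hits. The helper `decode_detections`, the score threshold, and the sample ids are hypothetical; only the first entries of the list are copied for brevity.

```python
# Excerpt of the first 14 entries of the labels list above.
LABELS_EXCERPT = ['__background__', 'person', 'bicycle', 'car', 'motorcycle',
                  'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light',
                  'fire hydrant', 'N/A', 'stop sign']


def decode_detections(label_ids, scores, labels, min_score=0.5):
    """Map class ids to names, skipping placeholders and weak detections."""
    decoded = []
    for idx, score in zip(label_ids, scores):
        name = labels[idx]
        if score >= min_score and name not in ('N/A', '__background__'):
            decoded.append((name, score))
    return decoded


found = decode_detections([1, 3, 12, 2], [0.9, 0.4, 0.95, 0.7], LABELS_EXCERPT)
```

The 'N/A' entries keep the list aligned with the original category ids, so indexing stays a direct lookup.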
TinyLLama
¶
Bases: ModuleWrapper
Source code in unaiverse/modules/networks.py
LLama
¶
Bases: ModuleWrapper
Source code in unaiverse/modules/networks.py
Phi
¶
Bases: ModuleWrapper
Source code in unaiverse/modules/networks.py
LangSegmentAnything
¶
Bases: ModuleWrapper
Source code in unaiverse/modules/networks.py
highlight_masks_on_image
staticmethod
¶
Source code in unaiverse/modules/networks.py
SmolVLM
¶
Bases: ModuleWrapper
Source code in unaiverse/modules/networks.py
SiteRAG
¶
SiteRAG(site_url: str, site_folder: str = join('rag', 'downloaded_site'), db_folder: str = join('rag', 'chroma_db'), *args, **kwargs)
Bases: ModuleWrapper
Source code in unaiverse/modules/networks.py
embedder
instance-attribute
¶
embedder = SentenceTransformerEmbeddings(model_name='all-MiniLM-L6-v2', model_kwargs={'device': type})
crawl_website
¶
Source code in unaiverse/modules/networks.py
crawled_site_to_rag_knowledge_base
¶
Source code in unaiverse/modules/networks.py
FeatherlessAPI
¶
FeatherlessAPI(model: str | None = None, cost: int = 1, system_prompt: str = '', process_id: str | None = None, max_tokens: int = -1, temperature: float = -1.0, top_p: float = -1.0, top_k: int = -1, frequency_penalty: float | None = None, presence_penalty: float | None = None, repetition_penalty: float | None = None, min_p: float | None = None, sampler: dict | None = None, connect_timeout: float = 15.0, *args, **kwargs)
Bases: ModuleWrapper
Callable handle onto the shared Featherless gateway.
Typical usage::

    api = FeatherlessAPI(model="some-model-id", cost=2)
    text = api("write me a haiku")  # Routed through the gateway

One instance per logical caller. The model and unit cost are fixed at construction, so the callable takes only the prompt string. Construction bootstraps the shared server if needed (self-spawning, race-safe) and opens this caller's persistent registration socket: the liveness token whose lifetime equals this object's interest in the gateway. Closing the instance (or letting the process die) releases it; when the last instance goes away the server shuts itself down.
The whole client lifecycle (bootstrap, registration, request round-trip) is self-contained here; callers never touch the server internals.
Create a FeatherlessAPI handle and connect it to the shared gateway.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| model | str \| None | The model identifier used for every call (None lets the server fall back to its MODEL_ID default). | None |
| cost | int | The unit cost charged for every call (one of VALID_COSTS) (Default: 1). | 1 |
| system_prompt | str | The system prompt prepended to every call ("" means no system prompt) (Default: ""). | '' |
| process_id | str \| None | Identifier used for round-robin fairness; defaults to this process's PID. | None |
| max_tokens | int | Maximum number of tokens to generate per call (-1 means no limit) (Default: -1). | -1 |
| temperature | float | Sampling temperature for every call (negative lets the API use its default) (Default: -1.0). | -1.0 |
| top_p | float | Nucleus-sampling probability (negative lets the API use its default) (Default: -1.0). | -1.0 |
| top_k | int | Top-k sampling cutoff (negative lets the API use its default) (Default: -1). | -1 |
| frequency_penalty | float \| None | Frequency penalty (None lets the API use its default) (Default: None). | None |
| presence_penalty | float \| None | Presence penalty (None lets the API use its default) (Default: None). | None |
| repetition_penalty | float \| None | Repetition penalty, a vLLM/Featherless extension (None uses the default) (Default: None). | None |
| min_p | float \| None | Minimum-probability cutoff, a vLLM/Featherless extension (None uses the default) (Default: None). | None |
| sampler | dict \| None | Extra sampler params merged last (its keys win); use it for any knob not covered above. | None |
| connect_timeout | float | Maximum seconds to wait for the gateway server to come up (Default: 15.0). | 15.0 |
Source code in unaiverse/modules/networks.py
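The sentinel conventions in the table (negative numbers and None mean "use the API default", and `sampler` is merged last) can be sketched as a small payload builder. The function `build_sampling_params` is hypothetical, and the assumption that payload keys mirror the constructor parameter names is not confirmed by the module.

```python
def build_sampling_params(max_tokens=-1, temperature=-1.0, top_p=-1.0, top_k=-1,
                          frequency_penalty=None, presence_penalty=None,
                          repetition_penalty=None, min_p=None, sampler=None):
    """Collect only the sampling knobs that were explicitly set."""
    params = {}
    # Negative values act as "unset" for these four knobs.
    for name, value in (("max_tokens", max_tokens), ("temperature", temperature),
                        ("top_p", top_p), ("top_k", top_k)):
        if value >= 0:
            params[name] = value
    # None acts as "unset" for the penalty-style knobs.
    for name, value in (("frequency_penalty", frequency_penalty),
                        ("presence_penalty", presence_penalty),
                        ("repetition_penalty", repetition_penalty),
                        ("min_p", min_p)):
        if value is not None:
            params[name] = value
    # Extra sampler entries are merged last, so their keys win on conflict.
    params.update(sampler or {})
    return params


payload = build_sampling_params(temperature=0.7, min_p=0.05,
                                sampler={"temperature": 0.2, "seed": 1})
```

Here the `sampler` entry for `temperature` replaces the value passed as a named argument, which matches the "its keys win" rule.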
close
¶
Close both the request and the persistent registration socket, releasing this caller's interest.
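Since the handle exposes a `close` method, one way to guarantee the release is `contextlib.closing` from the standard library. The sketch below uses a hypothetical stand-in class, `FakeHandle`, so that it runs without a gateway; whether FeatherlessAPI also supports the `with` statement directly is not stated here.

```python
from contextlib import closing


class FakeHandle:
    """Hypothetical stand-in for a callable handle with a close() method."""

    def __init__(self):
        self.closed = False

    def __call__(self, prompt):
        if self.closed:
            raise RuntimeError("handle is closed")
        return f"echo: {prompt}"

    def close(self):
        # A real handle would close its sockets here.
        self.closed = True


# closing() calls api.close() on exit, even if the body raises.
with closing(FakeHandle()) as api:
    reply = api("write me a haiku")
```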