unaiverse.modules.cnu.layers
What this module does
Defines CNU-backed neural layers (a linear layer and a 2D convolution) whose weights are generated on the fly by an underlying CNUs memory module rather than stored as static parameters.
layers
A Collectionless AI Project (https://collectionless.ai)
Registration/Login: https://unaiverse.io
Code Repositories: https://github.com/collectionlessai/
Main Developers: Stefano Melacci (Project Leader), Christian Di Maio, Tommaso Guidi
LinearCNU
LinearCNU(in_features, out_features, bias=True, device=None, shared_keys=True, key_mem_units=2, psi_fn='identity', key_size=None, **kwargs)
Bases: CNUs
A CNU-based drop-in replacement for torch.nn.Linear.
LinearCNU wraps a bank of Contextual Neural Units (CNUs) so that the
output of a fully-connected layer is computed by first retrieving the most
relevant memory slots from the CNU bank and then blending them into a set of
per-sample weight matrices. The resulting weights are applied to the input
with a batched matrix-vector product, producing sample-specific linear
projections rather than a fixed shared weight matrix.
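The retrieve-and-blend step described above can be pictured with a minimal sketch. It assumes a plain softmax over key/input dot products as the blending rule and an identity projection onto the key space; the actual CNU retrieval logic is not shown on this page, and `blend_memories` is a hypothetical helper, not part of the library.

```python
import torch

def blend_memories(x, keys, memories, out_features, in_features):
    """Blend memory units into per-sample weights (illustrative assumption only)."""
    # Similarity between each input and each key: [batch, m]
    scores = x @ keys.t()
    # Soft attention over the m memory units (assumed blending rule)
    alpha = torch.softmax(scores, dim=1)
    # Convex combination of the flattened memory units: [batch, out * (in + 1)]
    blended = alpha @ memories
    # One weight matrix, with an extra bias column, per sample
    return blended.view(-1, out_features, in_features + 1)

batch, in_features, out_features, m = 4, 8, 3, 2
x = torch.randn(batch, in_features)
keys = torch.randn(m, in_features)  # identity projection: key size equals in_features
memories = torch.randn(m, out_features * (in_features + 1))
W = blend_memories(x, keys, memories, out_features, in_features)
print(W.shape)
```

Each sample ends up with its own `[out_features, in_features + 1]` matrix, which is the property the layer relies on.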
Two key-sharing modes are supported:
- Shared keys (shared_keys=True, default): a single CNU with concatenated memory units stores all out_features * (in_features + 1) parameters. This is memory-efficient and faster, but all output neurons share the same key-matching logic.
- Independent keys (shared_keys=False): each of the out_features neurons has its own CNU instance with its own keys and memory units. This gives more expressive power at the cost of more parameters.
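To make the trade-off between the two modes concrete, the following sketch counts parameters under stated assumptions: every CNU holds key_mem_units keys of length key_size, and every memory unit stores in_features + 1 values per output neuron. The function `cnu_linear_param_counts` is hypothetical and only mirrors the description above.

```python
def cnu_linear_param_counts(in_features, out_features, key_mem_units,
                            key_size, bias=True, shared_keys=True):
    """Rough parameter count for a CNU-based linear layer (assumed layout)."""
    per_neuron = in_features + (1 if bias else 0)
    # Memory units hold the weights (and bias) for all output neurons
    memory_params = key_mem_units * out_features * per_neuron
    # Shared mode: one CNU; independent mode: one CNU per output neuron
    num_cnus = 1 if shared_keys else out_features
    key_params = num_cnus * key_mem_units * key_size
    return {"keys": key_params, "memories": memory_params}

shared = cnu_linear_param_counts(128, 64, key_mem_units=2, key_size=32)
independent = cnu_linear_param_counts(128, 64, key_mem_units=2, key_size=32,
                                      shared_keys=False)
print(shared)
print(independent)
```

Under these assumptions the memory footprint is the same in both modes, and the extra parameters of the independent mode come entirely from the per-neuron key banks.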
Parameters that control the underlying CNU dynamics (delta,
gamma_alpha, tau_alpha, upd_m, upd_k, etc.) are forwarded
transparently through **kwargs to the CNUs constructor. The
convenience arguments q, d, m, and u are reserved and must
not appear in **kwargs because they are derived automatically.
Attributes:
| Name | Type | Description |
|---|---|---|
| in_features | | Number of input features, stored after construction. |
| out_features | | Number of output features, stored after construction. |
| bias | | |
| shared_keys | | Whether all output neurons share the same key bank. |
Examples:
>>> import torch
>>> from unaiverse.modules.cnu.layers import LinearCNU
>>>
>>> # Drop-in replacement for nn.Linear(128, 64)
>>> layer = LinearCNU(in_features=128, out_features=64, key_mem_units=4)
>>> x = torch.randn(32, 128)
>>> y = layer(x)
>>> y.shape
torch.Size([32, 64])
>>>
>>> # Independent keys, no bias, explicit key size
>>> layer2 = LinearCNU(128, 64, bias=False, shared_keys=False, key_size=32)
>>> layer2(x).shape
torch.Size([32, 64])
Initialize a CNU-based linear layer with the given dimensions and CNU settings.
The constructor validates that the reserved CNU arguments (q, d, m,
u) are not present in **kwargs, then derives their values automatically
from in_features, out_features, and key_mem_units. It delegates the
actual parameter creation to the CNUs parent constructor, moves the module to
device when specified, and sets self.bias to None when bias=False
so that the forward pass can branch on a simple truthiness check.
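The reserved-argument check described above could look roughly like the sketch below. `check_reserved_kwargs` is a hypothetical helper written for illustration; only the reserved names and the AssertionError behaviour come from this page, and `delta=2` is an arbitrary example value.

```python
RESERVED = ("q", "d", "m", "u")

def check_reserved_kwargs(**kwargs):
    """Reject reserved CNU arguments, which the layer derives on its own."""
    clashes = [name for name in RESERVED if name in kwargs]
    assert not clashes, f"reserved CNU arguments must not be passed: {clashes}"
    return kwargs

# Ordinary CNU hyperparameters pass through untouched
forwarded = check_reserved_kwargs(delta=2)

# A reserved name triggers an AssertionError
try:
    check_reserved_kwargs(q=1)
    raised = False
except AssertionError:
    raised = True
print(forwarded, raised)
```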
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| in_features | | Number of input features (the size of each input sample). | required |
| out_features | | Number of output features (the size of each output sample). | required |
| bias | | If … | True |
| device | | Target device for the module's parameters (e.g. … | None |
| shared_keys | | If … | True |
| key_mem_units | | Number of keys and memory units in each CNU (the … | 2 |
| psi_fn | | Name of the function used to project the input onto the key space. | 'identity' |
| key_size | | Dimensionality of each key vector. If … | None |
| **kwargs | | Additional keyword arguments forwarded to … | {} |
Raises:
| Type | Description |
|---|---|
| AssertionError | If any of the reserved kwargs … |
Source code in unaiverse/modules/cnu/layers.py
forward
Compute the CNU-based linear projection of the input.
The CNU bank first retrieves and blends memory units to produce a batch of
sample-specific weight tensors W of shape
[batch, out_features, in_features (+ 1 if bias)]. The tensor is split into
weights and (optional) biases, and the projection is computed as a batched
matrix-vector product identical to torch.nn.Linear - except that the
effective weight matrix is different for each sample in the batch.
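The split-and-project step can be sketched in plain PyTorch. The sketch assumes the bias occupies the last column of W, which is consistent with the stated shape but not confirmed by this page; `per_sample_linear` is a hypothetical stand-alone function, not the layer's own method.

```python
import torch
import torch.nn.functional as F

def per_sample_linear(x, W, has_bias=True):
    """Apply a different weight matrix to each sample in the batch."""
    # W: [batch, out_features, in_features (+ 1 if bias)]
    if has_bias:
        weights, bias = W[..., :-1], W[..., -1]  # assumed layout: bias in last column
    else:
        weights, bias = W, None
    # Batched matrix-vector product: [batch, out, in] x [batch, in, 1] -> [batch, out]
    y = torch.bmm(weights, x.unsqueeze(-1)).squeeze(-1)
    return y if bias is None else y + bias

batch, in_features, out_features = 5, 7, 3
x = torch.randn(batch, in_features)
W = torch.randn(batch, out_features, in_features + 1)
y = per_sample_linear(x, W)

# Reference: one ordinary linear projection per sample
y_ref = torch.stack([F.linear(x[i], W[i, :, :-1], W[i, :, -1])
                     for i in range(batch)])
print(torch.allclose(y, y_ref, atol=1e-5))
```

The comparison against a per-sample loop of F.linear calls shows that the batched product is the same operation as torch.nn.Linear, just with a separate weight matrix for every row of the batch.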
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| x | | Input tensor of shape … | required |
Returns:
| Type | Description |
|---|---|
| | Output tensor of shape … |
reset_parameters
Reset the layer's keys and memory units using the PyTorch nn.Linear convention.
Memory units are initialized to replicate the standard torch.nn.Linear
initialization: weights are drawn with Kaiming uniform (fan-in mode, with
a=math.sqrt(5) as the negative-slope argument), and biases are drawn
uniformly from [-1/sqrt(in_features), 1/sqrt(in_features)]. Each memory unit
stores a flattened concatenation of the corresponding weight matrix and bias
vector.
reset_memories is temporarily set to False before calling the parent's
reset_parameters so that the parent does not overwrite the memories with its
own default initialization scheme; this method then fills the memories directly
with the PyTorch-compatible values.
Note
Keys are reset by the parent CNUs.reset_parameters call. Only the memory
tensors M are overridden here.
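The initialization ranges described for reset_parameters can be checked with a few lines of arithmetic. The sketch below reproduces the Kaiming-uniform bound formula (leaky-ReLU gain with negative slope a, fan-in mode); `kaiming_uniform_bound` is a hypothetical helper, and the memory-unit length assumes the shared-keys layout.

```python
import math

def kaiming_uniform_bound(fan_in, a=math.sqrt(5)):
    """Half-width of the uniform range used by Kaiming uniform (fan-in mode)."""
    gain = math.sqrt(2.0 / (1.0 + a ** 2))  # leaky-ReLU gain for negative slope a
    std = gain / math.sqrt(fan_in)
    return math.sqrt(3.0) * std

in_features, out_features = 128, 64
w_bound = kaiming_uniform_bound(in_features)
b_bound = 1.0 / math.sqrt(in_features)

# Length of one flattened memory unit holding weights plus bias (shared-keys layout)
memory_unit_size = out_features * (in_features + 1)
print(w_bound, b_bound, memory_unit_size)
```

With a = sqrt(5) the gain is sqrt(1/3), so the weight bound collapses to 1/sqrt(in_features), the same range used for the biases.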
Conv2d
Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, padding_mode='zeros', dilation=1, groups=1, bias=True, device=None, shared_keys=True, key_mem_units=2, psi_fn='reduce2d', key_size=None, **kwargs)
Bases: CNUs
A CNU-based drop-in replacement for torch.nn.Conv2d.
Conv2d replaces the fixed convolutional filter bank of a standard 2-D
convolution with a bank of Contextual Neural Units. For each sample in the
batch the CNU bank retrieves and blends memory units to produce a
sample-specific set of convolutional filters. The resulting filters are
applied to the input using a grouped convolution so that every sample in
the batch gets its own distinct filter set, while the overall computation
remains vectorised on GPU.
The constructor interface mirrors torch.nn.Conv2d as closely as
possible, with the following additional parameters for CNU control:
shared_keys, key_mem_units, psi_fn, and key_size. All
other CNU hyperparameters (delta, gamma_alpha, upd_m,
upd_k, scramble, etc.) can be supplied through **kwargs.
The default psi_fn='reduce2d' spatially downsamples the input feature
map to a key-sized representation before comparing it against stored keys.
This is appropriate for image inputs; use 'identity' or another mode
when the input has a non-spatial structure.
Two key-sharing modes are supported (see LinearCNU for the full
description). The default is shared_keys=True.
Attributes:
| Name | Type | Description |
|---|---|---|
| in_channels | | Number of input channels. |
| out_channels | | Number of output channels (filters). |
| kernel_size | | Kernel size as a … |
| stride | | Stride as a … |
| padding | | Padding as an integer, string, or … |
| padding_mode | | One of … |
| dilation | | Dilation as a … |
| groups | | Number of blocked connections from input channels to output channels. |
| bias | | |
| in_features | | Number of scalar values in a single receptive field, equal to … |
Examples:
>>> import torch
>>> from unaiverse.modules.cnu.layers import Conv2d
>>>
>>> # Drop-in replacement for nn.Conv2d(3, 16, kernel_size=3, padding=1)
>>> layer = Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)
>>> x = torch.randn(8, 3, 32, 32)
>>> y = layer(x)
>>> y.shape
torch.Size([8, 16, 32, 32])
>>>
>>> # With strided convolution and independent keys
>>> layer2 = Conv2d(3, 16, kernel_size=5, stride=2, shared_keys=False)
>>> layer2(x).shape
torch.Size([8, 16, 14, 14])
Initialize a CNU-based 2-D convolutional layer.
Scalar arguments kernel_size, stride, and dilation are normalised to
(H, W) tuples. Padding is pre-computed into a reversed repeated form that is
compatible with F.pad for non-zero padding modes. The reserved CNU arguments
q, d, m, and u are derived automatically and must not appear in
**kwargs. After the CNUs parent is constructed the module is optionally
moved to device.
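The argument normalisation described above can be illustrated with two small helpers. Both `to_pair` and `reversed_padding_repeated_twice` are hypothetical names chosen for this sketch; it handles only integer or two-element padding, not string values, and it assumes the reversed form follows the (left, right, top, bottom) order that F.pad expects for 4-D inputs.

```python
def to_pair(value):
    """Normalise a scalar or 2-element sequence to an (H, W) tuple."""
    if isinstance(value, int):
        return (value, value)
    pair = tuple(value)
    if len(pair) != 2:
        raise ValueError(f"expected an int or a pair, got {value!r}")
    return pair

def reversed_padding_repeated_twice(padding):
    """Turn (pad_H, pad_W) into the [left, right, top, bottom] list used by F.pad."""
    pad_h, pad_w = to_pair(padding)
    # F.pad consumes the last dimension first, so width precedes height
    return [pad_w, pad_w, pad_h, pad_h]

print(to_pair(3))
print(to_pair((3, 5)))
print(reversed_padding_repeated_twice((1, 2)))
```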
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| in_channels | | Number of channels in the input image. | required |
| out_channels | | Number of channels produced by the convolution. | required |
| kernel_size | | Size of the convolving kernel. An … | required |
| stride | | Stride of the convolution. Defaults to … | 1 |
| padding | | Zero-padding added to both sides of the input. Accepts an … | 0 |
| padding_mode | | Padding strategy. Must be one of … | 'zeros' |
| dilation | | Spacing between kernel elements. Defaults to … | 1 |
| groups | | Number of blocked connections from … | 1 |
| bias | | If … | True |
| device | | Target device for the module's parameters. If … | None |
| shared_keys | | If … | True |
| key_mem_units | | Number of keys and memory units per CNU (the … | 2 |
| psi_fn | | Name of the function used to project the input feature map onto the key space. | 'reduce2d' |
| key_size | | Total dimensionality of each key vector. For … | None |
| **kwargs | | Additional keyword arguments forwarded to … | {} |
Raises:
| Type | Description |
|---|---|
| ValueError | If … |
| AssertionError | If any of the reserved kwargs … |
kernel_size
instance-attribute
dilation
instance-attribute
forward
Compute the CNU-based 2-D convolution of the input.
For each sample in the batch the CNU bank retrieves and blends memory units to
produce a set of sample-specific convolutional filters. The filters are
reshaped into a standard [out_channels * batch, in_channels_per_group,
kernel_H, kernel_W] filter tensor, and a single grouped F.conv2d call
applies them to all samples simultaneously by stacking all images along the
channel dimension and using groups=batch * self.groups.
Non-zero padding modes (reflect, replicate, circular) are handled by
an explicit F.pad call before the convolution; 'zeros' padding is passed
directly to F.conv2d.
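The stacking trick described above can be reproduced in plain PyTorch. The sketch below takes already-blended per-sample filters as input (how the CNU bank produces them is outside its scope) and uses only zero padding; `per_sample_conv2d` is a hypothetical function name, not the layer's own method.

```python
import torch
import torch.nn.functional as F

def per_sample_conv2d(x, filters, bias=None, stride=1, padding=0,
                      dilation=1, groups=1):
    """Convolve each sample with its own filter set using one grouped conv."""
    b, c_in, h, w = x.shape
    _, c_out, c_in_g, k_h, k_w = filters.shape  # [batch, out, in/groups, kH, kW]
    # Stack all images along the channel axis: [1, batch * in, H, W]
    x_stacked = x.reshape(1, b * c_in, h, w)
    # Stack all filter sets along the output-channel axis
    w_stacked = filters.reshape(b * c_out, c_in_g, k_h, k_w)
    b_stacked = None if bias is None else bias.reshape(b * c_out)
    y = F.conv2d(x_stacked, w_stacked, b_stacked, stride=stride,
                 padding=padding, dilation=dilation, groups=b * groups)
    # Unstack back to one output map per sample
    return y.reshape(b, c_out, y.shape[-2], y.shape[-1])

b, c_in, c_out, k = 4, 3, 6, 3
x = torch.randn(b, c_in, 8, 8)
filters = torch.randn(b, c_out, c_in, k, k)
bias = torch.randn(b, c_out)
y = per_sample_conv2d(x, filters, bias, padding=1)

# Reference: a separate convolution for every sample
y_ref = torch.cat([F.conv2d(x[i:i + 1], filters[i], bias[i], padding=1)
                   for i in range(b)])
print(y.shape, torch.allclose(y, y_ref, atol=1e-4))
```

Because each group only sees the channels of one sample, the single grouped call matches a Python loop over the batch while keeping the work in one kernel launch.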
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| x | | Input tensor of shape … | required |
Returns:
| Type | Description |
|---|---|
| | Output tensor of shape … are determined by the kernel size, stride, padding, and dilation according to the standard … |
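For reference, the output spatial size can be computed by hand. The sketch below assumes the truncated sentence above refers to the output-shape formula documented for torch.nn.Conv2d; `conv2d_output_size` is a hypothetical helper for one spatial dimension.

```python
def conv2d_output_size(size, kernel_size, stride=1, padding=0, dilation=1):
    """Output length along one spatial axis, per the torch.nn.Conv2d formula."""
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1

# The two layers from the Examples section, applied to 32x32 inputs
same_size = conv2d_output_size(32, kernel_size=3, padding=1)
strided = conv2d_output_size(32, kernel_size=5, stride=2)
print(same_size, strided)
```

These values agree with the shapes shown in the Examples section: 32 for the padded 3x3 layer and 14 for the strided 5x5 layer.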