
unaiverse.modules.cnu.layers

What this module does 🔴

Defines CNU-backed neural layers (a linear layer and a 2D convolution) whose weights are generated on the fly, separately for each input sample, by the underlying CNUs memory module rather than stored as static parameters.

layers

A Collectionless AI Project (https://collectionless.ai)
Registration/Login: https://unaiverse.io
Code Repositories: https://github.com/collectionlessai/
Main Developers: Stefano Melacci (Project Leader), Christian Di Maio, Tommaso Guidi

LinearCNU

LinearCNU(in_features, out_features, bias=True, device=None, shared_keys=True, key_mem_units=2, psi_fn='identity', key_size=None, **kwargs)

Bases: CNUs

A CNU-based drop-in replacement for torch.nn.Linear.

LinearCNU wraps a bank of Contextual Neural Units (CNUs) so that the output of a fully-connected layer is computed by first retrieving the most relevant memory slots from the CNU bank and then blending them into a set of per-sample weight matrices. The resulting weights are applied to the input with a batched matrix-vector product, producing sample-specific linear projections rather than a fixed shared weight matrix.

Two key-sharing modes are supported:

  • Shared keys (shared_keys=True, default): a single CNU whose memory units are concatenated, so each unit stores all out_features * (in_features + 1) parameters of the layer (out_features * in_features when bias=False). This is memory-efficient and faster, but all output neurons share the same key-matching logic.
  • Independent keys (shared_keys=False): each of the out_features neurons has its own CNU instance with its own keys and memory units. This gives more expressive power at the cost of more parameters, since every neuron carries its own set of keys.
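
The cost of the two modes can be compared with a small counting sketch. It mirrors how the constructor derives q and u in the source listing on this page, and it relies on two assumptions: the memory tensor has shape [q, m, u] (as read in reset_parameters) and the keys have shape [q, m, d], which is not shown on this page. The helper name cnu_param_counts is hypothetical.

```python
def cnu_param_counts(in_features, out_features, key_mem_units=2,
                     bias=True, shared_keys=True, key_size=None):
    """Return (key_scalars, memory_scalars) for a LinearCNU-like layer."""
    d = in_features if key_size is None else key_size   # size of each key
    per_neuron = in_features + (1 if bias else 0)       # weights (+ bias) of one neuron
    if shared_keys:
        q, u = 1, out_features * per_neuron             # one CNU, concatenated memory
    else:
        q, u = out_features, per_neuron                 # one CNU per output neuron
    keys = q * key_mem_units * d       # ASSUMPTION: keys are laid out as [q, m, d]
    memories = q * key_mem_units * u   # memory tensor M has shape [q, m, u]
    return keys, memories


shared = cnu_param_counts(128, 64, shared_keys=True)
independent = cnu_param_counts(128, 64, shared_keys=False)
print(shared, independent)
```

Under these assumptions the number of memory scalars is identical in both modes; only the number of key scalars grows, by a factor of out_features, when keys are independent.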

Parameters that control the underlying CNU dynamics (delta, gamma_alpha, tau_alpha, upd_m, upd_k, etc.) are forwarded transparently through **kwargs to the CNUs constructor. The CNUs arguments q, d, m, and u are reserved and must not appear in **kwargs, because they are derived automatically from in_features, out_features, bias, shared_keys, key_mem_units, and key_size.
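
A minimal sketch of the reserved-argument check, assuming it behaves like the assert statements in the constructor listing on this page. The helper check_reserved is hypothetical and is not part of the module; the value passed for delta is illustrative only.

```python
RESERVED = ("q", "d", "m", "u")


def check_reserved(**kwargs):
    """Reject CNUs arguments that the layer derives on its own."""
    for name in RESERVED:
        # Mirrors the assert-based validation shown in the constructor source
        assert name not in kwargs, f"do not set argument '{name}'"
    return kwargs


# Non-reserved options pass through untouched (the value here is illustrative only)
forwarded = check_reserved(delta=1)

# A reserved name is rejected
try:
    check_reserved(m=4)
    rejected = False
except AssertionError:
    rejected = True
print(forwarded, rejected)
```

Because the validation relies on assert, it is skipped when Python runs with the -O flag; in that case a reserved key passed by the caller would be silently overwritten by the derived value.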

Attributes:

Name Type Description
in_features

Number of input features, stored after construction.

out_features

Number of output features, stored after construction.

bias

True when a bias term is included; None when no bias is used (note that the attribute is overwritten from bool to None when bias=False).

shared_keys

Whether all output neurons share the same key bank.

Examples:

>>> import torch
>>> from unaiverse.modules.cnu.layers import LinearCNU
>>>
>>> # Drop-in replacement for nn.Linear(128, 64)
>>> layer = LinearCNU(in_features=128, out_features=64, key_mem_units=4)
>>> x = torch.randn(32, 128)
>>> y = layer(x)
>>> y.shape
torch.Size([32, 64])
>>>
>>> # Independent keys, no bias, explicit key size
>>> layer2 = LinearCNU(128, 64, bias=False, shared_keys=False, key_size=32)
>>> layer2(x).shape
torch.Size([32, 64])

Initialize a CNU-based linear layer with the given dimensions and CNU settings.

The constructor validates that the reserved CNU arguments (q, d, m, u) are not present in **kwargs, then derives their values automatically from in_features, out_features, and key_mem_units. It delegates the actual parameter creation to the CNUs parent constructor, moves the module to device when specified, and sets self.bias to None when bias=False so that the forward pass can branch on a simple truthiness check.

Parameters:

Name Type Description Default
in_features

Number of input features (the size of each input sample).

required
out_features

Number of output features (the size of each output sample).

required
bias

If True, a learnable bias is included in each memory unit. Defaults to True.

True
device

Target device for the module's parameters (e.g. "cuda:0"). If None, the default PyTorch device is used. Defaults to None.

None
shared_keys

If True, all output neurons share a single CNU with concatenated memory. If False, each neuron has independent keys and memory units. Defaults to True.

True
key_mem_units

Number of keys and memory units in each CNU (the m parameter of CNUs). Defaults to 2.

2
psi_fn

Name of the function used to project the input onto the key space. 'identity' passes the flat input vector directly. Defaults to 'identity'.

'identity'
key_size

Dimensionality of each key vector. If None, it defaults to in_features. Defaults to None.

None
**kwargs

Additional keyword arguments forwarded to CNUs.__init__ (e.g. delta, gamma_alpha, upd_m, upd_k, scramble). The keys 'q', 'd', 'm', and 'u' are reserved and must not appear here.

{}

Raises:

Type Description
AssertionError

If any of the reserved kwargs 'q', 'd', 'm', or 'u' are present in **kwargs.

Source code in unaiverse/modules/cnu/layers.py
def __init__(self, in_features, out_features, bias=True, device=None,
             shared_keys=True, key_mem_units=2, psi_fn='identity', key_size=None, **kwargs):
    """Initialize a CNU-based linear layer with the given dimensions and CNU settings.

    The constructor validates that the reserved CNU arguments (``q``, ``d``, ``m``,
    ``u``) are not present in ``**kwargs``, then derives their values automatically
    from ``in_features``, ``out_features``, and ``key_mem_units``. It delegates the
    actual parameter creation to the ``CNUs`` parent constructor, moves the module to
    ``device`` when specified, and sets ``self.bias`` to ``None`` when ``bias=False``
    so that the forward pass can branch on a simple truthiness check.

    Args:
        in_features: Number of input features (the size of each input sample).
        out_features: Number of output features (the size of each output sample).
        bias: If ``True``, a learnable bias is included in each memory unit.
            Defaults to ``True``.
        device: Target device for the module's parameters (e.g. ``"cuda:0"``).
            If ``None``, the default PyTorch device is used. Defaults to ``None``.
        shared_keys: If ``True``, all output neurons share a single CNU with
            concatenated memory. If ``False``, each neuron has independent keys and
            memory units. Defaults to ``True``.
        key_mem_units: Number of keys and memory units in each CNU (the ``m``
            parameter of ``CNUs``). Defaults to ``2``.
        psi_fn: Name of the function used to project the input onto the key space.
            ``'identity'`` passes the flat input vector directly. Defaults to
            ``'identity'``.
        key_size: Dimensionality of each key vector. If ``None``, it defaults to
            ``in_features``. Defaults to ``None``.
        **kwargs: Additional keyword arguments forwarded to ``CNUs.__init__`` (e.g.
            ``delta``, ``gamma_alpha``, ``upd_m``, ``upd_k``, ``scramble``). The
            keys ``'q'``, ``'d'``, ``'m'``, and ``'u'`` are reserved and must not
            appear here.

    Raises:
        AssertionError: If any of the reserved kwargs ``'q'``, ``'d'``, ``'m'``, or
            ``'u'`` are present in ``**kwargs``.
    """
    self.in_features = in_features
    self.out_features = out_features
    self.bias = bias
    self.shared_keys = shared_keys

    if kwargs is not None:
        assert 'q' not in kwargs, "The number of CNUs is automatically determined, do not set argument 'q'"
        assert 'd' not in kwargs, "The size of each key can be specified with argument 'key_size', " \
                                  "do not set argument 'd'"
        assert 'm' not in kwargs, "The number of keys and memory units can be specified with argument " \
                                  "'key_mem_units', do not set argument 'm'"
        assert 'u' not in kwargs, "Size of each memory unit is automatically determined, do not set argument 'u'"

    # Number of keys/memory units
    kwargs['m'] = key_mem_units

    # Size of each key
    kwargs['d'] = in_features if key_size is None else key_size

    # Function used to compare input against keys
    kwargs['psi_fn'] = psi_fn

    if not shared_keys:

        # Each neuron is an independent cnu, with its own keys and its own memory units
        kwargs['q'] = self.out_features
        kwargs['u'] = self.in_features + (1 if self.bias else 0)
    else:

        # All the CNUs of the layer share the same keys, thus their memory units are concatenated
        kwargs['q'] = 1
        kwargs['u'] = self.out_features * (self.in_features + (1 if self.bias else 0))

    # Creating neurons
    super(LinearCNU, self).__init__(**kwargs)

    # Switching device
    if device is not None:
        self.to(device)

    # Clearing
    if not self.bias:
        self.bias = None

in_features instance-attribute

in_features = in_features

out_features instance-attribute

out_features = out_features

bias instance-attribute

bias = bias

shared_keys instance-attribute

shared_keys = shared_keys

forward

forward(x)

Compute the CNU-based linear projection of the input.

The CNU bank first retrieves and blends memory units to produce a batch of sample-specific weight tensors W of shape [batch, out_features, in_features (+ 1 if bias)]. The tensor is split into weights and (optional) biases, and the projection is computed as a batched matrix-vector product identical to torch.nn.Linear - except that the effective weight matrix is different for each sample in the batch.
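
The reshape, split, and per-sample product described above can be sketched without PyTorch. This is only an illustration on nested lists: the flat per-sample weight rows are hand-written stand-ins for what compute_weights would return, and per_sample_linear is a hypothetical helper.

```python
def per_sample_linear(W_flat, x, out_features, bias=True):
    """Apply a different [out_features, in_features (+1)] matrix to each sample.

    W_flat: one flat weight row per sample, laid out neuron by neuron as
            [w_0, ..., w_{in-1}, b] when bias is used.
    x:      one input row per sample.
    """
    outputs = []
    for w_row, x_row in zip(W_flat, x):
        cols = len(w_row) // out_features              # in_features (+ 1 if bias)
        rows = [w_row[r * cols:(r + 1) * cols] for r in range(out_features)]
        y = []
        for row in rows:
            # Last column holds the bias, the rest are the weights
            weights, b = (row[:-1], row[-1]) if bias else (row, 0.0)
            y.append(sum(w * v for w, v in zip(weights, x_row)) + b)
        outputs.append(y)
    return outputs


# Two samples with the SAME input but different generated weights
W_flat = [
    [1.0, 0.0, 0.5,   0.0, 1.0, -0.5],   # sample 0: rows [1, 0 | 0.5] and [0, 1 | -0.5]
    [0.0, 1.0, 0.0,   1.0, 0.0, 0.0],    # sample 1: rows [0, 1 | 0]   and [1, 0 | 0]
]
x = [[2.0, 3.0], [2.0, 3.0]]
y = per_sample_linear(W_flat, x, out_features=2)
print(y)  # [[2.5, 2.5], [3.0, 2.0]]
```

Identical inputs produce different outputs here only because each sample was given its own weight matrix, which is the behaviour that distinguishes this layer from torch.nn.Linear.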

Parameters:

Name Type Description Default
x

Input tensor of shape [batch, in_features].

required

Returns:

Type Description

Output tensor of shape [batch, out_features].

Source code in unaiverse/modules/cnu/layers.py
def forward(self, x):
    """Compute the CNU-based linear projection of the input.

    The CNU bank first retrieves and blends memory units to produce a batch of
    sample-specific weight tensors ``W`` of shape
    ``[batch, out_features, in_features (+ 1 if bias)]``. The tensor is split into
    weights and (optional) biases, and the projection is computed as a batched
    matrix-vector product identical to ``torch.nn.Linear`` - except that the
    effective weight matrix is different for each sample in the batch.

    Args:
        x: Input tensor of shape ``[batch, in_features]``.

    Returns:
        Output tensor of shape ``[batch, out_features]``.
    """
    # Getting weights
    W = self.compute_weights(x)

    # Ensuring the shape is right (needed when neurons share the same keys)
    W = W.reshape((x.shape[0], self.out_features, -1))  # [b,q,1] => [b, out_features,(in_features + 1-if-bias)]

    # Splitting into weights and biases
    if self.bias:
        weights = W[:, :, :-1]  # [b,out_features,in_features]
        bias = W[:, :, -1]  # [b,out_features]
    else:
        weights = W  # [b,out_features,in_features]
        bias = None

    # Batched linear projection: matmul([b,out_features,in_features], [b,in_features,1]) = [b,out_features,1]
    # that we squeeze to [b,out_features]
    o = torch.matmul(weights, x.unsqueeze(2)).squeeze(2)  # [b,out_features]
    if bias is not None:
        o += bias
    return o

reset_parameters

reset_parameters()

Reset the layer's keys and memory units using the PyTorch nn.Linear convention.

Memory units are initialized to replicate the standard torch.nn.Linear initialization: weights are drawn with Kaiming uniform (fan-in mode, with a=math.sqrt(5) as the negative-slope argument), and biases are initialized uniformly in [-1/sqrt(in_features), 1/sqrt(in_features)]. Each memory unit stores a flattened concatenation of the corresponding weight matrix and bias vector.

reset_memories is temporarily set to False before calling the parent's reset_parameters so that the parent does not overwrite the memories with its own default initialization scheme; this method then fills the memories directly with the PyTorch-compatible values.

Note

Keys are reset by the parent CNUs.reset_parameters call. Only the memory tensors M are overridden here.
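
As a worked check of the initialization ranges, the sketch below recomputes the Kaiming-uniform bound by hand, using the leaky-ReLU gain formula that PyTorch applies for the argument a. The function name kaiming_uniform_bound is hypothetical; the formula is gain = sqrt(2 / (1 + a^2)), std = gain / sqrt(fan_in), bound = sqrt(3) * std.

```python
import math


def kaiming_uniform_bound(fan_in, a):
    """Half-width of the uniform range used by Kaiming uniform in fan-in mode."""
    gain = math.sqrt(2.0 / (1.0 + a ** 2))   # leaky-ReLU gain for negative slope a
    std = gain / math.sqrt(fan_in)
    return math.sqrt(3.0) * std


in_features = 128
weight_bound = kaiming_uniform_bound(in_features, a=math.sqrt(5))
bias_bound = 1 / math.sqrt(in_features)
print(weight_bound, bias_bound)
```

With a = sqrt(5) the gain is sqrt(1/3), so the weight bound simplifies to 1/sqrt(in_features): weights and biases in every memory unit are drawn from the same interval.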

Source code in unaiverse/modules/cnu/layers.py
def reset_parameters(self):
    """Reset the layer's keys and memory units using the PyTorch ``nn.Linear`` convention.

    Memory units are initialized to replicate the standard ``torch.nn.Linear``
    initialization: weights are drawn with Kaiming uniform (fan-in mode, with
    ``a=math.sqrt(5)`` as the negative-slope argument), and biases are initialized
    uniformly in ``[-1/sqrt(in_features), 1/sqrt(in_features)]``. Each memory unit
    stores a flattened concatenation of the corresponding weight matrix and bias
    vector.

    ``reset_memories`` is temporarily set to ``False`` before calling the parent's
    ``reset_parameters`` so that the parent does not overwrite the memories with its
    own default initialization scheme; this method then fills the memories directly
    with the PyTorch-compatible values.

    Note:
        Keys are reset by the parent ``CNUs.reset_parameters`` call. Only the memory
        tensors ``M`` are overridden here.
    """
    self.reset_memories = False
    super().reset_parameters()

    # We ensure that memories M are initialized as Pytorch does for the classic linear layer
    q = self.M.shape[0]
    m = self.M.shape[1]
    self.M.data.zero_()  # Ensures we don't keep old values

    for j in range(q):
        for i in range(m):

            # Initialize weight and bias separately for each memory
            weight = torch.empty(self.out_features if self.shared_keys else 1, self.in_features)
            torch.nn.init.kaiming_uniform_(weight, a=math.sqrt(5))  # Computes fan in

            if self.bias:
                bias = torch.empty(self.out_features if self.shared_keys else 1)
                bound = 1 / math.sqrt(self.in_features)
                torch.nn.init.uniform_(bias, -bound, bound)
                weight_bias = torch.cat([weight, bias.unsqueeze(1)], dim=1)
            else:
                weight_bias = weight

            # Store the flattened weight_bias into self.M[j, i]
            self.M.data[j, i, :] = weight_bias.flatten()

Conv2d

Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, padding_mode='zeros', dilation=1, groups=1, bias=True, device=None, shared_keys=True, key_mem_units=2, psi_fn='reduce2d', key_size=None, **kwargs)

Bases: CNUs

A CNU-based drop-in replacement for torch.nn.Conv2d.

Conv2d replaces the fixed convolutional filter bank of a standard 2-D convolution with a bank of Contextual Neural Units. For each sample in the batch the CNU bank retrieves and blends memory units to produce a sample-specific set of convolutional filters. The resulting filters are applied to the input using a grouped convolution so that every sample in the batch gets its own distinct filter set, while the overall computation remains vectorised on GPU.

The constructor interface mirrors torch.nn.Conv2d as closely as possible, with the following additional parameters for CNU control: shared_keys, key_mem_units, psi_fn, and key_size. All other CNU hyperparameters (delta, gamma_alpha, upd_m, upd_k, scramble, etc.) can be supplied through **kwargs.

The default psi_fn='reduce2d' spatially downsamples the input feature map to a key-sized representation before comparing it against stored keys. This is appropriate for image inputs; use 'identity' or another mode when the input has a non-spatial structure.

Two key-sharing modes are supported (see LinearCNU for the full description). The default is shared_keys=True.

Attributes:

Name Type Description
in_channels

Number of input channels.

out_channels

Number of output channels (filters).

kernel_size

Kernel size as a (H, W) tuple, normalised from the constructor argument.

stride

Stride as a (H, W) tuple.

padding

Padding as a (H, W) tuple (an int passed to the constructor is expanded to a pair), or the original string when a string such as 'same' was passed.

padding_mode

One of 'zeros', 'reflect', 'replicate', or 'circular'.

dilation

Dilation as a (H, W) tuple.

groups

Number of blocked connections from input channels to output channels.

bias

True when a learnable bias is included; False otherwise.

in_features

Number of scalar values in a single receptive field, equal to kernel_H * kernel_W * in_channels.

Examples:

>>> import torch
>>> from unaiverse.modules.cnu.layers import Conv2d
>>>
>>> # Drop-in replacement for nn.Conv2d(3, 16, kernel_size=3, padding=1)
>>> layer = Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)
>>> x = torch.randn(8, 3, 32, 32)
>>> y = layer(x)
>>> y.shape
torch.Size([8, 16, 32, 32])
>>>
>>> # With strided convolution and independent keys
>>> layer2 = Conv2d(3, 16, kernel_size=5, stride=2, shared_keys=False)
>>> layer2(x).shape
torch.Size([8, 16, 14, 14])

Initialize a CNU-based 2-D convolutional layer.

Scalar arguments kernel_size, stride, and dilation are normalised to (H, W) tuples. Padding is pre-computed into a reversed repeated form that is compatible with F.pad for non-zero padding modes. The reserved CNU arguments q, d, m, and u are derived automatically and must not appear in **kwargs. After the CNUs parent is constructed the module is optionally moved to device.
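
The normalisation and the padding pre-computation can be sketched in plain Python. The logic follows the constructor listing on this page; the helper names to_pair and reversed_padding_repeated_twice are hypothetical stand-ins, and this sketch always returns a list, whereas the listing stores a tuple for non-string padding.

```python
def to_pair(value):
    """Expand a scalar to (value, value); keep pairs as they are."""
    return tuple(value) if isinstance(value, (tuple, list)) else (value, value)


def reversed_padding_repeated_twice(padding, kernel_size, dilation=1):
    """Padding in F.pad order: (W_left, W_right, H_top, H_bottom)."""
    kernel_size, dilation = to_pair(kernel_size), to_pair(dilation)
    if isinstance(padding, str):
        out = [0, 0] * len(kernel_size)
        if padding == 'same':
            # Walk the dimensions forward while filling the output backward
            for d, k, i in zip(dilation, kernel_size,
                               range(len(kernel_size) - 1, -1, -1)):
                total = d * (k - 1)
                left = total // 2
                out[2 * i] = left
                out[2 * i + 1] = total - left   # odd totals put the extra on the right
        return out
    # Numeric padding: reverse (H, W) to (W, H) and repeat each entry twice
    return [p for p in reversed(to_pair(padding)) for _ in range(2)]


print(reversed_padding_repeated_twice(1, 3))            # [1, 1, 1, 1]
print(reversed_padding_repeated_twice((1, 2), 3))       # [2, 2, 1, 1]
print(reversed_padding_repeated_twice('same', (3, 5)))  # [2, 2, 1, 1]
print(reversed_padding_repeated_twice('same', 4))       # [1, 2, 1, 2]
```

The reversal is needed because F.pad lists padding starting from the last dimension, so the width entries come before the height entries.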

Parameters:

Name Type Description Default
in_channels

Number of channels in the input image.

required
out_channels

Number of channels produced by the convolution.

required
kernel_size

Size of the convolving kernel. An int is expanded to a square kernel (kernel_size, kernel_size).

required
stride

Stride of the convolution. Defaults to 1.

1
padding

Zero-padding added to both sides of the input. Accepts an int, a (H, W) tuple, or the string 'same' for same-sized output. Defaults to 0.

0
padding_mode

Padding strategy. Must be one of 'zeros', 'reflect', 'replicate', or 'circular'. Defaults to 'zeros'.

'zeros'
dilation

Spacing between kernel elements. Defaults to 1.

1
groups

Number of blocked connections from in_channels to out_channels. Defaults to 1.

1
bias

If True, adds a learnable bias to the output. Defaults to True.

True
device

Target device for the module's parameters. If None, the default PyTorch device is used. Defaults to None.

None
shared_keys

If True, all output channels share a single CNU with concatenated memory. If False, each output channel has independent keys and memory units. Defaults to True.

True
key_mem_units

Number of keys and memory units per CNU (the m parameter of CNUs). Defaults to 2.

2
psi_fn

Name of the function used to project the input feature map onto the key space. 'reduce2d' spatially downsamples the map before comparing against keys. Defaults to 'reduce2d'.

'reduce2d'
key_size

Total dimensionality of each key vector. For 'reduce2d', this is a spatial size (in scalar units). If None, it defaults to 5 * 5 * in_channels. Accepts an int or an (H, W) tuple/list (which is converted to its product). Defaults to None.

None
**kwargs

Additional keyword arguments forwarded to CNUs.__init__ (e.g. delta, gamma_alpha, upd_m, upd_k, scramble). The keys 'q', 'd', 'm', and 'u' are reserved and must not appear here.

{}
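
The key_size rules in the table above can be condensed into a short sketch that follows the constructor listing on this page. The function resolve_key_size is a hypothetical helper, not part of the module.

```python
import math


def resolve_key_size(key_size, in_channels):
    """Return the key dimensionality d that is handed to CNUs."""
    if key_size is None:
        return 5 * 5 * in_channels        # default: a 5x5 map per input channel
    if isinstance(key_size, (tuple, list)):
        return math.prod(key_size)        # (H, W) collapses to H * W
    return key_size                       # an int is used as-is


print(resolve_key_size(None, in_channels=3))     # 75
print(resolve_key_size((4, 4), in_channels=3))   # 16
print(resolve_key_size(32, in_channels=3))       # 32
```

Note that an explicit (H, W) pair is not multiplied by in_channels; only the default includes the channel count.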

Raises:

Type Description
ValueError

If padding_mode is not one of the four supported values.

AssertionError

If any of the reserved kwargs 'q', 'd', 'm', or 'u' are present in **kwargs.

Source code in unaiverse/modules/cnu/layers.py
def __init__(self, in_channels, out_channels, kernel_size, stride=1, padding=0, padding_mode='zeros',
             dilation=1, groups=1, bias=True, device=None,
             shared_keys=True, key_mem_units=2, psi_fn='reduce2d', key_size=None, **kwargs):
    """Initialize a CNU-based 2-D convolutional layer.

    Scalar arguments ``kernel_size``, ``stride``, and ``dilation`` are normalised to
    ``(H, W)`` tuples. Padding is pre-computed into a reversed repeated form that is
    compatible with ``F.pad`` for non-zero padding modes. The reserved CNU arguments
    ``q``, ``d``, ``m``, and ``u`` are derived automatically and must not appear in
    ``**kwargs``. After the ``CNUs`` parent is constructed the module is optionally
    moved to ``device``.

    Args:
        in_channels: Number of channels in the input image.
        out_channels: Number of channels produced by the convolution.
        kernel_size: Size of the convolving kernel. An ``int`` is expanded to a
            square kernel ``(kernel_size, kernel_size)``.
        stride: Stride of the convolution. Defaults to ``1``.
        padding: Zero-padding added to both sides of the input. Accepts an ``int``,
            a ``(H, W)`` tuple, or the string ``'same'`` for same-sized output.
            Defaults to ``0``.
        padding_mode: Padding strategy. Must be one of ``'zeros'``, ``'reflect'``,
            ``'replicate'``, or ``'circular'``. Defaults to ``'zeros'``.
        dilation: Spacing between kernel elements. Defaults to ``1``.
        groups: Number of blocked connections from ``in_channels`` to
            ``out_channels``. Defaults to ``1``.
        bias: If ``True``, adds a learnable bias to the output. Defaults to ``True``.
        device: Target device for the module's parameters. If ``None``, the default
            PyTorch device is used. Defaults to ``None``.
        shared_keys: If ``True``, all output channels share a single CNU with
            concatenated memory. If ``False``, each output channel has independent
            keys and memory units. Defaults to ``True``.
        key_mem_units: Number of keys and memory units per CNU (the ``m`` parameter
            of ``CNUs``). Defaults to ``2``.
        psi_fn: Name of the function used to project the input feature map onto the
            key space. ``'reduce2d'`` spatially downsamples the map before
            comparing against keys. Defaults to ``'reduce2d'``.
        key_size: Total dimensionality of each key vector. For ``'reduce2d'``, this
            is a spatial size (in scalar units). If ``None``, it defaults to
            ``5 * 5 * in_channels``. Accepts an ``int`` or an ``(H, W)``
            tuple/list (which is converted to its product). Defaults to ``None``.
        **kwargs: Additional keyword arguments forwarded to ``CNUs.__init__`` (e.g.
            ``delta``, ``gamma_alpha``, ``upd_m``, ``upd_k``, ``scramble``). The
            keys ``'q'``, ``'d'``, ``'m'``, and ``'u'`` are reserved and must not
            appear here.

    Raises:
        ValueError: If ``padding_mode`` is not one of the four supported values.
        AssertionError: If any of the reserved kwargs ``'q'``, ``'d'``, ``'m'``, or
            ``'u'`` are present in ``**kwargs``.
    """
    self.in_channels = in_channels
    self.out_channels = out_channels
    self.kernel_size = kernel_size if isinstance(kernel_size, Iterable) else (kernel_size, kernel_size)
    self.stride = stride if isinstance(stride, Iterable) else (stride, stride)
    self.padding = padding
    self.padding_mode = padding_mode
    self.dilation = dilation if isinstance(dilation, Iterable) else (dilation, dilation)
    self.groups = groups
    self.bias = bias
    self.in_features = math.prod(self.kernel_size) * self.in_channels

    valid_padding_modes = {'zeros', 'reflect', 'replicate', 'circular'}
    if padding_mode not in valid_padding_modes:
        raise ValueError("padding_mode must be one of {}, but got padding_mode='{}'".format(valid_padding_modes,
                                                                                            padding_mode))
    if isinstance(padding, str):
        self.__reversed_padding_repeated_twice = [0, 0] * len(self.kernel_size)
        if padding == 'same':
            for d, k, i in zip(self.dilation, self.kernel_size,
                               range(len(self.kernel_size) - 1, -1, -1)):
                total_padding = d * (k - 1)
                left_pad = total_padding // 2
                self.__reversed_padding_repeated_twice[2 * i] = left_pad
                self.__reversed_padding_repeated_twice[2 * i + 1] = (total_padding - left_pad)
    else:
        self.padding = padding if isinstance(padding, Iterable) else (padding, padding)
        self.__reversed_padding_repeated_twice = tuple(x for x in reversed(self.padding) for _ in range(2))

    if kwargs is not None:
        assert 'q' not in kwargs, "The number of CNUs is automatically determined, do not set argument 'q'"
        assert 'd' not in kwargs, "The size of each key can be specified with argument 'key_size', " \
                                  "do not set argument 'd'"
        assert 'm' not in kwargs, "The number of keys and memory units can be specified with argument " \
                                  "'key_mem_units', do not set argument 'm'"
        assert 'u' not in kwargs, "Size of each memory unit is automatically determined, do not set argument 'u'"

    # Number of keys/memory units
    kwargs['m'] = key_mem_units

    # Size of each key
    if key_size is not None:
        if isinstance(key_size, (tuple, list)):
            key_size = math.prod(key_size)
        kwargs['d'] = key_size
    else:
        kwargs['d'] = (5 * 5 * self.in_channels)

    # Function used to compare input against keys
    kwargs['psi_fn'] = psi_fn

    if not shared_keys:

        # Each neuron is an independent cnu, with its own keys and its own memory units
        kwargs['q'] = self.out_channels
        kwargs['u'] = self.in_features + (1 if self.bias else 0)
    else:

        # All the CNUs of the layer share the same keys, thus their memory units are concatenated
        kwargs['q'] = 1
        kwargs['u'] = self.out_channels * (self.in_features + (1 if self.bias else 0))

    # Creating neurons
    super(Conv2d, self).__init__(**kwargs)

    # Switching device
    if device is not None:
        self.to(device)

in_channels instance-attribute

in_channels = in_channels

out_channels instance-attribute

out_channels = out_channels

kernel_size instance-attribute

kernel_size = kernel_size if isinstance(kernel_size, Iterable) else (kernel_size, kernel_size)

stride instance-attribute

stride = stride if isinstance(stride, Iterable) else (stride, stride)

padding instance-attribute

padding = padding

padding_mode instance-attribute

padding_mode = padding_mode

dilation instance-attribute

dilation = dilation if isinstance(dilation, Iterable) else (dilation, dilation)

groups instance-attribute

groups = groups

bias instance-attribute

bias = bias

in_features instance-attribute

in_features = prod(kernel_size) * in_channels

forward

forward(x)

Compute the CNU-based 2-D convolution of the input.

For each sample in the batch the CNU bank retrieves and blends memory units to produce a set of sample-specific convolutional filters. The filters are reshaped into a standard [out_channels * batch, in_channels_per_group, kernel_H, kernel_W] filter tensor, and a single grouped F.conv2d call applies them to all samples simultaneously by stacking all images along the channel dimension and using groups=batch * self.groups.

Non-zero padding modes (reflect, replicate, circular) are handled by an explicit F.pad call before the convolution; 'zeros' padding is passed directly to F.conv2d.
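
The shape bookkeeping behind the single grouped convolution can be sketched as follows. It only tracks tensor shapes and checks the divisibility constraints that F.conv2d places on grouped convolutions; it does not reproduce the module's private filter-reshaping helper, and grouped_conv_shapes is a hypothetical name.

```python
def grouped_conv_shapes(batch, in_channels, out_channels, kernel_size, groups=1):
    """Shapes used when every sample gets its own filters in one conv2d call."""
    kh, kw = kernel_size
    # F.conv2d requires both channel counts to be divisible by the group count
    assert in_channels % groups == 0 and out_channels % groups == 0
    return {
        # Input viewed as one "image" whose channels stack all samples
        "stacked_input_channels": batch * in_channels,
        # One filter set per sample, concatenated along the output dimension
        "kernels": (batch * out_channels, in_channels // groups, kh, kw),
        # Each sample occupies its own block of groups
        "groups": batch * groups,
        # Per-sample biases flattened to one value per stacked output channel
        "bias": batch * out_channels,
    }


shapes = grouped_conv_shapes(batch=8, in_channels=3, out_channels=16, kernel_size=(3, 3))
print(shapes)
```

With groups equal to batch * self.groups, the channels of sample i are only ever convolved with the filters generated for sample i, and the final view back to [batch, out_channels, H_out, W_out] separates the samples again.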

Parameters:

Name Type Description Default
x

Input tensor of shape [batch, in_channels, H, W].

required

Returns:

Type Description

Output tensor of shape [batch, out_channels, H_out, W_out], where H_out and W_out are determined by the kernel size, stride, padding, and dilation according to the standard torch.nn.Conv2d formula.
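
For reference, the standard torch.nn.Conv2d output-size formula for numeric padding is floor((size + 2 * padding - dilation * (kernel - 1) - 1) / stride) + 1 per spatial dimension. The sketch below applies it to the two examples from the class docstring; conv_out_size is a hypothetical helper.

```python
def conv_out_size(size, kernel, stride=1, padding=0, dilation=1):
    """Output length along one spatial dimension (numeric padding only)."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


print(conv_out_size(32, kernel=3, padding=1))   # 32
print(conv_out_size(32, kernel=5, stride=2))    # 14
```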

Source code in unaiverse/modules/cnu/layers.py
def forward(self, x):
    """Compute the CNU-based 2-D convolution of the input.

    For each sample in the batch the CNU bank retrieves and blends memory units to
    produce a set of sample-specific convolutional filters. The filters are
    reshaped into a standard ``[out_channels * batch, in_channels_per_group,
    kernel_H, kernel_W]`` filter tensor, and a single grouped ``F.conv2d`` call
    applies them to all samples simultaneously by stacking all images along the
    channel dimension and using ``groups=batch * self.groups``.

    Non-zero padding modes (``reflect``, ``replicate``, ``circular``) are handled by
    an explicit ``F.pad`` call before the convolution; ``'zeros'`` padding is passed
    directly to ``F.conv2d``.

    Args:
        x: Input tensor of shape ``[batch, in_channels, H, W]``.

    Returns:
        Output tensor of shape
        ``[batch, out_channels, H_out, W_out]``, where ``H_out`` and ``W_out``
        are determined by the kernel size, stride, padding, and dilation according
        to the standard ``torch.nn.Conv2d`` formula.
    """
    # Shortcuts
    b, c, h, w = x.shape

    # Getting weights
    W = self.compute_weights(x)

    # Ensuring the shape is right (needed when neurons share the same keys)
    W = W.reshape((b, self.out_channels, -1))  # [b,q,1] => [b,out_channels,(in_features + 1-if-bias)]

    # Splitting into weights and biases
    if self.bias:
        weights = W[:, :, :-1]  # [b,out_channels,in_features]
        bias = W[:, :, -1]  # [b,out_channels]
    else:
        weights = W  # [b,out_channels,in_features]
        bias = None

    # Creating tensor with convolutional filters
    kernels = self.__mat2filters(weights)

    # Stack all images along the channels
    x = x.view(1, b * c, h, w)

    # Convolution
    if self.padding_mode != 'zeros':
        x = F.conv2d(F.pad(x, self.__reversed_padding_repeated_twice, mode=self.padding_mode),
                     kernels, bias.flatten() if bias is not None else None, self.stride,
                     (0, 0), self.dilation, groups=(b * self.groups))
    else:
        x = F.conv2d(x, kernels, bias.flatten() if bias is not None else None, self.stride,
                     self.padding, self.dilation, groups=(b * self.groups))

    return x.view(b, self.out_channels, x.shape[2], x.shape[3])