lncrnapy.modules

BERT

Contains base architectures (without output layers) of different neural network designs.

class lncrnapy.modules.bert.BERT(vocab_size, d_model=768, N=12, d_ff=None, h=None, dropout=0.1)

BERT base model: a transformer encoder that can be used for various tasks.

References

  • Transformer: Vaswani et al. (2017) https://doi.org/10.48550/arXiv.1706.03762

  • Code: Huang et al. (2022) https://nlp.seas.harvard.edu/annotated-transformer

  • BERT: Devlin et al. (2019) https://doi.org/10.48550/arXiv.1810.04805

  • MycoAI: Romeijn et al. (2024) https://doi.org/10.1111/1755-0998.14006

forward(src)

Given a source sequence, retrieves its encoded representation.

freeze()

Freezes all weights of BERT model.

unfreeze()

Unfreezes all weights of BERT model.
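
A minimal usage sketch is shown below. The vocabulary size and the (batch, length) integer-token input are illustrative assumptions, not values prescribed by the API:

    import torch
    from lncrnapy.modules.bert import BERT

    model = BERT(vocab_size=260)             # hypothetical vocabulary size
    src = torch.randint(0, 260, (8, 512))    # assumed batch of token ids, shape (batch, length)
    embeddings = model(src)                  # encoded representation of the input
    model.freeze()                           # e.g. keep BERT weights fixed while training a new output layer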

class lncrnapy.modules.bert.CSEBERT(n_kernels, kernel_size=9, d_model=768, N=12, d_ff=None, h=None, dropout=0.1, input_linear=None, input_relu=True, n_hidden_kernels=0)

BERT variant that takes learnt convolutional sequence encodings (instead of tokens) as input. Based on the Vision Transformer (ViT).

References

ViT: Dosovitskiy et al. (2020) https://doi.org/10.48550/arXiv.2010.11929

forward(src)

Given a source sequence, retrieves its encoded representation.

freeze()

Freezes all weights of CSEBERT model.

freeze_kernels()

Freezes all CSE weights of CSEBERT model.

unfreeze()

Unfreezes all weights of CSEBERT model.
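
A construction sketch; the kernel count and size below are placeholders rather than recommended values, and the sequence input format expected by forward is not repeated here:

    from lncrnapy.modules.bert import CSEBERT

    model = CSEBERT(n_kernels=768, kernel_size=9)   # hypothetical kernel configuration
    model.freeze_kernels()                          # keep the learnt CSE kernels fixed, e.g. during fine-tuning
    model.unfreeze()                                # make all weights trainable again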

class lncrnapy.modules.bert.Decoder(d_model, d_ff, h, N, dropout, self_attention)

N layers, each consisting of (masked) (self-)attention and feed-forward sublayers, that gradually transform the encoder’s output and the output embedding into a decoding.

forward(x, m, src_mask, tgt_mask)

Pass the input (and mask) through each layer in turn.

class lncrnapy.modules.bert.Encoder(d_model, d_ff, h, N, dropout)

N layers, each consisting of self-attention and feed-forward sublayers, that gradually transform the input into an encoded representation.

forward(x, mask)

Pass the input (and mask) through each layer in turn.

class lncrnapy.modules.bert.FeedForward(d_model, d_ff, dropout)

Simple feed-forward network (with dropout applied to the middle layer).

forward(x)

Applies the feed-forward network to the input.

class lncrnapy.modules.bert.MultiHeadAttention(h, d_model, dropout)

Performs scaled dot-product attention on h uniquely learned linear projections (allowing the model to attend to information from different representation subspaces).

forward(query, key, value, mask=None)

Implements Figure 2 from ‘Attention Is All You Need’.

class lncrnapy.modules.bert.PositionalEncoding(d_model, dropout, max_len=5000)

Adds positional information to an inputted embedding.

forward(x)

Adds positional encodings to the input embedding and applies dropout.
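
A small sketch of applying the module, assuming input embeddings of shape (batch, length, d_model):

    import torch
    from lncrnapy.modules.bert import PositionalEncoding

    pos_enc = PositionalEncoding(d_model=768, dropout=0.1)
    x = torch.rand(8, 512, 768)   # assumed (batch, length, d_model) embeddings
    x = pos_enc(x)                # same shape, with positional information added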

class lncrnapy.modules.bert.ResidualConnection(size, dropout)

Employs a normalized residual connection followed by dropout.

forward(x, layer)

Adds layer(x) to x and applies normalization/dropout.

lncrnapy.modules.bert.attention(query, key, value, mask=None, dropout=None)

Computes scaled dot-product attention.
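
A sketch of calling this function directly, assuming it follows the Annotated Transformer convention of operating on (batch, heads, length, d_k) tensors and returning both the attended values and the attention weights:

    import torch
    from lncrnapy.modules.bert import attention

    q = k = v = torch.rand(2, 8, 16, 64)   # assumed (batch, heads, length, d_k)
    values, weights = attention(q, k, v)   # scaled dot-product attention (no mask, no dropout)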

CNN

CNN architectures that can be used as base architectures in lncrnapy.

class lncrnapy.modules.cnn.CSEResNet(n_kernels, kernel_size, layers)

Like ResNet, but with initial layers that correspond to convolutional sequence encoding (CSE).

forward(x)

Performs a forward pass through the network.

class lncrnapy.modules.cnn.MycoAICNN(kernel=5, conv_layers=[5, 10], in_channels=1, pool_size=2, batch_normalization=True)

A simple CNN architecture with conv, batchnorm and maxpool layers, as used by MycoAI-CNN.

References

MycoAI: Romeijn et al. (2024) https://doi.org/10.1111/1755-0998.14006

forward(x)

Performs a forward pass through the network.
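
A construction sketch using only the documented defaults; the expected input shape depends on the chosen sequence encoding and is not specified here:

    from lncrnapy.modules.cnn import MycoAICNN

    model = MycoAICNN(kernel=5, conv_layers=[5, 10], in_channels=1, pool_size=2, batch_normalization=True)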

class lncrnapy.modules.cnn.ResNet(layers, in_channels=4)

Adapted from: https://blog.paperspace.com/writing-resnet-from-scratch-in-pytorch/

forward(x)

Performs a forward pass through the network.
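
A construction sketch, assuming that layers is the list of residual blocks per stage, as in the tutorial this class was adapted from:

    from lncrnapy.modules.cnn import ResNet

    model = ResNet(layers=[3, 4, 6, 3], in_channels=4)   # assumed: number of blocks per stage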

class lncrnapy.modules.cnn.ResidualBlock(in_channels, out_channels, stride=1, downsample=None)

Adapted from: https://blog.paperspace.com/writing-resnet-from-scratch-in-pytorch/

forward(x)

Performs a forward pass through the residual block.

Convolutional Sequence Encoding

Contains modules related to convolutional sequence encoding, which uses a simple 1D convolutional neural network with learnt kernels to encode input sequences. This is similar to how Vision Transformers (ViTs) work.

References

ViT: Dosovitskiy et al. (2020) https://doi.org/10.48550/arXiv.2010.11929

class lncrnapy.modules.conv_seq_encoding.ConvSeqEmbedding(n_kernels, d_model, kernel_size=10, input_linear=True, input_relu=True, n_hidden_kernels=0)

Projects the convolutional sequence encoding into a space of pre-specified dimensionality.

forward(x)

Projects the convolutional sequence encoding of the input into the model’s embedding space.

class lncrnapy.modules.conv_seq_encoding.ConvSeqEncoding(n_kernels, kernel_size=10, input_relu=True, n_hidden_kernels=0)

Implementation for convolutional sequence encoding using a small 1D CNN.

forward(x)

Computes the convolutional sequence encoding of the input.

visualize(kernel_idx, filepath=None)

Visualizes the kernel indicated by kernel_idx. In case of hidden kernels, visualizes the first (hidden) kernel layer.
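
A sketch combining both modules; the kernel configuration and the output filename are placeholders:

    from lncrnapy.modules.conv_seq_encoding import ConvSeqEncoding, ConvSeqEmbedding

    cse = ConvSeqEncoding(n_kernels=768, kernel_size=10)
    cse.visualize(kernel_idx=0, filepath='kernel_0.png')       # plot the first (possibly hidden) kernel

    embedding = ConvSeqEmbedding(n_kernels=768, d_model=768)   # projects CSE output to d_model dimensions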

Wrappers

Contains wrapper classes that enhance a base architecture (which can be any PyTorch module) with additional requirements for various (pre-)training tasks from lncrnapy.

class lncrnapy.modules.wrappers.Classifier(*args, **kwargs)

Wrapper class that uses a base architecture to perform binary classification.

forward(X, return_logits=True)

A forward pass through the neural network.

predict(data, inplace=False, return_logits=False)

Calls forward in batch-wise fashion for all rows in data.

Parameters:
  • data (lncrnapy.data.Data) – Data object with tensor_features attribute.

  • inplace (bool) – If True, adds the predictions as a data column to data.

  • return_logits (bool) – If True, returns logits instead of probabilities.
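
A hedged sketch of wrapping a base architecture for classification; the constructor argument and the token-id input are assumptions based on the signatures above, not a verbatim recipe:

    import torch
    from lncrnapy.modules.bert import BERT
    from lncrnapy.modules.wrappers import Classifier

    base = BERT(vocab_size=260)              # hypothetical vocabulary size
    model = Classifier(base)                 # assumed: base architecture passed as first argument
    X = torch.randint(0, 260, (8, 512))      # assumed (batch, length) token ids
    logits = model(X, return_logits=True)    # forward pass through base architecture + output layer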

class lncrnapy.modules.wrappers.MaskedConvModel(*args, **kwargs)

Wrapper class for a model that performs Masked Language Modeling with CSE-encoded sequences as input.

forward(X)

A forward pass through the neural network.

class lncrnapy.modules.wrappers.MaskedTokenModel(*args, **kwargs)

Wrapper class for a model that performs Masked Language Modeling with tokenized sequences as input.

forward(X)

A forward pass through the neural network.

class lncrnapy.modules.wrappers.Regressor(*args, **kwargs)

Wrapper class for a model that performs linear regression on the base architecture’s embedding.

forward(X)

A forward pass through the neural network.

class lncrnapy.modules.wrappers.WrapperBase(*args, **kwargs)

Base class for all wrapper modules in lncrnapy.

`base_arch`

PyTorch module to be used as the base architecture of the wrapper.

Type:

torch.nn.Module

`pred_batch_size`

Batch size used by the predict method.

Type:

int

`data_columns`

Data column name for outcome of predict method.

Type:

list | str

`latent_space_columns`

Data column names for the latent space (only defined after calling the latent_space method).

Type:

list

forward(X)

A forward pass through the neural network.

latent_space(data, inplace=False, pooling=None, dim_red=TSNE())

Calculates latent representation for all rows in data.

Parameters:
  • data (lncrnapy.data.Data) – Data object for which latent space should be calculated.

  • inplace (bool) – If True, adds latent space as feature columns to data.

  • pooling (['CLS', 'max', 'mean', None]) – How to aggregate token embeddings (for BERT architectures):
    - ‘CLS’: use only the CLS token.
    - ‘max’: max pooling over (non-padding) token embeddings.
    - ‘mean’: mean pooling over (non-padding) token embeddings.
    - None (default): no pooling, e.g. for CNN base architectures.

  • dim_red (sklearn | NoneType) – Dimensionality reduction algorithm from sklearn to use.

predict(data, inplace=False, **kwargs)

Calls forward in batch-wise fashion for all rows in data.

Parameters:
  • data (lncrnapy.data.Data) – Data object with tensor_features attribute.

  • **kwargs – Any keyword argument accepted by the model’s forward method.
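
A sketch of combining latent_space and predict; constructing an lncrnapy.data.Data object is not covered in this section, so it is taken as given, and the pooling and dimensionality-reduction choices are examples only:

    from sklearn.decomposition import PCA
    from lncrnapy.data import Data   # assumed import path for the lncrnapy.data.Data type referenced above

    def embed_and_predict(model, data: Data):
        """Adds a 2D latent space to `data` (in place) and returns the model's predictions."""
        model.latent_space(data, inplace=True, pooling='mean', dim_red=PCA(n_components=2))
        return model.predict(data, inplace=False)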