src.jormungandr.embedder
Spatial and temporal positional embeddings for DETR-style models.
All embedders implement the Embedder protocol, exposing a
forward(shape, device, dtype, mask) interface so they can be swapped
without changing the calling code.
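As a rough illustration of that interchangeability, the sketch below defines a stand-in protocol with the same forward(shape, device, dtype, mask) signature and a toy embedder that satisfies it. Everything in it is an assumption for illustration: the protocol body, ZeroEmbedder, and embed are hypothetical, and plain Python lists stand in for torch tensors so the example runs without PyTorch.

```python
from typing import Any, Protocol, Sequence, runtime_checkable


@runtime_checkable
class Embedder(Protocol):
    """Hypothetical stand-in for the library's Embedder protocol."""

    def forward(self, shape: Sequence[int], device: Any, dtype: Any, mask: Any = None) -> Any:
        ...


class ZeroEmbedder:
    """Toy embedder (not part of the library): all-zero vectors per position."""

    def __init__(self, hidden_size: int = 4) -> None:
        self.hidden_size = hidden_size

    def forward(self, shape, device, dtype, mask=None):
        # shape is assumed to be (batch_size, channels, height, width)
        batch_size, _channels, height, width = shape
        return [
            [[0.0] * self.hidden_size for _ in range(height * width)]
            for _ in range(batch_size)
        ]


def embed(embedder: Embedder, shape: Sequence[int]):
    # The caller depends only on the protocol, not on a concrete class.
    return embedder.forward(shape, device="cpu", dtype=float)


out = embed(ZeroEmbedder(), (2, 3, 5, 7))
print(len(out), len(out[0]), len(out[0][0]))
```

Any object with a matching forward method can be passed to embed, which is the property the protocol is meant to provide.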
Classes:
- DetrLearnedPositionEmbedding – Learns positional embeddings up to a fixed maximum size.
- DetrSinePositionEmbedding – A more standard version of the position embedding, very similar to the one used in the "Attention Is All You Need" paper, generalized to work on images.
- TemporalSinePositionEmbedding – Generates temporal sine position embeddings.
DetrLearnedPositionEmbedding
DetrLearnedPositionEmbedding(embedding_dim=256)
Bases: Module, Embedder
This module learns positional embeddings up to a fixed maximum size.
Source code in src/jormungandr/embedder.py
DetrSinePositionEmbedding
DetrSinePositionEmbedding(num_position_features: int = 128, temperature: int = 10000, normalize: bool = True, scale: float | None = None)
Bases: Module, Embedder
This is a more standard version of the position embedding, very similar to the one used in the "Attention Is All You Need" paper, generalized to work on images.
Methods:
- forward – Compute the position embedding for feature maps of the given shape.
Source code in src/jormungandr/embedder.py
forward
forward(shape: Size, device: device | str, dtype: dtype, mask: Tensor | None = None) -> torch.Tensor
Parameters:
- shape (Size) – The shape of the feature maps for which to compute the position embedding, expected to be (batch_size, channels, height, width).
- device (device | str) – The device on which to create the position embedding.
- dtype (dtype) – The dtype of the position embedding.
- mask (Tensor | None, default: None) – An optional mask tensor of shape (batch_size, height, width) where True values indicate masked positions. If None, no positions are masked.
Returns: A position embedding tensor of shape (batch_size, sequence_length, hidden_size) where sequence_length is height * width and hidden_size is num_position_features * 2 (for sine and cosine components)
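The sketch below shows one way such an embedding can be computed for a single unmasked feature map. It follows the formulation of the original DETR sine embedding, not necessarily this module's code. Several things are assumptions: the function name, the default scale of 2π, the epsilon, row-major flattening, and the channel layout (a y part followed by an x part, each interleaving sine and cosine, so the factor of two here comes from concatenating the two axes). It uses only the standard library and returns nested lists of shape (sequence_length, hidden_size) for one sample.

```python
import math


def sine_position_embedding_2d(height, width, num_position_features=4,
                               temperature=10000.0, normalize=True, scale=None):
    """Hypothetical helper: sine embedding for one unmasked height x width grid."""
    if scale is None:
        scale = 2 * math.pi  # assumed default, as in the original DETR code
    eps = 1e-6
    # Channel pairs (2k, 2k+1) share the divisor temperature^(2k / num_position_features).
    dim_t = [temperature ** (2 * (i // 2) / num_position_features)
             for i in range(num_position_features)]

    def encode(value):
        # Even channels use sine, odd channels use cosine.
        return [math.sin(value / d) if i % 2 == 0 else math.cos(value / d)
                for i, d in enumerate(dim_t)]

    embedding = []
    for y in range(height):
        for x in range(width):
            # 1-based coordinates, matching a cumulative sum over unmasked cells.
            y_pos, x_pos = float(y + 1), float(x + 1)
            if normalize:
                y_pos = y_pos / (height + eps) * scale
                x_pos = x_pos / (width + eps) * scale
            embedding.append(encode(y_pos) + encode(x_pos))
    return embedding


emb = sine_position_embedding_2d(height=2, width=3, normalize=False)
print(len(emb), len(emb[0]))  # sequence_length, hidden_size
```

With a mask, the coordinates would be derived from the unmasked cells only; that step is omitted here for brevity.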
Source code in src/jormungandr/embedder.py
TemporalSinePositionEmbedding
TemporalSinePositionEmbedding(num_position_features: int = 128, temperature: int = 10000, normalize: bool = True, scale: float | None = None)
Bases: Module, Embedder
Methods:
- forward – Generate temporal sine position embeddings.
Source code in src/jormungandr/embedder.py
forward
forward(shape: Size, device: device | str, dtype: dtype, delta_t: float = 1.0) -> torch.Tensor
Generate temporal sine position embeddings.
Parameters:
- shape (Size) – The shape of the input tensor for which to compute the position embedding, expected to be (n_frames, sequence_length, model_dimension), where n_frames is the number of frames in the temporal sequence.
- device (device | str) – The device on which to create the position embedding.
- dtype (dtype) – The dtype of the position embedding.
- delta_t (float, default: 1.0) – The time interval between frames, used to compute the sine and cosine values.
Returns: A position embedding tensor of shape (sequence_length * n_frames, num_position_features * 2), where num_position_features is the number of sine and cosine features for each temporal position. The first half of the features corresponds to sine values and the second half to cosine values.
PE(n_f, 2i) = sin(n_f * delta_t / (10000^(2i/d_model)))
PE(n_f, 2i+1) = cos(n_f * delta_t / (10000^(2i/d_model)))
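A minimal, dependency-free sketch of this computation follows. It is not the module's implementation and rests on several assumptions: d_model is taken to be 2 * num_position_features, rows are ordered frame by frame (all tokens of frame 0 first), sine values fill the first half of each vector and cosine values the second half as the return description states, and the normalize and scale options are left out. The function name is hypothetical.

```python
import math


def temporal_sine_embedding(n_frames, sequence_length, num_position_features=4,
                            temperature=10000.0, delta_t=1.0):
    """Hypothetical helper: one embedding row per (frame, token) pair."""
    d_model = 2 * num_position_features  # assumed relation between the two sizes
    rows = []
    for frame in range(n_frames):
        t = frame * delta_t  # elapsed time of this frame
        angles = [t / temperature ** (2 * i / d_model)
                  for i in range(num_position_features)]
        # First half sine, second half cosine.
        vector = [math.sin(a) for a in angles] + [math.cos(a) for a in angles]
        # Every token within a frame shares that frame's temporal embedding.
        rows.extend(list(vector) for _ in range(sequence_length))
    return rows


pe = temporal_sine_embedding(n_frames=3, sequence_length=2, delta_t=0.5)
print(len(pe), len(pe[0]))  # sequence_length * n_frames, num_position_features * 2
```

Frame 0 always maps to zeros in the sine half and ones in the cosine half, and a larger delta_t spreads consecutive frames further apart in phase.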
Source code in src/jormungandr/embedder.py