Encoder Decoder from scratch

Katsutoshi
Posts: 9
Joined: Mon Dec 12, 2022 9:57 am

Encoder Decoder from scratch

Post by Katsutoshi »

Good day!! Thank you for your really meaningful lectures about the Transformer!

Here I tried to build an English-Ukrainian translation model from scratch, referring to your demo code.
I am now facing the error below:

Code: Select all

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-39-e0725ab1eb3a> in <module>
----> 1 train_losses, test_losses = train(
      2     transformer, criterion, optimizer, train_loader, valid_loader, epochs=15)

7 frames
<ipython-input-37-69d8976201a7> in train(model, criterion, optimizer, train_loader, valid_loader, epochs)
     34 
     35       # Forward pass
---> 36       outputs = model(enc_input, dec_input, enc_mask, dec_mask)
     37       loss = criterion(outputs.transpose(2, 1), targets)
     38 

/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1188         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1189                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1190             return forward_call(*input, **kwargs)
   1191         # Do not call functions when jit is used
   1192         full_backward_hooks, non_full_backward_hooks = [], []

<ipython-input-8-98825961b1ce> in forward(self, enc_input, dec_input, enc_mask, dec_mask)
      6 
      7   def forward(self, enc_input, dec_input, enc_mask, dec_mask):
----> 8     enc_output = self.encoder(enc_input, enc_mask)
      9     dec_output = self.decoder(enc_output, dec_input, enc_mask, dec_mask)
     10     return dec_output

/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1188         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1189                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1190             return forward_call(*input, **kwargs)
   1191         # Do not call functions when jit is used
   1192         full_backward_hooks, non_full_backward_hooks = [], []

<ipython-input-6-d0cbee872629> in forward(self, x, pad_mask)
     25 
     26   def forward(self, x, pad_mask=None):
---> 27     x = self.embedding(x)
     28     x = self.pos_encoding(x)
     29     for block in self.transformer_blocks:

/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1188         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1189                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1190             return forward_call(*input, **kwargs)
   1191         # Do not call functions when jit is used
   1192         full_backward_hooks, non_full_backward_hooks = [], []

/usr/local/lib/python3.8/dist-packages/torch/nn/modules/sparse.py in forward(self, input)
    158 
    159     def forward(self, input: Tensor) -> Tensor:
--> 160         return F.embedding(
    161             input, self.weight, self.padding_idx, self.max_norm,
    162             self.norm_type, self.scale_grad_by_freq, self.sparse)

/usr/local/lib/python3.8/dist-packages/torch/nn/functional.py in embedding(input, weight, padding_idx, max_norm, norm_type, scale_grad_by_freq, sparse)
   2208         # remove once script supports set_grad_enabled
   2209         _no_grad_embedding_renorm_(weight, input, max_norm, norm_type)
-> 2210     return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
   2211 
   2212 

RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
When I rerun the train function (i.e., when train_loader supplies a different batch), the error contents sometimes differ.
(The error still occurs in the embedding, though.)
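
Following the hint in the last lines of the message, I understand the traceback can be made to point at the kernel that actually failed by forcing synchronous launches. A minimal sketch (the variable must be set before CUDA is first initialized, so in practice at the very top of a freshly restarted runtime):

Code: Select all

import os

# Must be set before the first CUDA call in the process (in Colab,
# restart the runtime first), otherwise it has no effect.
os.environ['CUDA_LAUNCH_BLOCKING'] = '1'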

Therefore I checked my code, paying attention to the following points:
  • enc_input (input_ids) and enc_mask shape
    => This point seems okay, because the raw data from load_dataset is correctly converted through the correct tokenizer
    (checkpoint = 'Helsinki-NLP/opus-mt-en-uk'), just as in the previous translation section, where the Trainer is used
  • dec_input
    => I have already changed the index of <s> from the one in the demo code to the one set in this code (a sketch of the ID-range check I am planning is just below)
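
To rule out an out-of-range ID, which as far as I know is the usual cause of a device-side assert inside F.embedding, I am planning a check roughly like this. Note that vocab_size below stands for the num_embeddings value passed to nn.Embedding, and the tuple unpacking is only my assumption about the batch layout, so it may need adjusting:

Code: Select all

# Hypothetical sanity check: every token ID must satisfy
# 0 <= id < vocab_size, where vocab_size is the num_embeddings
# passed to nn.Embedding. The unpacking below assumes the batch
# layout my train function uses; adjust it to match.
for enc_input, dec_input, enc_mask, dec_mask, targets in train_loader:
    for name, ids in (('enc_input', enc_input), ('dec_input', dec_input)):
        assert ids.min().item() >= 0, f'{name} contains a negative token ID'
        assert ids.max().item() < vocab_size, (
            f'{name} max ID {ids.max().item()} is out of range '
            f'for vocab_size {vocab_size}')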
My guess from the error message is that this error is caused by the input itself, because it happens in the embedding or positional encoding.
However, I cannot find any breakthrough information on this issue, so your advice would be highly appreciated.
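
For completeness, one more thing I can try while waiting: running a single batch on the CPU, where an out-of-range ID raises a plain IndexError that names the problem instead of the opaque CUDA assert. Again, the batch unpacking is only my assumption:

Code: Select all

# Hypothetical CPU repro: on the CPU, an out-of-range index raises
# 'IndexError: index out of range in self' instead of a CUDA assert.
model_cpu = transformer.cpu()
enc_input, dec_input, enc_mask, dec_mask, targets = next(iter(train_loader))
outputs = model_cpu(enc_input, dec_input, enc_mask, dec_mask)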

Thank you for your cooperation.

Katsutoshi
Katsutoshi
Posts: 9
Joined: Mon Dec 12, 2022 9:57 am

Re: Encoder Decoder from scratch

Post by Katsutoshi »

Dear Lazy Programmer,

Good day!!

Could you provide me with any update regarding this question?

Best regards,

Katsutoshi