How do I convert a positionally encoded predicted embedding from a decoder into its matching token?

When training a transformer on positionally encoded embeddings, should the target (`tgt`) embeddings fed to the decoder also be positionally encoded? If so, wouldn't the predicted/decoded embeddings come out positionally encoded as well, and how would I map such an embedding back to a token?
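For concreteness, here is a minimal sketch of the setup being asked about (PyTorch assumed; the sizes, layer counts, and variable names are illustrative, not from any particular codebase). Note that in the standard arrangement, positional encoding is added only to the *input* embeddings, and the final linear head maps decoder hidden states directly to vocabulary logits, so nothing is ever "subtracted" to undo the positional encoding:

```python
import math
import torch
import torch.nn as nn

# Hypothetical sizes for illustration only.
vocab_size, d_model, seq_len = 100, 32, 10

embed = nn.Embedding(vocab_size, d_model)
decoder_layer = nn.TransformerDecoderLayer(d_model, nhead=4, batch_first=True)
decoder = nn.TransformerDecoder(decoder_layer, num_layers=2)
# Output head: maps each decoder hidden state to vocabulary logits.
# This is what turns a "predicted embedding" into a token.
out_proj = nn.Linear(d_model, vocab_size)

# Sinusoidal positional encoding, added to the input embeddings only.
pos = torch.arange(seq_len).unsqueeze(1)
div = torch.exp(torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model))
pe = torch.zeros(seq_len, d_model)
pe[:, 0::2] = torch.sin(pos * div)
pe[:, 1::2] = torch.cos(pos * div)

tgt_tokens = torch.randint(0, vocab_size, (1, seq_len))
memory = torch.randn(1, seq_len, d_model)   # dummy encoder output
tgt = embed(tgt_tokens) + pe                # positionally encoded decoder input
hidden = decoder(tgt, memory)               # (1, seq_len, d_model)
logits = out_proj(hidden)                   # (1, seq_len, vocab_size)
predicted_tokens = logits.argmax(dim=-1)    # token ids; no PE to remove
```

Here the training loss would be a cross-entropy between `logits` and the shifted target token ids, so the model is never asked to reproduce a (positionally encoded) embedding directly.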