NLP

How transformers generate text

Contents

Basic math: auto-regressive language generation
Decoding methods
  Greedy Search
    Definition
    Example
    Feature
  Beam Search
    Definition
    Example
    Feature
    Code
    Some function features of beam search
    Why beam search might not be good in open-ended generation
  Sampling
    Issue 1 with sampling: coherence
  Top-K Sampling
    An example
    Good and Bad
  Top-p (nucleus) sampling: a revision of top-k sampling
    An example
  Top-K + Top-P

Basic math: auto-regressive language generation

Auto-regressive language generation is based on the assumption that the probability distribution of a word sequence can be decomposed into the product of conditional next-word distributions: