## Perplexity of Language Model

Tags: #nlp #LLM #metric

### Equation

$$\text{PPL}(X) = \exp \{- \frac{1}{t} \sum_{i=1}^{t} \log p_{\theta} (x_{i} | x_{ \lt i}) \}$$

### Latex Code

```
\text{PPL}(X) = \exp \{- \frac{1}{t} \sum_{i=1}^{t} \log p_{\theta} (x_{i} | x_{ \lt i}) \}
```

### Introduction

$$ X $$: denotes a tokenized sequence of length $$ t $$, $$ X=(x_{0}, x_{1}, ..., x_{t}) $$.

$$ \text{PPL}(X) $$: denotes the perplexity of the fixed-length sequence $$ X $$.

$$ p_{\theta} (x_{i} | x_{ \lt i}) $$: denotes the probability the language model assigns to the token $$x_{i}$$ given the sequence of tokens preceding it, $$ x_{ \lt i} $$.
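The definitions above can be turned into a minimal sketch in Python, assuming we already have the per-token conditional probabilities $$ p_{\theta} (x_{i} | x_{ \lt i}) $$ from a model (the helper name `perplexity` is ours, not from any library):

```python
import math

def perplexity(token_probs):
    """Compute PPL(X) = exp(-(1/t) * sum_i log p(x_i | x_<i))
    from a list of per-token conditional probabilities."""
    t = len(token_probs)
    avg_nll = -sum(math.log(p) for p in token_probs) / t
    return math.exp(avg_nll)

# Toy example: a model that assigns probability 0.25 to every token
# is "as confused" as a uniform choice among 4 tokens, so PPL = 4.
print(perplexity([0.25, 0.25, 0.25, 0.25]))  # -> 4.0
```

Note that lower perplexity means the model assigns higher probability to the observed sequence; a perplexity of $$ k $$ is often read as the model being as uncertain as a uniform choice among $$ k $$ tokens at each step.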

#### References

- Hugging Face: Perplexity of fixed-length models
- Wikipedia: Perplexity
