A NEW SPRING FOR STATISTICAL METHODS: LARGE LANGUAGE MODELS (LLMS)
DOI:
https://doi.org/10.17740/eas.stat.2025-V25-06

Keywords:
Large Language Models, Statistical Learning, Bayesian Inference, Transformer, EM Algorithm, PCA, SVD

Abstract
Large Language Models (LLMs) are the cornerstone of modern AI systems capable of human-like reasoning, language understanding, and text generation. Their success relies not only on deep learning architectures but also on a comprehensive statistical foundation. This article provides an extensive examination of the statistical techniques underlying LLMs, including probability theory, statistical learning theory, Bayesian inference, Markov chains, the Expectation–Maximization (EM) algorithm, dimensionality reduction (PCA, SVD), probabilistic graphical models, variational inference, and sampling methods such as MCMC. It further explains how these methods are integrated within the Transformer architecture and contemporary LLM training pipelines. Applications in natural language processing, healthcare, finance, and law are also explored in detail.