Min-K%++: Improved Baseline for Detecting Pre-Training Data from Large Language Models


  *Equal Contribution
¹Duke University   ²Johns Hopkins University

[Teaser figure: left, an illustration of the pre-training data detection problem; right, a summary of Min-K%++'s WikiMIA results.]

We propose a novel method for detecting the pre-training data of LLMs. This problem (illustrated in the left panel of the figure above) has been receiving growing attention recently due to its profound implications for copyrighted content detection, privacy auditing, and evaluation data contamination.

Our method, named Min-K%++, is theoretically motivated by revisiting the LLM training objective (maximum likelihood estimation) through the lens of score matching. We show that LLM training implicitly minimizes the Hessian trace of the log-likelihood, which encodes rich second-order information and can thus serve as a robust indicator for flagging training data.
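In practice, the detector assigns each token a calibrated score and, in the spirit of Min-K%, aggregates the lowest-scoring k% of tokens into a single sequence-level score. Below is a minimal, hedged sketch of one way to compute such a score in PyTorch, assuming each token's log-probability is standardized by the mean and standard deviation of the model's own next-token log-probability distribution (our reading of the scoring rule). The function name, the default k, and the model named in the usage comment are illustrative; this is not the authors' reference implementation.

# Hedged sketch of a Min-K%++-style sequence score (illustrative, not the official code).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def min_k_plus_plus_score(text: str, model, tokenizer, k: float = 0.2) -> float:
    ids = tokenizer(text, return_tensors="pt").input_ids.to(model.device)
    with torch.no_grad():
        logits = model(ids).logits                               # (1, seq_len, vocab)

    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)        # predictions for tokens 1..T
    targets = ids[0, 1:]                                         # the actual next tokens
    token_log_p = log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)

    # Mean and std of the log-probability under the model's own next-token distribution.
    probs = log_probs.exp()
    mu = (probs * log_probs).sum(dim=-1)
    sigma = ((probs * log_probs.pow(2)).sum(dim=-1) - mu.pow(2)).clamp_min(1e-8).sqrt()

    # Standardize each token's log-probability, then average the lowest-scoring k% of tokens.
    token_scores = (token_log_p - mu) / sigma
    n = max(1, int(k * token_scores.numel()))
    return token_scores.topk(n, largest=False).values.mean().item()

# Usage (model choice is illustrative): higher scores suggest the text was seen in training.
# model = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-160m")
# tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-160m")
# print(min_k_plus_plus_score("Some candidate passage ...", model, tokenizer, k=0.2))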

Empirically, Min-K%++ achieves state-of-the-art performance on the WikiMIA benchmark, outperforming existing approaches by a large margin (see the right panel of the figure above). On the more challenging MIMIR benchmark, Min-K%++ is also the best among reference-free methods and performs on par with reference-based methods.
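For context, both benchmarks report detection AUROC: every candidate text receives a score from the detector, and AUROC measures how well those scores separate members (texts in the pre-training set) from non-members. A minimal sketch of that evaluation, with placeholder score lists and scikit-learn's roc_auc_score used purely for illustration:

# Hedged sketch of the AUROC evaluation used by membership inference benchmarks.
# The score lists are placeholders; any detector score (Loss, Min-K%, Min-K%++, ...) fits here.
from sklearn.metrics import roc_auc_score

member_scores = [0.8, 0.5, 0.9, 0.7]        # scores on texts known to be in the pre-training set
non_member_scores = [0.2, 0.4, 0.1, 0.6]    # scores on held-out (non-member) texts

labels = [1] * len(member_scores) + [0] * len(non_member_scores)
scores = member_scores + non_member_scores
print(f"AUROC: {100 * roc_auc_score(labels, scores):.1f}%")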

WikiMIA results

Detection AUROC (%) on WikiMIA_length32. Min-K%++ significantly improves over Min-K% and other existing methods; for more results, please see our paper.
| Method    | Mamba-1.4B | Pythia-6.9B | LLaMA-13B | LLaMA-30B | LLaMA-65B | Average |
|-----------|------------|-------------|-----------|-----------|-----------|---------|
| Loss      | 61.0       | 63.8        | 67.5      | 69.4      | 70.7      | 66.5    |
| Ref       | 62.2       | 63.6        | 57.9      | 63.5      | 68.8      | 63.2    |
| Lowercase | 60.9       | 62.2        | 64.0      | 64.1      | 66.5      | 63.5    |
| Zlib      | 61.9       | 64.3        | 67.8      | 69.8      | 71.1      | 67.0    |
| Neighbor  | 64.1       | 65.8        | 65.8      | 67.6      | 69.6      | 66.6    |
| Min-K%    | 63.2       | 66.3        | 68.0      | 70.1      | 71.3      | 67.8    |
| Min-K%++  | 66.8       | 70.3        | 84.8      | 84.3      | 85.1      | 78.3    |

MIMIR results

Detection AUROC (%) on MIMIR, averaged over 7 subdomains. Min-K%++ achieves the best results among reference-free methods and performs on par with the Ref method, which requires an extra reference LLM.
| Method   | Pythia-160M | Pythia-1.4B | Pythia-2.8B | Pythia-6.9B | Pythia-12B |
|----------|-------------|-------------|-------------|-------------|------------|
| Loss     | 52.1        | 53.1        | 53.5        | 54.4        | 54.9       |
| Ref      | 52.2        | 54.6        | 55.6        | 57.4        | 58.7       |
| Zlib     | 52.3        | 53.2        | 53.6        | 54.3        | 54.8       |
| Neighbor | 52.0        | 52.9        | 53.2        | 53.8        | /          |
| Min-K%   | 52.6        | 53.6        | 54.2        | 55.2        | 55.9       |
| Min-K%++ | 52.4        | 54.1        | 55.3        | 57.0        | 58.7       |

BibTeX


@article{zhang2024min,
    title={Min-K\%++: Improved Baseline for Detecting Pre-Training Data from Large Language Models},
    author={Zhang, Jingyang and Sun, Jingwei and Yeats, Eric and Ouyang, Yang and Kuo, Martin and Zhang, Jianyi and Yang, Hao and Li, Hai},
    journal={arXiv preprint arXiv:2404.02936},
    year={2024}
}