In the era of large models, Tsinghua University, CUHK, HKUST, UIC, and Beijing University of Posts and Telecommunications have jointly published a survey on text watermarking that comprehensively explores how large models and text watermarking technology can be combined.
The survey first traces the origin of text watermarking, and then focuses on three key directions in the era of large models: applying existing text watermarking algorithms to large models, using large models to assist the design of text watermarking algorithms, and embedding watermarks directly into large models. It places particular emphasis on the role of text watermarking in addressing the misuse of content generated by large language models.
Paper link: https://arxiv.org/abs/2312.07913
The survey further elaborates on the challenges created by the speed at which large models generate text, and explains how text watermarking can help secure large models by embedding recognizable marks in their output. It then discusses in detail the key challenges of using large models to assist watermarking algorithm design, along with new explorations of combining large models with text watermarking, including the trend toward embedding watermarks directly in large models.
After classifying and summarizing existing text watermarking algorithms, the survey explains in detail how such algorithms are evaluated along four dimensions: success rate, text quality, robustness, and unforgeability. It also summarizes existing attempts to optimize text watermarking algorithms along each of these dimensions.
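To make two of these dimensions concrete, the following is a minimal sketch of how success rate and robustness might be measured. It is not taken from the survey: it assumes a toy "green list" scheme in which a keyed hash of the previous token marks roughly half of the vocabulary as preferred, and it replaces a real language model with random sampling. All names (`is_green`, `generate`, `z_score`, `substitute`, `evaluate`), the vocabulary, the key, and the threshold are hypothetical choices made for illustration. Text quality and unforgeability are not covered here.

```python
import hashlib
import math
import random

# Assumption: a toy vocabulary standing in for a real tokenizer.
VOCAB = [f"w{i}" for i in range(1000)]
GAMMA = 0.5  # expected fraction of tokens that are "green" by chance


def is_green(prev_token: str, token: str, key: str = "secret") -> bool:
    """Keyed hash of (previous token, token) decides green-list membership."""
    digest = hashlib.sha256(f"{key}|{prev_token}|{token}".encode()).digest()
    return digest[0] < 256 * GAMMA


def generate(n: int, watermark: bool, rng: random.Random) -> list:
    """Toy 'model': picks among 8 random candidates, preferring green ones."""
    tokens = ["<s>"]
    for _ in range(n):
        candidates = rng.sample(VOCAB, 8)
        if watermark:
            green = [t for t in candidates if is_green(tokens[-1], t)]
            candidates = green or candidates  # fall back if none are green
        tokens.append(rng.choice(candidates))
    return tokens[1:]


def z_score(tokens: list) -> float:
    """One-proportion z-test on the number of green tokens."""
    prev, hits = "<s>", 0
    for t in tokens:
        hits += is_green(prev, t)
        prev = t
    n = len(tokens)
    return (hits - GAMMA * n) / math.sqrt(n * GAMMA * (1 - GAMMA))


def substitute(tokens: list, fraction: float, rng: random.Random) -> list:
    """Simple attack: replace a fraction of tokens with random ones."""
    out = list(tokens)
    for i in rng.sample(range(len(out)), int(fraction * len(out))):
        out[i] = rng.choice(VOCAB)
    return out


def evaluate(samples: int = 50, length: int = 200, threshold: float = 4.0,
             seed: int = 0) -> dict:
    """Detection rates for marked, unmarked, and attacked marked text."""
    rng = random.Random(seed)
    marked = [generate(length, True, rng) for _ in range(samples)]
    plain = [generate(length, False, rng) for _ in range(samples)]
    attacked = [substitute(t, 0.2, rng) for t in marked]

    def rate(texts):
        return sum(z_score(t) > threshold for t in texts) / len(texts)

    return {
        "success_rate": rate(marked),          # watermarked text detected
        "false_positive_rate": rate(plain),    # unmarked text flagged
        "robust_success_rate": rate(attacked), # detected after 20% edits
    }


if __name__ == "__main__":
    print(evaluate())
```

In this sketch, success rate is the share of watermarked samples whose z-score exceeds the threshold, and robustness is the same share after a substitution attack. A real evaluation would use an actual model, a real tokenizer, and stronger attacks such as paraphrasing.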
Finally, the survey highlights the expanded application scenarios for text watermarking in the era of large models, including copyright protection, academic integrity, and fake news detection. Text watermarking protects intellectual property in the digital age by embedding marks in text and datasets, and it protects the copyright of large models themselves by defending against model extraction attacks.
In academia, text watermarking helps maintain academic integrity by embedding implicit watermark features in machine-generated text. The technology has also been applied to fake news detection, which underlines its relevance to current social problems.