Robust Distortion-free Watermarks for Language Models

  • Published Jan 14, 2025

Comments • 6

  • @antoineroquentin2852 • 5 months ago

    Very well presented, cheers!

  • @wolpumba4099 • 6 months ago • +3

    *Summary*
    *Problem:*
    - Detecting AI-generated text is challenging, especially as language-model outputs grow more realistic. *(**0:14**)*
    - Traditional watermarking techniques introduce distortion, degrading text quality. *(**2:40**)*
    *Proposed Solution:* A distortion-free statistical watermarking technique for language models. *(**2:54**)*
    *Key Features:*
    * *Leverages the LLM generation process:* The watermark is embedded during text generation by controlling the random number generator (RNG) used by the decoder; a minimal decoder sketch follows this list. *(**3:57**)*
    * *Secret Key:* Watermark is detectable only by those possessing the secret key used to seed the RNG. *(**9:31**)*
    * *Statistical Correlations:* Detection relies on testing for statistical correlations between the generated text and the watermark key. *(**9:31**)*
    * *Distortion-Free:* Watermarking process does not alter the distribution of generated text, ensuring output quality. *(**12:23**)*
    * *Robustness:* Remains detectable even after moderate text editing or paraphrasing. *(**15:06**)*
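    A minimal sketch of one decoder in this family, exponential minimum sampling: pick the token minimizing -log(u_i)/p_i, where the uniforms u_i come from the key-seeded RNG. The seeded generator and tiny vocabulary below are illustrative stand-ins, not the paper's implementation:
    ```python
    import numpy as np

    def exp_min_sample(probs: np.ndarray, u: np.ndarray) -> int:
        """Choose the token minimizing -log(u_i) / p_i.

        If u is i.i.d. Uniform(0, 1), the result is distributed exactly
        according to probs (no distortion), yet it is a deterministic
        function of the key-derived u, which is what detection exploits.
        """
        u = np.clip(u, 1e-12, 1.0)  # avoid log(0)
        scores = np.where(probs > 0,
                          -np.log(u) / np.maximum(probs, 1e-12),
                          np.inf)
        return int(np.argmin(scores))

    # Illustrative usage: a seeded generator stands in for the keyed RNG.
    key_rng = np.random.default_rng(seed=12345)
    probs = np.array([0.4, 0.2, 0.1, 0.1, 0.05, 0.05, 0.05, 0.05])
    u = key_rng.random(probs.size)  # one uniform per vocabulary entry
    token = exp_min_sample(probs, u)
    ```
    Since -log(u_i)/p_i is Exponential with rate p_i, the argmin lands on token i with probability exactly p_i, which is why the watermark costs nothing in output quality.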
    *How it Works:*
    1. *RNG Sequence Generation:* A long sequence of random numbers is pre-generated using the secret key. *(**13:07**)*
    2. *Random Subsequence Selection:* A random subsequence of these numbers drives the text generation process. *(**13:07**)*
    3. *Watermark Detection* (a toy detector sketch follows this list): *(**15:32**)*
    - Search for all possible alignments of the test text with the pre-generated RNG sequence. *(**15:43**)*
    - Calculate the optimal alignment using Levenshtein distance, accounting for potential edits. *(**16:04**)*
    - Test for statistical correlations between the aligned text and random numbers. *(**22:30**)*
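    A toy version of the detection step, assuming the exponential-minimum decoder sketched above. It scans shift alignments only (the Levenshtein edit alignment from the talk, which also handles insertions and deletions, is omitted for brevity) and calibrates a p-value against randomly re-drawn stand-in keys; all names here are illustrative:
    ```python
    import numpy as np

    def align_score(tokens: list[int], key_block: np.ndarray) -> float:
        """Mean of the key uniform at each observed token; under the
        exp-min decoder the chosen token's uniform is biased toward 1."""
        return float(np.mean([key_block[t, tok]
                              for t, tok in enumerate(tokens)]))

    def detect(tokens: list[int], key: np.ndarray,
               n_null: int = 200, seed: int = 0) -> tuple[float, float]:
        """Best score over all contiguous shifts of the key sequence,
        with a permutation-style p-value from random null keys."""
        m, n = len(tokens), key.shape[0]
        shifts = range(n - m + 1)
        best = max(align_score(tokens, key[s:s + m]) for s in shifts)
        rng = np.random.default_rng(seed)
        null = [max(align_score(tokens, fake[s:s + m]) for s in shifts)
                for fake in (rng.random(key.shape) for _ in range(n_null))]
        p_value = (1 + sum(x >= best for x in null)) / (1 + n_null)
        return best, p_value
    ```
    The scan over every shift is also what makes detection cost grow with the key-sequence length, the limitation noted at 41:00.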
    *Advantages:*
    * No trade-off between robustness and distortion. *(**41:00**)*
    * Provably undetectable without the secret key. *(**9:31**)*
    *Limitations:*
    * Detection complexity scales with RNG sequence length, which needs to be large. *(**41:00**)*
    * Less effective on low-entropy text with limited creative choices. *(**39:43**)*
    *Future Work:*
    * Resolve tension between robustness, distortion-freeness, and detection complexity. *(**52:03**)*
    * Design more powerful hypothesis tests and explore alternative decoder functions for stronger watermark correlations. *(**52:45**)*
    I used Gemini 1.5 Pro to summarize the transcript.

    • @mikono2022 • 2 months ago

      I am the detector. Please give me the secret random number sequence?

    • @wolpumba4099 • 2 months ago

      @mikono2022 haha!

  • @MikkoRantalainen • 4 months ago

    How about the following watermark?
    (1) Generate a 256-bit secret key.
    (2) While the LLM is generating the next output token, compute HMAC-SHA-256, keyed with the secret key, over the previous token concatenated with the candidate next token; keep resampling the candidate (using the next random value in the LLM's sampling mechanism) until the first bit of the HMAC is zero. With a 1/2 acceptance probability per draw, this costs about one extra sampling step per emitted token on average.
    Verification simply tokenizes the input and recomputes the hash for every consecutive token pair. Since an unwatermarked pair has only a 50% chance of a zero leading bit, a 128-token sequence yields a false-positive probability of roughly 1/2^128, and shorter sequences give confidence that scales with their token count.
    The verification process only needs the secret key and the tokenizer.
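    A rough sketch of that proposal. The token ids, byte encoding, and the resampling hook are assumptions for illustration, not a real LLM API; note that, unlike the paper's distortion-free scheme, the rejection step does change the output distribution, since for each context roughly half the vocabulary becomes unreachable:
    ```python
    import hmac
    import hashlib
    import secrets
    import random

    SECRET_KEY = secrets.token_bytes(32)  # (1) 256-bit secret key

    def hmac_first_bit(key: bytes, prev_tok: int, next_tok: int) -> int:
        """Leading bit of HMAC-SHA-256 over the (prev, next) token pair."""
        msg = prev_tok.to_bytes(4, "big") + next_tok.to_bytes(4, "big")
        return hmac.new(key, msg, hashlib.sha256).digest()[0] >> 7

    def sample_watermarked(key: bytes, prev_tok: int, sample_next) -> int:
        """(2) Resample until the leading HMAC bit is zero.

        Acceptance probability is 1/2 per draw, so this takes two
        sampling attempts per emitted token on average."""
        while True:
            cand = sample_next()  # stand-in for the LLM's next-token sampler
            if hmac_first_bit(key, prev_tok, cand) == 0:
                return cand

    def verify(key: bytes, tokens: list[int]) -> bool:
        """Every consecutive pair must hash to a leading zero bit.

        Unwatermarked text passes each pair with probability 1/2, so
        k passing pairs give a false-positive probability of 2**-k."""
        return all(hmac_first_bit(key, a, b) == 0
                   for a, b in zip(tokens, tokens[1:]))

    # Illustrative usage with a uniform toy "model" over a 50k vocabulary.
    toy_sampler = lambda: random.randrange(50_000)
    prev = 0
    out = [prev]
    for _ in range(16):
        prev = sample_watermarked(SECRET_KEY, prev, toy_sampler)
        out.append(prev)
    assert verify(SECRET_KEY, out)
    ```
    One design caveat: because each token participates in two pairs, a single edited token breaks two checks, so this variant would be noticeably less robust to paraphrasing than the alignment-based detection in the video.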

  • @tophertmg • 6 months ago

    Sounds like a purposeful manipulation technique to program humans to think, and ultimately vote, the ‘correct’ way?