Robust Distortion-free Watermarks for Language Models

  • Published Jan 14, 2025

Comments • 6

  • @antoineroquentin2852 • 5 months ago

    Very well presented, cheers!

  • @wolpumba4099 • 6 months ago • +3

    *Summary*
    *Problem:*
    - Detecting AI-generated text is challenging, especially as language-model outputs grow more realistic. *(**0:14**)*
    - Traditional watermarking techniques introduce distortion, degrading text quality. *(**2:40**)*
    *Proposed Solution:* A distortion-free statistical watermarking technique for language models. *(**2:54**)*
    *Key Features:*
    * *Leverages the LLM generation process:* The watermark is embedded during text generation by controlling the random number generator (RNG) used by the decoder; a minimal decoder sketch follows this list. *(**3:57**)*
    * *Secret Key:* Watermark is detectable only by those possessing the secret key used to seed the RNG. *(**9:31**)*
    * *Statistical Correlations:* Detection relies on testing for statistical correlations between the generated text and the watermark key. *(**9:31**)*
    * *Distortion-Free:* Watermarking process does not alter the distribution of generated text, ensuring output quality. *(**12:23**)*
    * *Robustness:* Remains detectable even after moderate text editing or paraphrasing. *(**15:06**)*
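    A minimal sketch of one decoder in this family, exponential minimum sampling: pick the token minimizing -log(u_i)/p_i, where the uniforms u_i come from the key-seeded RNG. The seeded generator and tiny vocabulary below are illustrative stand-ins, not the paper's implementation:
    ```python
    import numpy as np

    def exp_min_sample(probs: np.ndarray, u: np.ndarray) -> int:
        """Choose the token minimizing -log(u_i) / p_i.

        If u is i.i.d. Uniform(0, 1), the result is distributed exactly
        according to probs (no distortion), yet it is a deterministic
        function of the key-derived u, which is what detection exploits.
        """
        u = np.clip(u, 1e-12, 1.0)  # avoid log(0)
        scores = np.where(probs > 0,
                          -np.log(u) / np.maximum(probs, 1e-12),
                          np.inf)
        return int(np.argmin(scores))

    # Illustrative usage: a seeded generator stands in for the keyed RNG.
    key_rng = np.random.default_rng(seed=12345)
    probs = np.array([0.4, 0.2, 0.1, 0.1, 0.05, 0.05, 0.05, 0.05])
    u = key_rng.random(probs.size)  # one uniform per vocabulary entry
    token = exp_min_sample(probs, u)
    ```
    Since -log(u_i)/p_i is Exponential with rate p_i, the argmin lands on token i with probability exactly p_i, which is why the watermark costs nothing in output quality.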
    *How it Works:*
    1. *RNG Sequence Generation:* A long sequence of random numbers is pre-generated using the secret key. *(**13:07**)*
    2. *Random Subsequence Selection:* A random subsequence of these numbers drives the text generation process. *(**13:07**)*
    3. *Watermark Detection* (a toy detector sketch follows this list): *(**15:32**)*
    - Search for all possible alignments of the test text with the pre-generated RNG sequence. *(**15:43**)*
    - Calculate the optimal alignment using Levenshtein distance, accounting for potential edits. *(**16:04**)*
    - Test for statistical correlations between the aligned text and random numbers. *(**22:30**)*
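    A toy version of the detection step, assuming the exponential-minimum decoder sketched above. It scans shift alignments only (the Levenshtein edit alignment from the talk, which also handles insertions and deletions, is omitted for brevity) and calibrates a p-value against randomly re-drawn stand-in keys; all names here are illustrative:
    ```python
    import numpy as np

    def align_score(tokens: list[int], key_block: np.ndarray) -> float:
        """Mean of the key uniform at each observed token; under the
        exp-min decoder the chosen token's uniform is biased toward 1."""
        return float(np.mean([key_block[t, tok]
                              for t, tok in enumerate(tokens)]))

    def detect(tokens: list[int], key: np.ndarray,
               n_null: int = 200, seed: int = 0) -> tuple[float, float]:
        """Best score over all contiguous shifts of the key sequence,
        with a permutation-style p-value from random null keys."""
        m, n = len(tokens), key.shape[0]
        shifts = range(n - m + 1)
        best = max(align_score(tokens, key[s:s + m]) for s in shifts)
        rng = np.random.default_rng(seed)
        null = [max(align_score(tokens, fake[s:s + m]) for s in shifts)
                for fake in (rng.random(key.shape) for _ in range(n_null))]
        p_value = (1 + sum(x >= best for x in null)) / (1 + n_null)
        return best, p_value
    ```
    The scan over every shift is also what makes detection cost grow with the key-sequence length, the limitation noted at 41:00.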
    *Advantages:*
    * No trade-off between robustness and distortion. *(**41:00**)*
    * Provably undetectable without the secret key. *(**9:31**)*
    *Limitations:*
    * Detection complexity scales with RNG sequence length, which needs to be large. *(**41:00**)*
    * Less effective on low-entropy text with limited creative choices. *(**39:43**)*
    *Future Work:*
    * Resolve tension between robustness, distortion-freeness, and detection complexity. *(**52:03**)*
    * Design more powerful hypothesis tests and explore alternative decoder functions for stronger watermark correlations. *(**52:45**)*
    I used Gemini 1.5 Pro to summarize the transcript.

    • @mikono2022 • 2 months ago

      I am the detector. Please give me the secret random number sequence?

    • @wolpumba4099 • 2 months ago

      @mikono2022 haha!

  • @MikkoRantalainen • 4 months ago

    How about the following watermark?
    (1) Generate a 256-bit secret key.
    (2) While the LLM is generating the next output token, compute HMAC-SHA-256, keyed with the secret key, over the previous token concatenated with the candidate next token; keep resampling the candidate (using the next random value in the LLM's sampling mechanism) until the first bit of the HMAC is zero. With a 1/2 acceptance probability per draw, this costs about one extra sampling step per emitted token on average.
    Verification simply tokenizes the input and recomputes the hash for every consecutive token pair. Since an unwatermarked pair has only a 50% chance of a zero leading bit, a 128-token sequence yields a false-positive probability of roughly 1/2^128, and shorter sequences give confidence that scales with their token count.
    The verification process only needs the secret key and the tokenizer.
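    A rough sketch of that proposal. The token ids, byte encoding, and the resampling hook are assumptions for illustration, not a real LLM API; note that, unlike the paper's distortion-free scheme, the rejection step does change the output distribution, since for each context roughly half the vocabulary becomes unreachable:
    ```python
    import hmac
    import hashlib
    import secrets
    import random

    SECRET_KEY = secrets.token_bytes(32)  # (1) 256-bit secret key

    def hmac_first_bit(key: bytes, prev_tok: int, next_tok: int) -> int:
        """Leading bit of HMAC-SHA-256 over the (prev, next) token pair."""
        msg = prev_tok.to_bytes(4, "big") + next_tok.to_bytes(4, "big")
        return hmac.new(key, msg, hashlib.sha256).digest()[0] >> 7

    def sample_watermarked(key: bytes, prev_tok: int, sample_next) -> int:
        """(2) Resample until the leading HMAC bit is zero.

        Acceptance probability is 1/2 per draw, so this takes two
        sampling attempts per emitted token on average."""
        while True:
            cand = sample_next()  # stand-in for the LLM's next-token sampler
            if hmac_first_bit(key, prev_tok, cand) == 0:
                return cand

    def verify(key: bytes, tokens: list[int]) -> bool:
        """Every consecutive pair must hash to a leading zero bit.

        Unwatermarked text passes each pair with probability 1/2, so
        k passing pairs give a false-positive probability of 2**-k."""
        return all(hmac_first_bit(key, a, b) == 0
                   for a, b in zip(tokens, tokens[1:]))

    # Illustrative usage with a uniform toy "model" over a 50k vocabulary.
    toy_sampler = lambda: random.randrange(50_000)
    prev = 0
    out = [prev]
    for _ in range(16):
        prev = sample_watermarked(SECRET_KEY, prev, toy_sampler)
        out.append(prev)
    assert verify(SECRET_KEY, out)
    ```
    One design caveat: because each token participates in two pairs, a single edited token breaks two checks, so this variant would be noticeably less robust to paraphrasing than the alignment-based detection in the video.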

  • @tophertmg • 6 months ago

    Sounds like a purposeful manipulation technique to program humans to think, and ultimately vote, the ‘correct’ way?