SynthID, AI watermarking

SynthID is becoming one of the most important attempts to answer a simple but difficult question: how can you tell whether an image, video, audio clip or piece of text was generated by AI? Google DeepMind designed SynthID to embed invisible digital watermarks directly into AI generated content, and recent adoption by companies including OpenAI, Nvidia, Kakao and ElevenLabs suggests that the technology is moving from a Google project into a broader AI transparency layer.

Visual clues are no longer enough. A few years ago, AI images often betrayed themselves with strange hands, distorted text or impossible lighting. Today, many synthetic images and videos look realistic at first glance. Voice generation has improved as well, and AI written text can be difficult to separate from human writing without context. SynthID does not solve every trust problem in digital media, but it offers a practical way to make some AI generated content easier to identify after it has been created, shared, compressed and edited.

What SynthID is

SynthID is a watermarking and detection technology from Google DeepMind. It adds an invisible digital signal to AI generated media at the moment that media is created. The watermark is embedded into an AI generated image or video segment without changing visible quality. It is designed to survive common modifications such as cropping, filters, frame rate changes and lossy compression.

The same broad idea now applies across several types of content. SynthID can be used for images, video, audio and text, although the technical method differs by format. In an image or video, the watermark is tied to the visual signal. In audio, it is embedded in the waveform. In text, it influences the token generation process so that a detector can later estimate whether the text was produced by a model using a particular watermarking configuration.

Ars Technica reported in May 2026 that Google says SynthID has already been used to label 100 billion images and videos, plus 60,000 years of audio. Those numbers are significant because watermarking systems become more useful when they are widely deployed. A detector is only helpful if the content being checked may actually contain the mark it is looking for.

Why AI watermarking is becoming harder and more important

The case for SynthID is strongest when you look at how generative AI content moves online. A synthetic image may be created in one tool, posted to a social platform, downloaded by someone else, resized, compressed and reposted in a video montage. A piece of audio may be clipped, normalized or mixed with background sound. A video may be cropped vertically for short form sharing. If a watermark disappears after basic editing, it will not be very useful in the real world.

This is why Google emphasizes robustness. Pushmeet Kohli, a Google DeepMind scientist, told Ars Technica, “A technology like this will always be attacked.” He also said that the team researched how to make SynthID robust against different transformations. AI watermarking is not just a labeling feature. It is an adversarial system. Some people will use it responsibly, some will ignore it and some will actively try to remove it.

For everyday users, the value is more practical. If you see a suspicious image or hear a questionable audio clip, a reliable watermark detector gives you one more piece of evidence about where it came from.

SynthID versus C2PA metadata

Google is not relying on SynthID alone. Google also supports the C2PA standard, which attaches metadata describing how content was created or processed. C2PA is useful because it can carry provenance information in a structured way. For example, Google began using C2PA more prominently with Pixel 10 smartphones. Photos taken with those devices can include metadata describing how they were processed, and images with generative elements can receive an AI tag.

Google is also bringing similar C2PA information to videos recorded on Pixel 8, Pixel 9 and Pixel 10 phones through an update. Gemini is gaining C2PA scanning so it can explain a file’s provenance based on content labels. Chrome and Search are expected to receive similar capabilities as well.

The weakness is that metadata can be stripped, altered or lost as files move between apps and platforms. That does not make C2PA useless. It means metadata works best as one layer in a broader system. SynthID takes a different approach because the mark is embedded into the content itself. In simple terms, C2PA is like a label attached to a package, while SynthID is closer to a hidden signal inside the package.

The strongest transparency strategy is not a choice between SynthID and C2PA. It is both together, combined with platform policies, content review, source verification and media literacy.
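The layering idea can be made concrete with a small sketch. Everything in it is hypothetical: MediaFile, strip_metadata, provenance_signals and summarize are illustrative names, the "generator" metadata key is invented, and the detector is passed in as a plain function because neither SynthID nor C2PA tooling is called here. It only shows why checking two independent layers leaves you with a signal after one of them is lost.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class MediaFile:
    payload: bytes                                 # stands in for pixels or audio samples
    metadata: dict = field(default_factory=dict)   # stands in for a C2PA-style label


def strip_metadata(item: MediaFile) -> MediaFile:
    """Hypothetical re-upload step that keeps the content but drops the label."""
    return MediaFile(payload=item.payload)


def provenance_signals(item: MediaFile, detect_mark: Callable[[bytes], bool]) -> dict:
    """Collect both layers: the attached label and the embedded signal."""
    return {
        "metadata_label": item.metadata.get("generator"),
        "embedded_mark": detect_mark(item.payload),
    }


def summarize(signals: dict) -> str:
    """Turn the two signals into a cautious, human-readable summary."""
    if signals["metadata_label"] and signals["embedded_mark"]:
        return "labelled and watermarked"
    if signals["embedded_mark"]:
        return "watermarked, label missing"
    if signals["metadata_label"]:
        return "labelled, no watermark found"
    return "no provenance signal (not proof of human origin)"
```

In this toy, a file that loses its metadata in transit still reports "watermarked, label missing" as long as the stand-in detector recognizes the payload, which is the package analogy in code form.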

How SynthID works across different content types

Images and video

For images and video, SynthID places an imperceptible watermark into the generated pixels or video segments. Google DeepMind says this does not reduce image or video quality. The point is to preserve the experience for viewers while allowing authorized detection tools to find the watermark later.

This is especially relevant for AI video. A single generated video can be cut into clips, reframed for different aspect ratios and compressed by multiple platforms. If the watermark remains detectable after those transformations, it becomes much more useful for journalists, researchers, platforms and businesses that need to assess authenticity.
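To see why an embedded signal can outlast this kind of processing, here is a deliberately simplified sketch. It is not SynthID's method, which Google has not published in this form. It is a classic spread-spectrum watermark: a faint keyed pattern is added to the pixel values and later detected by correlation. The functions keyed_pattern, embed, degrade and score are illustrative names, and degrade is only a crude stand-in for lossy compression (added noise plus coarse quantization).

```python
import random


def keyed_pattern(key: int, n: int) -> list[int]:
    """Pseudorandom +1/-1 sequence that only the key holder can regenerate."""
    rng = random.Random(key)
    return [rng.choice((-1, 1)) for _ in range(n)]


def embed(pixels: list[float], key: int, strength: float = 3.0) -> list[float]:
    """Add a faint keyed pattern to every pixel value."""
    pattern = keyed_pattern(key, len(pixels))
    return [p + strength * s for p, s in zip(pixels, pattern)]


def degrade(pixels: list[float], seed: int = 0, noise: float = 3.0, step: int = 8) -> list[float]:
    """Crude stand-in for lossy compression: add noise, then quantize coarsely."""
    rng = random.Random(seed)
    return [round((p + rng.gauss(0, noise)) / step) * step for p in pixels]


def score(pixels: list[float], key: int) -> float:
    """Correlate mean-removed pixels with the keyed pattern.

    The result is close to `strength` when the mark is present and close to
    zero when it is absent or when the wrong key is used.
    """
    pattern = keyed_pattern(key, len(pixels))
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) * s for p, s in zip(pixels, pattern)) / len(pixels)


if __name__ == "__main__":
    rng = random.Random(1)
    host = [rng.uniform(0, 255) for _ in range(100_000)]  # toy "image"
    marked = embed(host, key=42)
    print("unmarked:", score(host, 42))
    print("marked:", score(marked, 42))
    print("marked, degraded:", score(degrade(marked), 42))
```

The design point is that the pattern is spread thinly over many pixels, so no single edit removes it, and the detector needs the key to know what to look for. Production systems have to handle cropping, rescaling and re-encoding as well, which this toy does not attempt.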

Audio

For audio, SynthID embeds the watermark into the waveform. Ars Technica specifically mentions AI songs and audio overviews from products such as NotebookLM. That is important because audio verification is becoming a larger concern. Synthetic voices can be used for harmless narration, accessibility and creative production, but they can also be used for impersonation or misleading clips.

A watermark in audio can help establish whether a clip was generated by a system that applies SynthID. It does not identify who created the clip or why it was created. It also does not automatically prove that an unmarked clip is real. But it adds a meaningful signal when the watermark is present.

Text

Text watermarking is technically different because there are no pixels or waveforms to modify. Google’s developer documentation describes SynthID Text as a logits processor that is applied during generation after Top K and Top P sampling. In plain language, it slightly shifts the model’s preferences among candidate tokens in a keyed pattern that readers should not notice but that a detector can pick up statistically later.
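The general idea of a watermarking logits processor can be sketched in a few lines. This is a toy under stated assumptions, not SynthID Text itself: it uses a simple additive bias on keyed pseudorandom bits, whereas the real algorithm is more elaborate. The "model" is random logits, Top K and Top P filtering are omitted, and the names g_value, watermark_logits, generate and mean_g are all illustrative.

```python
import hashlib
import math
import random
from typing import Optional


def g_value(key: int, context: tuple, token: int) -> int:
    """Keyed pseudorandom bit for a (recent context, candidate token) pair."""
    data = f"{key}|{context}|{token}".encode()
    return hashlib.sha256(data).digest()[0] & 1


def watermark_logits(logits: list[float], key: int, context: tuple, bias: float = 2.0) -> list[float]:
    """Nudge logits upward for tokens whose keyed bit is 1."""
    return [l + bias * g_value(key, context, t) for t, l in enumerate(logits)]


def sample(logits: list[float], rng: random.Random) -> int:
    """Softmax sampling over token ids."""
    m = max(logits)
    weights = [math.exp(l - m) for l in logits]
    return rng.choices(range(len(logits)), weights=weights)[0]


def generate(n_tokens: int, vocab: int, key: Optional[int], rng: random.Random,
             ngram_len: int = 4, bias: float = 2.0) -> list[int]:
    """Generate token ids; pass key=None for unwatermarked output."""
    tokens = [0] * (ngram_len - 1)  # dummy start context
    for _ in range(n_tokens):
        context = tuple(tokens[-(ngram_len - 1):])
        logits = [rng.gauss(0, 1) for _ in range(vocab)]  # stand-in for a real model
        if key is not None:
            logits = watermark_logits(logits, key, context, bias)
        tokens.append(sample(logits, rng))
    return tokens[ngram_len - 1:]


def mean_g(tokens: list[int], key: int, ngram_len: int = 4) -> float:
    """Detector statistic: average keyed bit over the text (about 0.5 if unmarked)."""
    padded = [0] * (ngram_len - 1) + list(tokens)
    bits = [
        g_value(key, tuple(padded[i:i + ngram_len - 1]), padded[i + ngram_len - 1])
        for i in range(len(tokens))
    ]
    return sum(bits) / len(bits)
```

Calling mean_g on watermarked output with the right key should give a value well above 0.5, while unmarked text or the wrong key should stay near 0.5. Because each bit depends on the preceding tokens, rewriting or translating the text changes the contexts and pulls the statistic back toward 0.5.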

Google has open sourced SynthID Text, and a production grade implementation is available in Hugging Face Transformers v4.46.0 and newer. Developers activate it in compatible workflows by passing a watermarking configuration to model.generate. Hugging Face explains that each watermarking configuration needs its own detector trained to recognize it, and that the configuration should be stored securely. If the configuration is exposed, others may be able to replicate or attack the watermark.
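A minimal sketch of that workflow follows, based on the SynthIDTextWatermarkingConfig class that ships with Transformers. The key values, the ngram_len setting, the helper name generate_watermarked and the choice of google/gemma-2-2b (which requires accepting a license on Hugging Face) are assumptions for illustration. A real deployment would load its own secret keys from secure storage rather than hard-coding them.

```python
# Placeholder keys for illustration only; treat real keys as secrets.
WATERMARK_KEYS = [654, 400, 836, 123, 340, 443, 597, 160, 57, 29]
NGRAM_LEN = 5


def generate_watermarked(prompt: str, model_name: str = "google/gemma-2-2b",
                         max_new_tokens: int = 40) -> str:
    """Generate text with SynthID Text applied during sampling."""
    # Imported here so this module can be loaded without transformers/torch installed.
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        SynthIDTextWatermarkingConfig,
    )

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    config = SynthIDTextWatermarkingConfig(keys=WATERMARK_KEYS, ngram_len=NGRAM_LEN)

    inputs = tokenizer([prompt], return_tensors="pt")
    output = model.generate(
        **inputs,
        watermarking_config=config,
        do_sample=True,  # the watermark acts on sampling, so greedy decoding is not used
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(output, skip_special_tokens=True)[0]
```

Detection is a separate step: a detector has to be trained for the same keys and settings, which is why the configuration needs to be kept both stable and private.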

Text watermarking also has clearer limitations than image watermarking. Google’s documentation says detection confidence can drop if text is heavily rewritten or translated. It also notes that watermarking is less effective for factual responses because the model has less freedom to vary its output without hurting accuracy. That nuance matters. A text watermark is not a magic authorship test. It is a probabilistic signal tied to a particular generation process.

The expansion beyond Google

The biggest recent SynthID development is not technical. It is ecosystem related. Google has partnered with several companies to bring SynthID into more AI systems. Nvidia will implement SynthID in its Cosmos world foundation models. OpenAI will use SynthID in its GPT image systems. Kakao and ElevenLabs will also begin adding SynthID to their AI content.

This matters because watermarking only works at scale if many creators of AI systems participate. Before this expansion, SynthID was most useful for identifying content generated by Google’s own AI models. That still left a large gap, since the internet contains AI content from many commercial tools, open models and custom systems.

Adoption by companies such as OpenAI and Nvidia does not close the gap entirely. Many public models still generate content without AI watermarking. Open models can be modified or trained privately. Some users will choose tools specifically because they do not add watermarks. Still, broader adoption means more AI content will carry a signal that detection tools can read.

How SynthID detection works

Detection is where watermarking becomes visible to the public. Google has already added SynthID detection support in the Gemini app, where users can upload suspected content and ask whether it is AI generated. The same checks are expected to reach Circle to Search, Lens and AI Mode. Users will also be able to use Gemini in Chrome by sharing a tab with the content in question and asking whether it is AI generated.

Google has also announced SynthID Detector, a verification portal for identifying AI generated content made with Google AI. The portal can scan images, audio, video and text for SynthID watermarks. When it finds a mark, it can highlight parts of the content that are more likely to have been watermarked. For audio, that may mean specific segments. For images, it may mean particular areas.
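How the portal localizes a watermark is not public, but the general idea of highlighting segments can be illustrated with a sliding window. In this sketch the per-unit scores (one per audio frame or image tile, for example) are assumed to come from some detector, and flag_segments, the window size and the threshold are all hypothetical choices.

```python
def flag_segments(scores: list[float], window: int = 4,
                  threshold: float = 0.6) -> list[tuple[int, int]]:
    """Return [start, end) index ranges where the windowed mean score is high.

    Overlapping or touching windows are merged into one range.
    """
    flagged: list[tuple[int, int]] = []
    for start in range(0, len(scores) - window + 1):
        mean = sum(scores[start:start + window]) / window
        if mean >= threshold:
            end = start + window
            if flagged and start <= flagged[-1][1]:
                # Extend the previous range instead of starting a new one.
                flagged[-1] = (flagged[-1][0], end)
            else:
                flagged.append((start, end))
    return flagged


if __name__ == "__main__":
    # Six low-scoring units, six high-scoring units, six low-scoring units.
    scores = [0.1] * 6 + [0.9] * 6 + [0.1] * 6
    print(flag_segments(scores))  # prints [(5, 13)]
```

The flagged range is slightly wider than the true high-scoring region because each window averages over its neighbours. That trade-off between smoothing and precision is typical of this kind of localization.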

Access is still controlled. Google has said the detector is being rolled out to early testers, with journalists, media professionals and researchers able to join a waitlist. That limited release reflects a real security tension. If detection is too open and too detailed, attackers may use it to test whether their watermark removal attempts worked.

There is currently no public API for SynthID detection. Google is preparing an AI content detection API as part of the Gemini Enterprise Agent Platform, aimed at trusted business partners. This suggests that Google is trying to balance practical verification with the risk of creating a tool that helps adversaries reverse engineer the watermark.

What SynthID cannot prove

SynthID is useful, but it should not be treated as a universal truth machine. A positive detection can tell you that content likely contains a SynthID watermark. It does not necessarily tell you the author’s intent, the full editing history or whether the content is misleading. A real image can be used deceptively, and an AI generated image can be clearly labeled satire, art or concept design.

A negative result also has limits. If no SynthID watermark is detected, that does not prove the content is human made. It may have been generated by a system that does not use SynthID. It may have been modified in a way that reduces detection confidence. It may come from an open model, a private model or another watermarking system entirely.

There is also the problem of partial media. A video might combine real footage, AI generated inserts, synthetic voiceover and edited captions. A single yes or no answer may not capture that complexity. This is why tools that highlight likely watermarked segments are more useful than simple binary labels.

For text, the uncertainty is even more important. Google’s developer documentation describes detection as probabilistic, with possible states such as watermarked, not watermarked and uncertain. That is the right approach. In many real cases, uncertainty is more honest than overconfident classification.
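A three-state outcome is easy to express in code. The sketch below assumes a detector that returns a score between 0 and 1 and uses two thresholds to separate the states. The Verdict enum, the classify function and the 0.3 and 0.7 defaults are illustrative, not values taken from Google's documentation.

```python
from enum import Enum


class Verdict(Enum):
    WATERMARKED = "watermarked"
    NOT_WATERMARKED = "not watermarked"
    UNCERTAIN = "uncertain"


def classify(score: float, lower: float = 0.3, upper: float = 0.7) -> Verdict:
    """Map a detector score in [0, 1] to one of three states.

    Scores between the two thresholds are reported as uncertain rather
    than being forced into a yes or no answer.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if lower >= upper:
        raise ValueError("lower threshold must be below upper threshold")
    if score >= upper:
        return Verdict.WATERMARKED
    if score <= lower:
        return Verdict.NOT_WATERMARKED
    return Verdict.UNCERTAIN
```

Moving the thresholds apart widens the uncertain band and lowers the rate of confident mistakes, at the cost of fewer definite answers. Where to set them depends on whether false positives or false negatives are more costly in a given setting.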

What SynthID means for creators, publishers and platforms

For creators, SynthID could become part of responsible publishing. If you use AI tools for images, video, audio or text, embedded watermarking can help reduce confusion later. It can also protect legitimate creators from false claims that they tried to pass synthetic content off as real, provided the surrounding context is clear.

For publishers and journalists, SynthID is a verification aid. It can support newsroom checks, especially when content is circulating quickly and source information is weak. It should sit alongside reverse image search, metadata inspection, source tracing, expert review and direct confirmation.

For platforms, watermark detection can help moderation systems prioritize review. It can also support labels that inform users when content is AI generated. But platforms need to avoid simplistic enforcement. Not all AI generated content is harmful, and not all harmful content is AI generated.

For businesses, the most immediate use may be risk management. Brands may want to detect synthetic media that impersonates executives, products or campaigns. Enterprises that generate AI content at scale may also want internal watermarking policies so assets can be traced and classified later.

The trust layer is still incomplete

SynthID is a serious step toward better AI content transparency because it embeds signals inside the media itself and because it is expanding beyond Google’s own tools. Its value will grow as more model providers adopt it and as detection becomes easier to access through products such as Gemini, Lens, Chrome and enterprise APIs.

Watermarking helps identify content from systems that choose to participate. It does not identify every AI generated file on the internet. SynthID can make the digital information environment more accountable, but trust will still depend on context, incentives and human judgment.