AI Systems Are Ingesting Canadian Journalism Without Attribution
Aengus Bridgman and I set out to measure what AI models actually do with Canadian journalism, and today we're releasing what we found. When we asked four major AI systems about Canadian news events, they provided no source attribution 82% of the time, delivering journalism that was clearly drawn from Canadian reporting without any indication of where it came from.
We tested ChatGPT, Gemini, Claude, and Grok on over 2,200 Canadian news stories in English and French, running more than 18,000 queries. The full methodology is in the technical brief linked below.
Two findings seem hard to dispute. First, these systems have ingested Canadian journalism systematically: the specificity of their knowledge of domestic politics, provincial affairs, and local reporting points clearly to Canadian news sources. Second, they rarely tell you where the information came from. Attribution rates varied enormously across models, which suggests this is a design choice, not a technical constraint. When we named the outlet and asked for a citation, attribution jumped to between 74% and 97%. The infrastructure exists; it is simply not used by default.
When given web access and asked about specific recent articles, models covered enough of the original reporting to substitute for the source in most cases (varying by model), while naming the originating outlet in fewer than one in six responses. There are real limits to measuring "substitution," and reasonable people will draw these lines differently. But the pattern is consistent enough across models and outlets to warrant serious policy attention.
The policy questions that follow are real and unresolved. The Online News Act does not reach this use, and copyright law is untested. While some American publishers have secured AI licensing agreements worth hundreds of millions of dollars, no comparable deals exist in Canada, where the major publishers have instead pursued litigation. We lay out several possible policy responses in the memo, but each involves trade-offs that deserve serious scrutiny.
What we are confident about is that the rules governing how AI companies use journalism are being set right now, by default, through inaction. Our contribution is empirical: it makes the scale and pattern of that use visible so the policy debate can proceed on evidence rather than assumption.
The full technical brief and policy memo are here.