Reports have been circulating that The Juilliard School is collaborating with OpenAI. One source says OpenAI is working with Juilliard students to annotate music scores for use as training data. There’s just one important fact to know: Juilliard, the institution, has no involvement.

This may be as much an indication of the breakneck pace of misinformation on the continually devolving internet as anything. First, a paywalled story at The Information reported “OpenAI Moves to Generate AI Music in Potential Rivalry With Startup Suno.” That much may well be the case — OpenAI trying to compete with Suno even seems like a fair bet.

The key word here is students: The Information reported that, according to an unnamed source, “the company has been working with some students from the Juilliard School to annotate music scores.” Some read this to mean official involvement by the institution.

Juilliard’s press office answered my query and said, flatly, there is no project involving The Juilliard School as an institution. (Disclosure: I’ve also “collaborated with Juilliard students,” not necessarily with any formal involvement of the school, as has just about everyone who’s ever been involved in concert music in New York City.)

Erin Woo, the co-author of the story, confirmed that “our reporting is that they’re working with Juilliard students, not with the school itself.”

I also reached out to OpenAI; I have not yet received a response.

But there’s another aspect to this story. Try a search query, or (ahem) ask ChatGPT the same question. What you’ll see is a bunch of other news stories picking up on vague details from the original article, evidently by scraping existing coverage, and implying that this is a breakthrough because of Juilliard’s imagined involvement. That raises a bunch of questions. How is ChatGPT able to draw on a paywalled site, or are its references based solely on what it can crawl, like the headline and opening copy? (Some of those separate stories read close enough to my ChatGPT results, shown in the screenshot below, that I suspect chatbot output was used to write some of them directly.)

This is part of why LLMs are a nightmare for accurate reporting, reputation management, and privacy. Benn Jordan learned that the hard way when Google’s Gemini hallucinated an entire saga about him making a video about a trip to Israel that never happened. See the original discussion on his Bluesky account; Matthew Gault did a great job covering this story for 404 Media:

Google AI Falsely Says YouTuber Visited Israel, Forcing Him to Deal With Backlash

Ironically, as we do the real work of correcting the record, LLMs continue to scrape our corrections, sometimes incorporating them into their answers without attribution, meaning that the same companies causing the problem in the first place can exploit our labor in fixing it. They can then also serve our correct information alongside new, incorrect information.

Here, since you’ve read through this downer of a Monday story, you’ve earned some Jonathan Frakes. I hear his voice every time I encounter misinformation now; it makes everything more bearable.

Here are the ChatGPT results from a prompt about Juilliard and OpenAI on Monday morning, NYC time. Again, note that despite the various links here, the information about Juilliard (the institution) is misleading or false:

OpenAI screenshot:

What is the collaboration
• OpenAI is reportedly working on a generative music tool (i.e., a system that can create music from text and/or audio prompts). [euronews +4]
• Juilliard students are said to be assisting by annotating musical scores and helping to build or supply training data that the music AI can learn from. For example: “working with Juilliard School students to annotate music scores” is repeated in multiple sources. [The Information +2]
• The reports suggest the collaboration is about quality of data (i.e., ensuring the AI understands structure, rhythm, harmony) rather than Juilliard providing finished musical works. For example: “process that could help train the AI to recognize structure, rhythm, and harmony in complex compositions.” [Newswav +1]
• OpenAI is entering a domain (music generation) that is more creative and less clearly defined than its previous major focus (text, images). The collaboration with Juilliard looks to lend domain expertise (music theory, composition) to the project.

Why it matters
• Bridging tech + high-art domain: Juilliard has a strong reputation in music education and performance. Bringing that together with OpenAI’s AI capabilities suggests an effort to raise the bar for AI-generated music in terms of musicality and sophistication.
• Training-data quality: Music is complex — there’s harmony, rhythm, structure, instrumentation, style. Using annotated scores (rather than just scraped audio) means better understanding inside the model of “how music works,” which could…

It goes on like that, but I’ll spare you the details.

I think it’s very likely that the use of annotated scores was something Sora and other products were already doing, based on my knowledge of how these models work and the precision of some of the musical results. It would also stand to reason that they wouldn’t want to talk about this, as it could implicate them further in intellectual property cases — but I can only speculate there.