OpenAI is developing technology capable of generating music from text descriptions and audio samples, according to a report by American technology publication The Information.
The proposed system would enable users to add musical elements to existing video content or layer instrumental accompaniment onto vocal recordings, sources familiar with the project told the outlet. A guitarist's backing track could be synthesised for an a cappella performance, for instance, or background music created for footage lacking audio.
The timing and format of any potential launch remain unclear. OpenAI may release the capability as a standalone product or integrate it within existing offerings such as ChatGPT or video generation tool Sora, according to the report.
The San Francisco-based artificial intelligence company is reportedly collaborating with students from New York's prestigious Juilliard School to annotate musical scores, providing training data for the system. This arrangement would offer the AI examples of how music is structured and notated, potentially improving output quality.
While OpenAI previously released generative music models, those efforts preceded the November 2022 launch of ChatGPT, the product that transformed the company from a specialist research laboratory into a household name. Since that breakthrough, OpenAI's audio development has concentrated on speech applications: converting text to spoken word and transcribing speech to text.
The company would enter a competitive market where rivals have already established themselves. Google offers generative music capabilities, while startup Suno has built a business specifically around AI-generated music, allowing users to create songs through text prompts describing desired style, mood and instrumentation.
Generative audio technology has attracted both enthusiasm and concern. Musicians and composers worry that AI systems trained on existing music could devalue human creativity and complicate copyright protections, while some see potential for new creative tools that augment rather than replace human artists.
Record labels and publishers have grown increasingly anxious about AI music generation, with several filing lawsuits alleging that training systems on copyrighted material without permission constitutes infringement. The legal landscape remains unsettled, with courts yet to establish clear precedents about whether such training violates intellectual property rights.
OpenAI's exploration of music generation follows its expansion into various media types. The company offers image generation through DALL-E, video creation via Sora, and voice synthesis capabilities. Adding music would round out a multimedia AI toolkit, though questions about licensing, quality and cultural impact accompany each new media type.
Whether consumers and professionals will embrace AI-generated music at scale remains uncertain. Early adopters have used existing tools to create novelty songs and background tracks, but widespread acceptance in professional contexts faces hurdles including quality concerns, ethical objections and potential legal complications.
The Juilliard collaboration, if accurate, suggests OpenAI recognises that high-quality music generation requires sophisticated understanding of musical theory and composition—knowledge that classical music students spend years acquiring. Whether machine learning systems can absorb and apply such expertise through annotated examples represents a technical question with artistic implications.


