It is hard to believe that the now-ubiquitous ChatGPT did not make its public debut until late 2022, when it was released as a research preview.
The past 12 months have seen an explosion of artificial intelligence (AI) technologies, from platforms that generate pictures to those that manage data. But as always, the rise of new technologies brings ancillary problems.
Intellectual property scholars and practitioners are concerned that the gathering of data into platforms’ databases might infringe the copyrights of those whose works are posted on the internet and then collected by the AI programs. In addition, there are questions about whether the material produced by this generative software also infringes copyrighted works.
In recent months, artificial intelligence and machine-learning systems such as ChatGPT and DALL·E 2 have shown that AI's capabilities are growing rapidly. These advancements raise new ethical and legal questions that will only become more significant as the technology matures.
AI and the Arts
For AI programs such as ChatGPT to work, they must have data. To gather it, the programs scour the internet and collect information from websites. This collected material is called "training data." The programs then use this data to respond to users' requests for output.
Additionally, the programs' owners can sell this data to other companies that build AI programs, which can then use it to train their own models.
But gathering such training data – or building a program on curated training data collected by others – has the potential to infringe copyright. Authors whose works are posted on the internet often retain copyright in those images and texts. It is unclear at this point whether the processes AI programs use to ingest this material violate the authors' copyrights, and at the moment there is no case law in this area.
Whether or not this method of collecting data infringes the copyrights of the data's owners, there is another potential area of liability. The purpose of these programs is to generate content based on users' requests. Does the generated content violate copyright owners' rights? And if so, who is liable: the person who requested that the program create a work, the programmer, or the program's owners?
The question of whether there is copyright infringement will hinge on the nature of the output: does the new content reproduce recognizable portions of the copied works? If it does, then the question of who is liable will depend on two things: the request made by the user, and the way that the program has been designed.
With the ability to prompt an AI bot to make works of art comes another question: what happens if it copies someone else's (i.e., another human's) work? Who is liable – the person who asked the machine to make the image, the programmer who designed the bot, or neither?
Professor Jane Ginsburg, the Morton Janklow Professor of Literary and Artistic Property Law at Columbia Law School and an expert in copyright law, explains that the answer lies in the user's prompt.
She notes that the user is unlikely to be held responsible for infringement unless he or she requests something approaching an exact duplicate of an existing image – for example, "make me a copy of a Warhol."
Ginsburg explains that "the programmers' liability would probably depend on what they designed the program to do." Ideally, the technology would be used to generate new kinds of works rather than copies of other people's existing work. One way programmers could protect themselves would be to run the program's outputs through a filter that checks for identifiable portions of other people's works.
It is unlikely, however, that the programmers who created the bots that scrape the web for training data would be liable for infringing outputs. Professor Ginsburg says, "There probably isn't enough of a connection between merely acquiring content and subsequent user-prompted outputs. The scrapers might be liable for copying works into the system, but that's a different question."
Another key issue surrounding the output of these programs is whether it is copyrightable. The United States Copyright Office has ruled that only works authored by humans are eligible for copyright. Two cases challenging this position are currently pending in US courts, and the Copyright Office has said it is continuing to review the copyrightability of AI-generated works as the technology develops. As one New York lawyer noted, "In response to this changing landscape, the Copyright Office has been soliciting insights from relevant experts and industry stakeholders, and so far has been taking a measured, case-specific approach in applying existing precedent to these new use cases."
Unanswered Questions
We are living in a world where AI and machine learning are constantly changing and adapting. Every day these programs grow more advanced. Because of this, many questions remain unanswered, copyright issues among them.
In many ways it is too early to determine how AI technology will change the world we live in and how society will react to those changes. AI will certainly affect many areas of the law, and it will be interesting to see how fields such as copyright law adapt to this brave new world.
Teddy Almond is a student at the Hudson School in Hoboken.
See more of his work at: https://www.teddyalmondphotography.com