Commentary

Consultation on Copyright in the Age of Generative Artificial Intelligence

The Consultation on Copyright in the Age of Generative Artificial Intelligence was an online questionnaire released by the Federal Government on 12 October 2023 and open to all individuals and organizations. The consultation survey closed on 15 January 2024.

The survey was divided into four sections for feedback and discussion: Technical Evidence, Text and Data Mining, Authorship and Ownership of Works Generated by AI, and Infringement and Liability regarding AI. The survey concluded with a final section for Comments and Suggestions.

The U of A Copyright Office submitted its responses to this consultation on 11 January 2024. Those responses are included below.

Technical Evidence

Post-secondary institutions are involved in all sides of the development and use of assistive and generative AI tools.

A generative AI tool is largely developed by humans, with the selection, accessing, and ingestion of content for training datasets directed by humans, the prompts to direct the generative AI tool to create outputs normally provided by humans, and the decision regarding whether and how to distribute those outputs generally made by humans.

Where copyright infringement is alleged in relation to a generative AI tool, the allegation could concern either the tool’s accessing and reproducing of copyright-protected works as part of its training datasets, or the generated output that is made public.

Developers of generative AI tools should be aware of copyright considerations in relation to the text and data those tools ingest for their training datasets. The type of outputs that the tool is designed to generate, which may be the ultimate purpose of the tool, might have a bearing on whether, and to what extent, the tool’s use of any copyright-protected text and data might be found to be infringing. Where there has been no other authorization to access and use the copyright-protected text and data, this determination will generally involve a fair dealing assessment.

Users of these tools are often not their developers. For users who publish outputs from generative AI tools, transparency about which tools were used and how they were used, along with good record-keeping, may therefore be useful voluntary measures. Such measures can assist in defending against claims that those outputs are infringing, and thereby mitigate legal risk.

Text and Data Mining

Text and data mining (TDM) is an important research tool, incorporating the latest technology to save time and expense in gathering and analyzing data for research purposes. There is a broad range of texts and datasets accessed via TDM, and not all such sources, nor the ultimate uses of the research outputs, give rise to copyright concerns.

Copyright licenses from rights-holders that govern TDM activities may be beneficial in certain cases where the use of TDM would not reasonably be covered under fair dealing or any other exception. However, to the extent that fair dealing is applicable to TDM activities, it is important to note that “the availability of a license is not relevant to deciding whether a dealing has been fair” (CCH Canadian Ltd. v. Law Society of Upper Canada, [2004] 1 S.C.R. 339 at paragraph 70).

As the use of generative AI tools broadens and grows, it will be increasingly important to understand the impacts of any limitations placed on the material that the training datasets for those tools can draw upon. If the datasets are not appropriately current and drawn from appropriately diverse sources, the output could be outdated or biased.

In any discussion regarding adding clarity around copyright and TDM in Canada, “clarity” is the operative word. A broad range of cases of TDM could be reasonably considered to be fair dealings, based on the purpose of the ultimate use of the outputs derived from the TDM activity, in conjunction with the other factors of a fair dealing analysis.

There are always (or should always be) concerns about imposing strict limitations in relation to activities that have such a broad range of potential approaches and uses. To establish such strict parameters in the statute might not appropriately allow for new or unforeseen approaches or uses. It would be unfortunate if any such bright lines led to a result other than what a proper fair dealing analysis might determine.

Acknowledging that not all uses of outputs derived from TDM would be fair dealing, it would be useful to make clear to the TDM community that fair dealing is available to them, should an objection be raised under the Copyright Act to the TDM activity itself or to how the outputs arising from that activity are made publicly available.

Any such added “clarity” should do nothing to reduce or limit the application of fair dealing in cases of TDM. This added clarity might take the form of a “TDM exception” that would explicitly allow for the use of TDM in certain defined ways that are deemed to be in the public interest. If such a TDM exception were to become part of the Copyright Act, the application of that exception should not be limited by the terms of contracts governing access to content where those terms purport to override terms of the Act.

Regarding imposing obligations for record-keeping and disclosure regarding content accessed, such an approach would be undesirable and impractical. While voluntary record-keeping regarding content accessed might be a good practice, obligatory practices would be difficult to monitor and enforce, and such obligations would only serve to create a potential violation under the Act that is not copyright infringement. As was mentioned earlier, maintaining such records may be useful to the developer in the event of a claim of copyright infringement against any output generated through its generative AI tool, but that is not a compelling reason to make the burden of such record-keeping obligatory.

Regarding other jurisdictions, the US Copyright Office has issued a notice of inquiry regarding artificial intelligence and copyright. This notice of inquiry was issued on 30 August 2023, and the due date for reply comments has been extended to 6 December 2023. The outcome of that inquiry may be a useful supplement to the results of this consultation. Additionally, the European Union has recently agreed upon a new law known as the AI Act.

Authorship and Ownership of Works Generated by AI

There are clearly economic interests that would arise from a determination of authorship or copyright ownership for certain outputs from generative AI tools. In such cases, the determination of who would be the rights-holder in those outputs, and what level of copyright protection would apply for what length of term, might be important to the parties involved. However, the fact that these interests in AI-generated content are important to certain parties is not sufficient to determine whether it would be in the broader public interest to provide any level of statutory protection to this content under copyright law.

One of the foundational purposes of copyright protection is to protect the interests of (human) authors, ensuring that they can receive a just reward for their creative works. This is intended to provide such (human) authors with an incentive to spend the time, effort, skill, and judgment needed to produce those new creative works. Providing this copyright protection to such (human) authors is in the public interest, as it encourages the ongoing creation of new works that benefit the public.

The extent to which the works created by a generative AI tool require such copyright protection to incentivize their creation is much less clear. While it may be in the public interest to ensure that the developers of generative AI tools can benefit from the value of the outputs of those tools, this benefit can readily be derived from user fees, perhaps including royalties, and other terms of use associated with their tools. Copyright protection is not necessary to further incentivize the developers of these tools, and there is no compelling reason to “reward” the users of these tools with copyright protection in their outputs.

In cases where there is significant human contribution to a “work” that also had a significant contribution from a generative AI tool, it would be necessary to establish reasonable thresholds of human skill and judgment, relative to the contribution of the tool, that must be met to reach the level of “human authorship” and thus copyright protection. Effective means of determining whether such thresholds have been reached would also be needed.

Infringement and Liability regarding AI

Assuming no changes to what counts as an infringing work under the Copyright Act, the existing legal tests that are currently applied to decide whether a work authored (entirely) by a human is infringing should be sufficient to determine whether a work generated (in whole or in part) by an AI tool is infringing.

In relation to the outputs from generative AI tools, while it may be good practice for the developers/administrators of generative AI tools to track what copyright-protected content has been accessed and/or copied in the development of the learning dataset, this may be most useful as a defense to copyright infringement. Otherwise, the existing tests for similarities between works in determining whether one infringes upon the other in copyright disputes should be sufficient. The courts are well-equipped to decide these issues.

Regarding providing greater clarity on where liability lies if an AI-generated work is found to be infringing, again, the operative word is “clarity”. The owner/controller of the generative AI tool might reasonably be liable, at least in part, if the outputs of that tool too closely resemble existing copyright-protected works. Similarly, the individuals who made the decision of what prompts to provide the generative AI tool, and whether and how to distribute the outputs from that tool, might also reasonably be liable, at least in part, in cases where those outputs are found to be infringing.

However, given the very broad range of possible circumstances in such cases, it would be important that any such additional “clarity” not limit the range of liability that might apply to any of these parties. There are a number of human decisions at play at all stages of the process, by each of the parties to the process. It is the extent of the connection between those decisions and the specific nature of the alleged infringement that will determine whether, and to what extent, each of the parties involved will be liable for the ultimate distribution of an infringing output. Again, existing approaches to determining and apportioning liability should be sufficient in such cases, and the courts are well-equipped to decide these matters.

Comments and Suggestions

Ultimately, generative AI is a tool to serve human purposes, some of which involve the generation of outputs that on their face might give rise to copyright protection. The statutory copyright regime is in place to provide protection to works where that protection benefits the broader public interest. Before the choice is made to expand or modify copyright protection of works to accommodate the use of generative AI tools to create outputs that are published, it should be properly determined whether the existing rights, protections and exceptions under the Copyright Act, and the underlying goals of providing them, are sufficient as they stand to address any issues that arise.