
AI lab – AI in Action | Episode 03: AI Tokenization
Let’s talk about AI tokenization in this third episode of our AI in Action series. Tokenization is actually pretty interesting, especially if you ever wondered how these fancy AI machines understand the stuff we type and say and produce things when we give them prompts. Next time you're marvelling at an AI-generated text, remember it's all about those tiny tokens, dancing together in a complex symphony of language and prediction.
16 Sep 2024 | 7 min

AI lab TL;DR | Bertin Martens - The Economics of GenAI & Copyright
🔍 In this TL;DR episode, Dr. Bertin Martens (Bruegel) discusses with the AI lab his working paper for the Brussels-based economic think tank on the economic arguments in favour of reducing copyright protection for generative AI inputs and outputs.
* 9:44: Mr Martens intended to say "humans" instead of "machines".
📌 TL;DR Highlights
⏲️ [00:00] Intro
⏲️ [01:21] Q1-Balancing Innovation & Rights: Can the TDM opt-out right hinder innovation and economic growth, and what does it mean as regards the power of copyright holders vs. the potential societal benefits of generative AI?
⏲️ [05:42] Q2-Licensing Impact on EU AI Competitiveness: What are the implications of licensing for genAI as regards competitiveness and quality of models and potential economic disadvantages for EU AI developers?
⏲️ [09:11] Q3-GenAI's Impact on Creative Industries & Economy: Looking at outputs, how could genAI impact the creative industries and the broader economy, and what are your thoughts on how policy should evolve to reflect this?
⏲️ [13:08] Wrap-up & Outro
💭 Q1-Balancing Innovation & Rights
🗣️ Copyright is an economic policy tool to stimulate investment in the production of artwork, and granting an exclusive copyright to an author avoids free riding on that artwork that would undermine the incentives to invest in its production.
🗣️ The optimal scope of copyright protection should balance, on the one hand, the welfare losses from this exclusive right given to an author against the welfare gains for society from stimulating investment in new and innovative productions.
🗣️ Both [copyright] overprotection and underprotection are bad. They will hamper innovation and reduce the economic efficiency of copyright.
🗣️ Generative AI opens up new and much cheaper possibilities to produce new and innovative artwork, and also has applications in a wide variety of other sectors outside the media sector and across the economy.
🗣️ The AI Act and the copyright law in Europe give priority to the private interest of copyright holders over the wider interest of society, and I don't think that's a good thing and we should change that.
💭 Q2-Licensing Impact on EU AI Competitiveness
🗣️ Generative AI models require vast amounts of training data to develop the model and to have a high-quality model. And already today we observe that the largest and most advanced models are running out of high-quality human edited text for model training.
🗣️ There is still sufficient supply of low-quality text data, for instance from social media, or from the transposition of voice data into text, or even from synthetic data. But all these low-quality sources reduce the quality of generative AI models.
🗣️ Imposing copyright licensing requirements on text data for model training will further shrink the available supply of text data for model training, and that will further reduce the quality of these models.
🗣️ Only the biggest tech companies can actually afford to negotiate the licensing fees and pay those fees to copyright holders, while smaller AI startups cannot afford this and are pushed out of the market.
🗣️ Pushing smaller AI startups out of the market is bad for competition, bad for innovation in the AI setting, and this is not the way we want to go.
💭 Q3-GenAI's Impact on Creative Industries & Economy
🗣️ Generally, copyright law worldwide grants copyright only to human authors of artwork, not to machine-produced artwork. With the arrival of generative AI models, however, that has changed, and for the first time in human history, a machine can produce artwork output.
🗣️ From an economic perspective, there is no need to grant copyright to AI-produced artwork because the marginal cost of producing generative AI output is actually very close to zero (...) and the risk of free riding, therefore, is very limited.
🗣️ The human labor that goes into designing a prompt set that you feed into a generative AI model is costly, and this prompt set is human artwork and could indeed receive copyright protection, just like any other human design, text or computer code.
📌 About Our Guest
🎙️ Dr. Bertin Martens | Senior fellow at Bruegel and non-resident research fellow at TILEC, Tilburg University
🌐 Bruegel | Economic Arguments in Favour of Reducing Copyright Protection for Generative AI Inputs and Outputs: https://www.bruegel.org/working-paper/economic-arguments-favour-reducing-copyright-protection-generative-ai-inputs-and
🌐 Bruegel: https://www.bruegel.org
🌐 Tilburg Law & Economics Centre (TILEC): https://www.tilburguniversity.edu/research/institutes-and-research-groups/tilec
🌐 Dr. Bertin Martens: https://www.bruegel.org/people/bertin-martens
Dr. Bertin Martens is a Senior fellow at Bruegel and a non-resident research fellow at the Tilburg Law & Economics Centre (TILEC, Tilburg University). He worked on digital economy issues as a senior economist at the European Commission's Joint Research Centre for over a decade, until April 2022. Before that, he was deputy chief economist for trade policy at the EC.
9 Sep 2024 | 13 min

AI lab TL;DR | Alexander Peukert - Copyright in the Artificial Intelligence Act–A Primer
🔍 In this TL;DR episode, Prof. Dr. Alexander Peukert (Goethe University Frankfurt am Main) discusses his primer on copyright in the EU AI Act with the AI lab.
📌 TL;DR Highlights
⏲️ [00:00] Intro
⏲️ [01:26] Q1-Merging copyright & AI regulation: What challenges arise from merging copyright law and AI regulation? How might this impact legislation, compliance, and enforcement?
⏲️ [06:08] Q2-AI Act copyright targets: Who are the main targets of the AI Act's copyright-related obligations?
⏲️ [09:33] Q3-AI Act copyright obligations: What key copyright-related obligations does the AI Act impose on AI model providers? How should training content summaries and TDM opt-out mechanisms be implemented?
⏲️ [14:44] Wrap-up & Outro
💭 Q1 - Merging Copyright & AI Regulation
🗣️ Any copyright infringement triggers remedies. (...) In the EU AI Act context, it’s very different because the EU AI Act establishes systemic compliance obligations.
🗣️ AI model providers have to put in place a general copyright policy. Whether that policy is sufficient or not is then a question which is pretty difficult to answer and not straightforward.
🗣️ When we merge copyright with the AI regulation, (...) this is also true for the DSA, (...) you have to ask: at what point is a systemic compliance obligation violated? Only then do you have a violation of this AI regulation.
🗣️ The AI Act is primarily enforced by public authorities (...). That might become a challenge for rightholders because they were used to enforce their rights at their will. Now they have to make sure that the [EC or national authorities act].
🗣️ For the first time, (...) public authorities enter the copyright environment to a very significant extent through the EU AI Act.
💭 Q2 - AI Act Copyright Targets
🗣️ The specific copyright obligations are only addressed to general-purpose AI model providers. (...) AI systems that are then built upon it (...), which eventually create the output, are not subject to specific copyright obligations.
🗣️ The EU legislature (...) said: we focus on the [general-purpose AI] models because they are the very basis of all systems, and if we target them (...), then we make sure that any kind of system, generative AI [is] copyright-compliant.
💭 Q3 - AI Act Copyright Obligations
🗣️ The EU AI Act [obliges] AI model providers to program their crawlers, who crawl the Internet, to collect data for [AI] training (...) in a manner that the opt-out of copyright holders is respected.
🗣️ There’s a market for AI training data, which is based on these copyright rules in connection with the EU AI Act.
🗣️ You have to put in place a copyright policy. (...) One potential consequence (...) might be a kind of moderation obligation so that you have to make sure that not only the training (...) but also the eventual output is copyright-compliant.
🗣️ It might become difficult for the [general-purpose] AI model provider to moderate the output of systems that another company has built on [their] model. (...) I see a potential problem in the implementation of these copyright obligations.
🗣️ The [training content] summary need not be granular so that you mention each and every URL that you have mined, (...) it suffices to describe the content in a narrative way. So what kind of databases have you searched or crawled?
🗣️ The [training content] summary (...) is a tool to enable rightholders to figure out whether they were mined and perhaps whether their preventive measures were circumvented and (...) potentially sue for copyright infringement.
📌 About Our Guest
🎙️ Prof. Dr. Alexander Peukert | Full Professor of Civil, Commercial and Information Law at Goethe University Frankfurt am Main
🌐 GRUR International | Copyright in the Artificial Intelligence Act – A Primer: https://academic.oup.com/grurint/article-abstract/73/6/497/7675073 and https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4771976
🌐 Prof. Dr. Alexander Peukert: http://www.jura.uni-frankfurt.de/peukert/ and http://ssrn.com/author=1244916
Alexander Peukert (pronounced "Poikert") has been full professor of civil, commercial and information law at Goethe University Frankfurt am Main since 2009. He studied law and obtained his Dr. iur. (s.c.l.) at the University of Freiburg (1993-1999). After his second state examination (2001), he practiced law in a Berlin law firm specializing in copyright and media law. From 2002 to 2009, he was senior research fellow and head of the U.S. department at the Max Planck Institute for Intellectual Property and Competition Law in Munich.
18 Jul 2024 | 15 min

AI lab TL;DR | Thomas Margoni - Copyright Law & the Lifecycle of Machine Learning Models
🔍 In this TL;DR episode, Professor Thomas Margoni (CiTiP - Centre for IT & IP Law, KU Leuven) discusses copyright law and the lifecycle of machine learning models with the AI lab. The starting point is an article co-authored with Professor Martin Kretschmer (CREATe, University of Glasgow) and Dr Pinar Oruç (University of Manchester), and published in open access in the International Review of Intellectual Property and Competition Law (IIC).
📌 TL;DR Highlights
⏲️ [00:00] Intro
⏲️ [01:26] Q1-Copyright & training data: How does current copyright law affect the training of machine learning models? What insights do your case studies provide?
⏲️ [04:57] Q2-Surprising research findings: What did you learn about copyright law’s impact on machine learning innovation?
⏲️ [08:16] Q3-Policy recommendations: What changes to copyright law do you suggest to support machine learning development and research?
⏲️ [12:50] Wrap-up & Outro
💭 Q1 - Copyright & Training Data
🗣️ It is a complex relationship: machine learning is a very new technology, and copyright is a very old law (...) developed (...) in function of a very different (...) technology.
🗣️ Every time a new technology appears (...), adjustment [of copyright law] is necessary. During this time (...) various interests [and] dynamics are at play.
🗣️ A third interest that is naturally underrepresented (...) is that of users, citizens, people like us, who somehow get lost in this equation based on only two players[: right holders and AI developers].
🗣️ Copyright has always been about the balance between authors and the public[,] between the need to incentivise cultural creation and the need for the public to have access to it.
💭 Q2 - Surprising Research Findings
🗣️ Be careful not to treat different cases following the same rules (...) [it] would lead to unbalanced solutions. (...) Different cases (...) are [now] treated almost entirely the same by EU copyright law.
🗣️ Text and data mining: (...) could lead to identifying (...) the spread of a pandemic (...) This is a public-interest form of learning that can benefit the entire humanity. This type of activity should not be regulated by copyright.
💭 Q3 - Policy Recommendations
🗣️ The EU (...) developed a legal framework whereby text and data mining and machine learning are regulated the same. (...) Perhaps one of the answers (...) to creat[e] more (...) breathing space, particularly for scientific research, is to treat them differently.
🗣️ The protection of research, freedom of scientific research and artistic expression are very important. (...) We have to design rules that do not prevent scientists [and] citizens (...) to experiment with these tools.
🗣️ Right now, we regulate everything at the input level. (...) We have to move our regulatory focus: look more at the input and output data.
🗣️ Due to the scale of AI applications, there is a danger raised by rightholders and some artists [of a] substitution effect (...) with a specific artist, school or genre. This (...) is a (...) new question, and (...) remuneration models (...) could be an (...) avenue to explore.
📌 About Our Guest
🎙️ Professor Thomas Margoni | Research Professor of IP Law at the Faculty of Law and Criminology and member of the Board of Directors of the Centre for IT & IP Law (CiTiP), KU Leuven
🌐 International Review of IP & Competition Law (IIC) - Copyright Law and the Lifecycle of Machine Learning Models: https://doi.org/10.1007/s40319-023-01419-3
🌐 Prof. Thomas Margoni: https://www.law.kuleuven.be/citip/en/staff-members/staff/00137042
Dr Thomas Margoni is a Research Professor of Intellectual Property Law at the Faculty of Law and Criminology of KU Leuven in Belgium. He is also a member of the Board of Directors of the Centre for IT & IP Law (CiTiP, KU Leuven).
11 Jul 2024 | 13 min

AI lab - AI in Action | Episode 02: AI Terminology
Let’s talk about AI terminology in the second episode of our AI in Action series. The term “AI” gets thrown around more than a beach ball at a summer picnic, and it’s not always clear what people are talking about. “AI” is to tech what “food” is to a grocery store – sure, it covers a lot, but a hot dog and a filet mignon are pretty darn different when it comes to what they do to your insides. AI is a layered beast, like a high-tech set of Russian nesting dolls. You crack open the biggest one, and bam! There’s another one inside. Read more here: https://informationlabs.org/ai-lab-ai-in-action-episode-02-ai-terminology/
2 Jul 2024 | 10 min

AI lab TL;DR | Elisa Giomi - The Unacknowledged AI Revolution in the Media & Creative Industries
🔍 In this TL;DR episode, Dr Elisa Giomi, Associate Professor at the Roma Tre University and Commissioner of the Italian Communications Regulatory Authority (AGCOM), discusses her recent contribution in Intermedia, the journal of the International Institute of Communications (IIC), titled “The (almost) unacknowledged revolution of AI in the media and creative industries”, with the AI lab.
📌 TL;DR Highlights
⏲️ [00:00] Intro
⏲️ [01:15] Q1 - AI’s impact vs. past revolutions: How does AI’s impact on media and creative industries compare to historical technological revolutions?
⏲️ [05:22] Q2 - Navigating AI in media: How should we balance AI’s benefits in combating misinformation vs its potential risks?
⏲️ [11:18] Q3 - Balancing copyright & AI: You state that: “[AI] and human intelligence follow [a] not dissimilar logic. So we should not use a double standard to regulate them”. What should a balanced approach to copyright in AI look like?
⏲️ [17:48] Wrap-up & Outro
💭 Q1 - AI’s impact vs. past revolutions
🗣️ The [AI] revolution (...) in the media and creative industries, as many previous ones, will probably be declared a revolution only long after it happened.
🗣️ AI in the media sector[:] Its disruptive effect goes unnoticed (...), [and] the media and creative industries remain under the radar in the public debate, since they are not among the leading adoption fields.
🗣️ Two of the winners of the last Pulitzer Prize for journalism admitted using AI systems in their investigation and getting so many benefits from AI.
🗣️ Why the AI revolution looks like the main technological revolutions of the past? Its ability to divide [and] polarise, the public debate between enthusiasts (...) and radical opponents (...).
💭 Q2 - Navigating AI in media
🗣️ Every technological innovation has been accompanied by a sort of squinting effect which leads to amplifying the distorted uses to the detriment of the more abundant beneficial applications.
🗣️ Demonising AI for fear of its side effects would be as if in the past we had refused to switch from the plough to the tractor for fear that the tractor could pollute or run over people and animals.
🗣️ AI is not only used to produce fake news and misleading content, but also in fact checking and identifying deepfakes. It is used in fighting disinformation.
🗣️ Only by taking into account opportunities and risks at the same time, we will be able to develop a balanced regulation and avoid emergency and radical responses in the wake of moral panics produced by AI misuses.
🗣️ The media (...) are likely to shape our perception of the world and to guide other choices, so they should have been included in the [EU AI Act] high-risk sectors.
💭 Q3 - Balancing copyright & AI
🗣️ I have strong misgivings about the remuneration hypothesis[:] (...) it privileges publishers over any other content producers.
🗣️ I’m not sure having different rules for the human and artificial mind makes sense. My conclusion here is that maybe it’s too early to find a solution to the copyright problems raised by AI.
🗣️ Any balanced resolution must have two starting points: first, a rigorous analysis of the real value chain (...), and second, (...) [a] precise diagnosis. (...) Regulate only when there is a [real] pathology to be healed.
📌 About Our Guest
🎙️ Dr Elisa Giomi | Associate Professor at Roma Tre University & Commissioner of the Italian Communications Regulatory Authority (AGCOM)
𝕏 https://x.com/@elisagiomi
🌐 International Institute of Communications (IIC) - The (Almost) Unacknowledged Revolution of AI in the Media and Creative Industries: https://iicintermedia.org/vol-52-issue-1/the-almost-unacknowledged-revolution-of-ai-in-the-media-and-creative-industries/
🌐 AGCOM - Dr Elisa Giomi: https://www.agcom.it/elisa-giomi
Dr Elisa Giomi is an associate professor at Roma Tre University, Department of Philosophy, Communication and Performing Arts, and a commissioner of AGCOM, the Italian Communications Regulatory Authority. Professor Giomi is the author of a wide array of publications for major Italian and international publishers and peer-reviewed journals.
#AI #ArtificialIntelligence #GenerativeAI
18 Jun 2024 | 18 min

AI lab TL;DR | Derek Slater - What the Copyright Case Against Ed Sheeran Can Teach Us About AI
🔍 In this TL;DR episode, Derek Slater (Proteus Strategies) discusses his recent blog post on the Tech Policy Press website, titled “What the Copyright Case Against Ed Sheeran Can Teach Us About AI”, with the AI lab.
📌 TL;DR Highlights
⏲️ [00:00] Intro
⏲️ [01:11] Q1 - Legal boundaries & creativity: How to define the boundary between protectable expression and unprotectable building blocks in music & other creative fields? What lessons does this offer for generative AI?
⏲️ [05:13] Q2 - Consent vs. enclosure: What is enclosure? How can we balance it with consent in regulating AI tools? What guiding principles should policymakers follow to not stifle innovation & creativity?
⏲️ [09:35] Q3 - Technological impact on art: What is the long-term impact of generative AI on music & artistic expression, as other technological advances ultimately revolutionised creative industries after an initial backlash?
⏲️ [12:18] Wrap-up & Outro
💭 Q1 - Legal boundaries & creativity
🗣️ All creativity builds on the past. All songs are made up of a limited number of notes and chords available to the composers [and] to protect their combination would give Let’s Get It On an impermissible monopoly[, the judge said].
🗣️ Copyright has always allowed certain uses of existing content (...) by drawing lines between protectable expression and unprotectable ideas, facts, and other elements.
🗣️ Rightsholders can demand consent for some uses, but they are not allowed to enclose and cut off the basic building blocks of culture and knowledge.
🗣️ Generative AI: (...) it’s a big statistical analysis of lots and lots of texts to derive rules about syntax and how different concepts are related (...) For music, it’s analysing lots and lots of music to tease out those basic building blocks.
🗣️ [AI training] can’t be reduced to the simplicity of consent (...) because the question is: consent for what? (...) Deriving insights [and] uncopyrightable elements from protectable expression generally can be permissible.
💭 Q2 - Consent vs. enclosure
🗣️ We also recognise downsides to [copyright], (...) meaning the public can no longer freely build on and use it. (...) We’ve always had copyright protection but also limits so that enclosure (...) doesn’t go too far.
🗣️ When is it unethical to stop people from (...) using basic building block[s] of language or music? Because that information, that knowledge, those cultural artefacts, ought to belong to the public.
🗣️ I think from a copyright perspective, the first key principle is: is this protection necessary to encourage creativity (...)? If creativity is already booming, abundant, and would happen anyway (...) then there should not be an issue.
🗣️ When we think about generative AI, these are tools for productivity, for creativity, not for piracy (...). They’re not about simply reusing the works that they were trained on in the outputs. (...) That’s considered a bug, a failure (...) and something to be avoided.
🗣️ When somebody uses [an AI] tool like Suno or Udio to create a new song, that’s in line with copyright’s purpose. (...) It crosses the line (...) where that output is directly substituting, reusing that communicative expression embodied in some specific work.
💭 Q3 - Technological impact on art
🗣️ One way to think about [AI] is sort of like the synthesizer, computer-generated graphics or Photoshop, where, at first, people said, this is not music, [or] art, and over time, it became integrated into artistic processes in a variety of ways.
🗣️ [2023] Oscar winner, ’Everything Everywhere All at Once’, used the generative AI tool Runway to edit one of its famous scenes. Nobody knew that was generative AI at the time. Nobody said ‘Oh, this is a generative AI movie’, but it was part of their artistic process.
🗣️ It’s acknowledged that generative AI is driving an abundance of creativity. (...) So that fundamentally is not at odds with (...) copyright. I think most of the concerns that people have aren’t really copyright problems.
🗣️ A lot of creators are worried about how the benefits of [AI] technology will really be spread. Will they be concentrated among a few big companies or benefit [many], including creators? (...) [Those concerns] demand solutions (...) beyond copyright.
🗣️ As a fan, I feel like we are in a golden era [with AI]. Now, we just need to make really sure those benefits are widely shared.
📌 About Our Guest
🎙️ Derek Slater | Co-Founder of Proteus Strategies
𝕏 https://x.com/@derekslater
🌐 Tech Policy Press blog post: https://www.techpolicy.press/what-the-copyright-case-against-ed-sheeran-can-teach-us-about-ai/
🌐 Proteus Strategies: https://www.proteusstrategies.com/
Derek Slater is a tech policy strategist focused on media, communications, and information policy. He is the co-founder of Proteus Strategies and previously worked at Google and at the Electronic Frontier Foundation on issues related to access to information, content regulation, and online safety.
30 May 2024 | 13 min

AI lab - AI in Action | Episode 01: AI History
We are kickstarting our AI in Action series by diving headfirst into the key milestones that led to the gradual deployment of Artificial Intelligence, or AI for short. You might think it's some shiny new invention, looking at all the recent media coverage about robots taking over your jobs and writing bad poetry. But hold on to your Roomba, because AI has been around longer than your grandma’s pocket calculator. Read more & grab the infographic of this timeline here: https://informationlabs.org/ai-lab-ai-in-action-episode-01-ai-history/
7 May 2024 | 9 min




















