Pynchon works among set to train AI systems

rich richard.romeo at gmail.com
Fri Sep 29 15:17:45 UTC 2023


FYI

https://www.theatlantic.com/technology/archive/2023/09/books3-database-generative-ai-training-copyright-infringement/675363/

This summer, I acquired a data set of more than 191,000 books that were
used without permission to train generative-AI systems by Meta, Bloomberg,
and others. I wrote in *The Atlantic* about
<https://www.theatlantic.com/technology/archive/2023/08/books3-ai-meta-llama-pirated-books/675063/>
how the data set, known as “Books3,” was based on a collection of pirated
ebooks, most of them published in the past 20 years.


   - Against the Day
   - Al Límite (Spanish Edition)
   - Bleeding Edge: A Novel
   - Inherent Vice
   - La subasta del lote 49 (Andanzas) (Spanish Edition)
   - Mason & Dixon
   - Mason & Dixon
   - Slow Learner
   - The Crying of Lot 49
   - Vente à la criée du lot 49
   - Vicio propio
   - Vineland


More information about the Pynchon-l mailing list