
OpenAI has entered into an agreement to train artificial intelligence (AI) using data from Reddit

OpenAI has entered into an agreement with Reddit to utilize the data from the social news platform for the purpose of training artificial intelligence models.

OpenAI announced in a blog post on their press relations site that their partnership with Reddit will grant them access to real-time, structured, and distinctive content such as posts and replies. This access will enable their tools and models to gain a deeper understanding of the content and effectively present it. OpenAI’s widely used conversational AI will now include Reddit content. OpenAI and Reddit will collaborate to introduce new, unspecified AI-powered features for the benefit of Reddit users and moderators.

OpenAI will also become a Reddit advertising partner.

OpenAI stated in the post that Reddit will build on OpenAI’s platform of AI models to realize its powerful vision. By leveraging LLMs (large language models), ML (machine learning), and AI (artificial intelligence), Reddit can enhance the experience for all of its users.

OpenAI has entered into multiple licensing agreements with content providers, including stock media libraries and news publishers. What sets this one apart is that Sam Altman, the CEO of OpenAI, holds an 8.7% ownership stake in Reddit, making him its third-largest shareholder. Altman also previously served on the company’s board of directors.

OpenAI’s press release attempts to deflect scrutiny by stating that although Altman still holds shares in Reddit, the partnership was led by OpenAI’s Chief Operating Officer (COO) Brad Lightcap and approved by the organization’s independent board of directors.

Reddit has made data licensing agreements a key component of its growth strategy as a publicly traded company.

Reddit disclosed in its IPO prospectus that it has contractual arrangements to license its data to customers, such as Google, with a total value exceeding $200 million. Reddit revealed a remarkable 450% increase in non-advertising revenue over the previous year in its first earnings report as a publicly traded company, primarily due to these agreements.

The value of Reddit stock increased by 11% during after-hours trading in response to the news of the OpenAI agreement.

“The paradox I observe is that, as the proportion of machine-generated content on the internet grows, there is a rising value placed on content created by actual individuals,” stated Steve Huffman, the CEO of Reddit, during the company’s earnings call in March. “Furthermore, we possess close to twenty years of genuine dialogue.”

Reddit’s platform, with its vast collection of over 1 billion posts and 16 billion comments, is a valuable resource for generative AI companies. These companies utilize the platform’s extensive content, including text and images, to train their models and generate new, comparable content.

However, the company may encounter resistance from users who are apprehensive about how it is generating revenue from their data.

An illuminating example is Stack Overflow, the Q&A platform for software developers, which recently entered into a partnership with OpenAI to provide data for training the latter’s models. In protest, some users deleted their highly rated answers. Stack Overflow reinstated the deleted posts and banned those users, asserting that their actions violated its terms of service.

Reddit has already shown its displeasure with an earlier effort to give Reddit users more control over their own data.

Vana, a blockchain-based startup, aims to establish a data “DAO” (decentralized autonomous organization) that lets Reddit users pool their data and collectively decide how the combined data is used or sold. Reddit banned Vana’s subreddit, which was created to discuss the DAO, and accused the company of “exploiting” its data export controls.
