The first open source alternative to OpenAI’s ChatGPT has arrived, but good luck running it on your laptop, or at all.
This week, Philip Wang, the developer known for reverse-engineering closed-source AI systems such as Meta’s Make-A-Video, released PaLM + RLHF, a text-generating model that behaves much like ChatGPT. The system combines PaLM, a large language model from Google, with a technique called Reinforcement Learning from Human Feedback, or RLHF for short, to create a system that can do nearly anything ChatGPT can, from drafting emails to suggesting code.
PaLM + RLHF, however, is not pre-trained. In other words, the system hasn’t been trained on the example data from the web that it needs to actually work. Doing so would mean compiling gigabytes of text for the model to learn from and finding hardware beefy enough to handle the training load. Downloading PaLM + RLHF won’t give you a ChatGPT-like experience out of the box.
Like ChatGPT, PaLM + RLHF is essentially a statistical tool for predicting words. When fed an enormous number of examples from training data, such as posts from Reddit, news articles, and e-books, PaLM + RLHF learns how likely words are to occur based on patterns like the semantic context of the surrounding text.
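To make that idea concrete, here is a minimal sketch of next-word prediction using a small, freely available model. GPT-2 (loaded via Hugging Face’s transformers library) stands in purely for illustration, and the prompt is an arbitrary example; PaLM + RLHF applies the same principle at a vastly larger scale:

```python
# Minimal sketch of statistical next-word prediction with a small
# causal language model (GPT-2 as a stand-in for a far larger model).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "Machine learning is a form of"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # (batch, sequence_length, vocab_size)

# The model's probability distribution over every possible next token,
# taken at the last position in the prompt.
probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(probs, k=5)
for p, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(token_id)!r}: {p:.3f}")
```

The model assigns a probability to every word in its vocabulary; text generation is just repeatedly sampling from that distribution and appending the result.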
Reinforcement Learning from Human Feedback, a technique designed to better align language models with what users want them to accomplish, is the secret sauce that ChatGPT and PaLM + RLHF share. RLHF involves fine-tuning a language model, PaLM in this case, on a dataset of prompts (such as “Explain machine learning to a six-year-old”) paired with what human volunteers expect the model to say (such as “Machine learning is a form of AI…”). Those prompts are then fed to the fine-tuned model, which generates several responses each, and the volunteers rank the responses from best to worst. Finally, the rankings are used to train a “reward model” that takes the original model’s responses and sorts them in order of preference, filtering for the top answers to a given prompt.
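As a rough illustration of that last step, here is a minimal sketch of reward-model training in PyTorch. Everything here is simplified and hypothetical: it assumes each (prompt, response) pair has already been encoded into a fixed-size vector, and the class names, shapes, and random “data” are placeholders for illustration, not PaLM + RLHF’s actual code:

```python
# Sketch of the RLHF reward-model step: learn to score responses so that
# ones humans ranked higher get higher rewards than ones ranked lower.
import torch
import torch.nn as nn

class RewardModel(nn.Module):
    """Maps an encoded (prompt, response) pair to a single scalar reward."""
    def __init__(self, hidden_size: int = 768):
        super().__init__()
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, encoded_pair: torch.Tensor) -> torch.Tensor:
        return self.head(encoded_pair).squeeze(-1)

reward_model = RewardModel()
optimizer = torch.optim.Adam(reward_model.parameters(), lr=1e-5)

# Dummy batch: embeddings of a response the volunteers preferred ("chosen")
# and one they ranked lower ("rejected") for the same prompt.
chosen = torch.randn(8, 768)
rejected = torch.randn(8, 768)

# Pairwise ranking loss: push the chosen response's reward above the
# rejected one's (a Bradley-Terry-style objective common in RLHF work).
optimizer.zero_grad()
loss = -nn.functional.logsigmoid(
    reward_model(chosen) - reward_model(rejected)
).mean()
loss.backward()
optimizer.step()
```

Once trained, the reward model’s scores serve as the reinforcement signal used to further tune the language model itself.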
Gathering the training data is expensive, and training itself isn’t cheap either. PaLM has 540 billion parameters, the parts of the language model learned from the training data. A 2020 study pegged the cost of developing a text-generating model with only 1.5 billion parameters at as much as $1.6 million. And training the open source model Bloom, which has 176 billion parameters, took three months on 384 Nvidia A100 GPUs, each of which costs thousands of dollars.
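For a sense of where numbers like these come from, here is the back-of-the-envelope arithmetic for Bloom’s training run. The GPU count and duration are the figures above; the hourly price is an assumption chosen for illustration, not a quoted rate:

```python
# Rough training-cost estimate for Bloom. Only the GPU count and duration
# come from the article; the hourly rate below is an illustrative guess
# at what a cloud A100 costs, not an actual published price.
A100_COUNT = 384                 # GPUs used to train Bloom (176B parameters)
TRAINING_DAYS = 90               # roughly three months
ASSUMED_USD_PER_GPU_HOUR = 2.00  # hypothetical per-A100 cloud rate

gpu_hours = A100_COUNT * TRAINING_DAYS * 24
training_cost = gpu_hours * ASSUMED_USD_PER_GPU_HOUR
print(f"{gpu_hours:,} GPU-hours -> ~${training_cost:,.0f} to train")
# 829,440 GPU-hours -> ~$1,658,880 at the assumed rate
```

Even at that generously round rate, the total lands in the same seven-figure territory as the 2020 study’s estimate.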
Running a trained model the size of PaLM + RLHF isn’t trivial either. Bloom requires a dedicated PC with around eight A100 GPUs. And by back-of-the-envelope math, running OpenAI’s text-generating GPT-3, which has around 175 billion parameters, on a single Amazon Web Services instance is estimated to cost about $87,000 per year.
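That $87,000-per-year estimate can be sanity-checked the same way, by working backward to an implied per-GPU-hour price. The eight-GPU instance size here is an assumption borrowed from the Bloom requirement above:

```python
# Rough serving-cost check: what hourly GPU price does the yearly figure
# imply? The instance size is assumed, and the rate is derived, not quoted.
GPUS_PER_INSTANCE = 8        # assumed, mirroring Bloom's requirement
HOURS_PER_YEAR = 24 * 365
YEARLY_COST_USD = 87_000     # the back-of-the-envelope estimate for GPT-3

implied_rate = YEARLY_COST_USD / (GPUS_PER_INSTANCE * HOURS_PER_YEAR)
print(f"Implied price: ~${implied_rate:.2f} per GPU-hour")  # ~$1.24
```

The point is less the exact dollar figure than the shape of the problem: serving a model this large means paying for a rack of high-end GPUs around the clock.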
In a LinkedIn post about PaLM + RLHF, AI researcher Sebastian Raschka notes that scaling up the necessary dev workflows could prove a challenge as well. “Even if someone gives you 500 GPUs to train this model, you still need to deal with infrastructure and have a software framework that can handle that,” he said. “Of course that’s possible, but right now it takes a lot of work (we’re working on frameworks to make it easier, but it’s still not easy).”
All of which is to say that PaLM + RLHF isn’t going to replace ChatGPT today, unless a well-funded venture (or person) goes to the trouble of training it and making it available to the general public.
In better news, other efforts to replicate ChatGPT are progressing at a fast clip, including one led by the research group CarperAI. In collaboration with the open AI research organization EleutherAI and the startups Scale AI and Hugging Face, CarperAI plans to release the first ready-to-run, ChatGPT-like AI model trained with human feedback.
An effort to replicate ChatGPT using the latest machine learning techniques is also underway at LAION, the organization that provided the initial dataset used to train Stable Diffusion. Beyond writing emails and cover letters, LAION’s ambitious goal is to build an “assistant of the future” that “does meaningful work, uses APIs, dynamically researches information, and much more.” It’s in the early stages, but a GitHub page with resources for the project went live a few weeks ago.