Stable Diffusion’s creator releases new open-source language model
Joining the likes of Google and OpenAI, Stability AI has released a new open-source language model, StableLM. The model can generate text and code and will support a range of downstream applications, the company said.
The company, which last year released the text-to-image generative artificial intelligence model Stable Diffusion, said that its new language model is free for both commercial and research purposes.
With the new model, Stability AI also seeks to demonstrate that small models can deliver high performance with appropriate training. The alpha version of StableLM is currently available in 3-billion- and 7-billion-parameter versions; Stability AI will follow up with 15-billion- and 65-billion-parameter models in the coming days. StableLM is trained on an experimental dataset built on The Pile but three times larger, with 1.5 trillion tokens of content. The Pile contains data from sources such as Wikipedia, Stack Exchange, and PubMed.
“The richness of this dataset gives StableLM surprisingly high performance in conversational and coding tasks, despite its small size of 3 to 7 billion parameters (by comparison, GPT-3 has 175 billion parameters),” the company said in its blog.
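For developers who want to experiment with the alpha weights, loading a checkpoint through the Hugging Face transformers library might look like the following minimal sketch. The model ID is an assumption based on Stability AI's naming conventions and may differ from the published checkpoints.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hugging Face Hub ID; the actual published name may differ.
model_id = "stabilityai/stablelm-base-alpha-7b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to fit the 7B model in less memory
    device_map="auto",          # spread layers across available devices (needs `accelerate`)
)

prompt = "The future of open-source language models is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```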
With StableLM, Stability AI expands on the open-source language models it has already developed with the non-profit EleutherAI. The company said the goal of models like StableLM is to make AI technology 'transparent, accessible, and supportive'. Its Stable Diffusion model was likewise made available to all through a public demo, a software beta, and a full download of the model, which developers leveraged to build a range of integrations.
Along with StableLM, Stability AI is also releasing a set of research models that are 'instruction fine-tuned'. These fine-tuned models use a combination of five open-source datasets for conversational agents: Alpaca, GPT4All, Dolly, ShareGPT, and HH.
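Prompting an instruction-tuned variant typically differs from prompting the base model. The sketch below shows one plausible way to query a tuned checkpoint; both the model ID and the chat-style special tokens are assumptions about Stability AI's conventions, not confirmed details from the release.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hub ID for the instruction-tuned variant.
model_id = "stabilityai/stablelm-tuned-alpha-7b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Chat-style prompt with system/user/assistant markers; treat the exact
# token strings as an assumption about the tuned checkpoints' format.
prompt = (
    "<|SYSTEM|>You are a helpful and harmless AI assistant.\n"
    "<|USER|>Write a short poem about open-source software.\n"
    "<|ASSISTANT|>"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```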
This month, San Francisco-based Databricks released Dolly and Dolly 2.0, its large language models. Dolly 2.0 was fine-tuned on a human-generated instruction-following dataset crowdsourced among Databricks employees. The company claims it is the first open-source, instruction-following LLM fine-tuned on a freely available dataset, and said the model can be used for commercial applications without paying for API access or sharing data with third parties.