Amazon is entering generative AI. Rather than building all the models itself, it’s recruiting third parties to host theirs on AWS.
Amazon Bedrock lets developers build generative AI apps using pretrained models from startups like AI21 Labs, Anthropic, and Stability AI. Bedrock offers AWS-trained Titan FMs (foundation models) in a “limited preview.”
“Applying machine learning to the real world—solving real business problems at scale—is what we do best,” AWS VP of generative AI Vasi Philomin told TechCrunch in a phone interview. “Generative AI can reinvent every application.”
AWS’ recent partnerships with generative AI startups and growing investments in generative AI app tech hinted at Bedrock’s launch.
In March, Hugging Face and AWS collaborated to bring Stability AI’s text-generating models to AWS. AWS announced a generative AI accelerator for startups and a partnership with Nvidia to build “next-generation” AI model training infrastructure.
Custom and Bedrock models
Bedrock is Amazon’s most aggressive move into the generative AI market, which Grand View Research estimates could be worth $110 billion by 2030.
Bedrock lets AWS customers use APIs to access AI models from multiple providers, including AWS itself. Amazon hasn’t announced pricing, so details there are unclear. The company stressed that Bedrock is aimed at large customers building “enterprise-scale” AI apps, distinguishing it from rivals such as Replicate, Google Cloud, and Azure.
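Amazon hadn’t published API details at launch, but based on AWS’s usual boto3 conventions, calling a hosted model might look roughly like the sketch below. The service name, model identifier, and request-body fields are assumptions for illustration, not a documented schema:

```python
import json


def build_claude_request(prompt: str, max_tokens: int = 256) -> str:
    """Build a JSON request body for a Claude-style model on Bedrock.

    The field names here are illustrative assumptions, not a published schema.
    """
    return json.dumps({
        "prompt": f"\n\nHuman: {prompt}\n\nAssistant:",
        "max_tokens_to_sample": max_tokens,
    })


def invoke(prompt: str) -> str:
    """Sketch of a model invocation; requires AWS credentials and Bedrock access."""
    import boto3

    client = boto3.client("bedrock-runtime")      # assumed service name
    response = client.invoke_model(
        modelId="anthropic.claude-v1",            # assumed model identifier
        contentType="application/json",
        accept="application/json",
        body=build_claude_request(prompt),
    )
    return json.loads(response["body"].read())["completion"]
```

The appeal of a single API like this is that swapping `modelId` would, in principle, switch between AI21, Anthropic, Stability AI, or Titan models without restructuring the application.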
AWS’ reach, or perhaps revenue sharing, may have enticed vendors to bring their generative AI models to Bedrock. Amazon did not disclose licensing or hosting terms.
AI21 Labs’ multilingual Jurassic-2 family is one of Bedrock’s third-party offerings. Anthropic’s Bedrock model, Claude, handles conversational and general text-processing tasks. Stability AI’s Bedrock-hosted text-to-image models, including Stable Diffusion, can create art, logos, and graphics.
Amazon’s Titan FM family includes a text-generating model and an embedding model, with more likely to come. The text-generating model, similar in kind to OpenAI’s GPT-4 but less capable, can write blog posts, emails, and summaries, and extract information from databases. The embedding model converts words and phrases into semantically meaningful numerical representations, called embeddings. Philomin compared it to the kind of model that powers search on Amazon.com.
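To make “semantically meaningful numerical representations” concrete: once words are mapped to vectors, relatedness can be measured as the angle between them. A toy illustration with made-up three-dimensional vectors (real embedding models produce hundreds or thousands of dimensions):

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm


# Made-up toy embeddings; a real model would produce these values.
cat = [0.9, 0.8, 0.1]
dog = [0.8, 0.9, 0.2]
stock = [0.1, 0.2, 0.9]

print(cosine_similarity(cat, dog))    # high: related meanings
print(cosine_similarity(cat, stock))  # lower: unrelated meanings
```

This is the property a search system exploits: a query embedding is compared against document embeddings, and the closest vectors are returned as results.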
AWS customers can customize any Bedrock model by pointing the service at a few labeled examples in Amazon S3, Amazon’s cloud storage service; as few as 20 are enough, Amazon says. The company claims no customer data is used to train the underlying models.
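Amazon didn’t publish the format those labeled examples take. As a hedged sketch, fine-tuning data for text models is conventionally shipped as JSON Lines (one prompt/completion pair per line), which could then be uploaded to an S3 bucket; the field names below are assumptions:

```python
import json

# Hypothetical labeled examples; Bedrock's actual file format wasn't published.
examples = [
    {"prompt": "Summarize: The quarterly report shows revenue up 12%...",
     "completion": "Revenue rose 12% this quarter."},
    {"prompt": "Summarize: Support tickets fell after the UI redesign...",
     "completion": "The redesign reduced support tickets."},
]

# Write them as JSON Lines, one example per line, ready to upload to S3.
with open("examples.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")
```

Twenty such pairs is a very small dataset by fine-tuning standards, which suggests Bedrock’s customization is lightweight adaptation rather than full retraining.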
“At AWS, we’ve democratized machine learning and made it accessible to anyone who wants to use it,” Philomin said. “Amazon Bedrock is the easiest way to build and scale foundation model-based generative AI applications.”
Given the legal uncertainties surrounding generative AI, one wonders how many customers will bite.
Azure OpenAI Service, Microsoft’s enterprise-focused generative AI model suite, has been successful. Microsoft reported 1,000 Azure OpenAI Service customers in March.
However, plaintiffs are suing OpenAI and Stability AI for using copyrighted data, mostly art, to train generative models. (Generative AI models “learn” to create art, code, and more by “training” on images and text scraped from the web, often indiscriminately.) Another case seeks to determine whether code-generating models trained on licensed code can be commercialized without attribution or credit, and an Australian mayor has threatened to sue OpenAI over ChatGPT’s inaccuracies.
Philomin’s refusal to reveal Amazon’s Titan FM family’s training data didn’t inspire confidence. Instead, he stressed that the Titan models were built to detect and remove “harmful” content in the data AWS customers provide for customization, reject “inappropriate” user input, and filter outputs containing hate speech, profanity, and violence.
ChatGPT shows that even the best filters can be bypassed. Prompt injection attacks against ChatGPT and similar models have been used to write malware, find exploits in open source code, and generate offensive sexist, racist, and misleading content. (Generative AI models amplify biases in their training data, and make things up when they run out of relevant data.)
Philomin dismissed those concerns.
“We’re committed to responsible use of these technologies,” he said. “We’re watching the regulatory landscape… We have many lawyers helping us decide which data to use and which not to use.”
Despite Philomin’s assurances, brands may not want to be on the hook for all the ways these models can go wrong. (In a lawsuit, liability could fall on the AWS customer, on AWS itself, or on the model’s creator.) Individual customers might take the risk, especially if the service is free.
CodeWhisperer, Trainium, and Inferentia2 launch in GA
Amazon today gave developers free, unlimited access to CodeWhisperer, its AI-powered code-generation service, as part of its generative AI push.
The move suggests CodeWhisperer hasn’t caught on. GitHub Copilot, its main competitor, had over a million users as of January, including thousands of enterprise customers. The launch of CodeWhisperer’s Professional Tier aims to close the gap on the corporate side, adding single sign-on with AWS Identity and Access Management integration and higher limits on security vulnerability scanning.
CodeWhisperer launched in late June, through the AWS IDE Toolkit and IDE extensions, as Amazon’s answer to Copilot. Trained on billions of lines of open source code, Amazon’s own codebase, and documentation and code from public forums, it can autocomplete entire functions in languages like Java, JavaScript, and Python from just a comment or a few keystrokes.
CodeWhisperer now supports Go, Rust, PHP, Ruby, Kotlin, C, C++, Shell scripting, SQL, and Scala and highlights and optionally filters the license of functions it suggests that resemble snippets in its training data.
GitHub introduced similar highlighting for Copilot to head off legal challenges; whether it succeeds remains to be seen.
Philomin said tools like these make developers more productive: developers struggle to stay current, and tools like this help them worry less.
In less controversial news, Amazon today launched Elastic Compute Cloud (EC2) Inf2 instances in general availability, powered by its AWS Inferentia2 chips, which were previewed last year at re:Invent. Inf2 instances accelerate AI runtimes by improving throughput and latency for better inference price performance.
Amazon also released Amazon EC2 Trn1n instances powered by AWS Trainium, Amazon’s AI training chip, today. Amazon claims they offer 1600 Gbps of network bandwidth and 20% better performance than Trn1 for large, network-intensive models.
Inf2 and Trn1n compete with offerings from Google and Microsoft, such as Google’s TPU chips for AI training.
“AWS is the best cloud infrastructure for generative AI,” Philomin said. “Customers need the right costs for these models. It’s one reason many customers haven’t produced these models.”
Generative AI demand has reportedly strained Azure. Could the same happen to Amazon? That remains to be seen.