

MosaicML debuts inference service to make generative AI deployment affordable
source link: https://venturebeat.com/ai/mosaicmls-inference-service-to-make-generative-ai-affordable/


California-based MosaicML, a provider of generative AI infrastructure, has launched a fully-managed inference service to help enterprises easily and affordably deploy generative AI models.
The offering comes as demand for large language models (LLMs) continues to grow across industries. According to MosaicML, the service makes it possible to serve LLMs at up to 15 times lower cost than comparable services on the market.
The launch expands MosaicML’s capabilities, making it a complete tool for generative AI training and deployment. Prior to this, the company had largely focused on providing the software infrastructure for training generative AI models.
MosaicML inference: How does it help?
Given the rise of LLMs like ChatGPT, enterprises have grown eager to implement generative AI capabilities in their applications and products. However, owing to privacy challenges (data flowing to a third party) and the high costs involved in building and deploying such models, the task has not exactly been a cakewalk.
With the new inference service, MosaicML is simplifying deployment by giving enterprises the option to query either their own custom-built LLMs or a curated selection of open-source models, including Instructor-XL, Databricks' Dolly, GPT-NeoX and MosaicML's foundation series models.
At its core, the service includes two tiers: starter and enterprise. The starter tier offers open-source models curated and hosted by MosaicML as API endpoints, giving teams an easy starting point for adding generative AI to applications; the models can be used as is.
The enterprise tier goes a step further, allowing teams to deploy any model they want, including custom ones developed for specific use cases, inside their own virtual private cloud (VPC). This way, inference data never leaves the secured environment of the user's infrastructure, ensuring full privacy and security.
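For teams evaluating the hosted starter tier, the interaction model is essentially a REST call to a managed completion endpoint. Below is a minimal sketch of what such a call might look like; the endpoint URL, authentication header, request fields and response schema are assumptions for illustration and may differ from MosaicML's actual API.

import requests

# Hypothetical endpoint and payload layout; the real MosaicML Inference API
# (endpoint path, auth scheme, field names) may differ.
API_URL = "https://inference.example.com/v1/completions"
API_KEY = "YOUR_API_KEY"

def complete(prompt: str, model: str = "mpt-7b-instruct", max_tokens: int = 128) -> str:
    # POST the prompt to a hosted model endpoint and return the generated text.
    response = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"model": model, "prompt": prompt, "max_new_tokens": max_tokens},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()["outputs"][0]

if __name__ == "__main__":
    print(complete("Summarize the benefits of managed LLM inference in one sentence."))

In the enterprise tier, the same kind of endpoint would be served from inside the customer's own VPC, so prompts and completions never transit a third-party service.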
And it saves money
More importantly, thanks to low latency and high hardware utilization, MosaicML Inference can also deploy models at a cost several times lower than comparable offerings.
In a cost assessment, MosaicML said the starter edition of its inference service hosted curated text-completion and embedding models at four times lower cost than OpenAI's offering, while the enterprise tier was found to be 15 times cheaper. All measurements were taken on 40GB NVIDIA A100s with standard 512-token input sequences or 512×512 images, the company added.
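To make the reported multipliers concrete, here is a back-of-the-envelope sketch; the baseline per-token price and monthly token volume are purely illustrative placeholders, not figures published by MosaicML or OpenAI.

# Back-of-the-envelope comparison using the 4x (starter) and 15x (enterprise)
# savings factors reported by MosaicML. The baseline price and workload size
# below are illustrative placeholders, not published figures.
BASELINE_PRICE_PER_1K_TOKENS = 0.02   # hypothetical comparable-service price, USD
MONTHLY_TOKENS = 500_000_000          # hypothetical workload: 500M tokens per month

baseline_monthly = BASELINE_PRICE_PER_1K_TOKENS * MONTHLY_TOKENS / 1_000
starter_monthly = baseline_monthly / 4      # "four times lower cost" (starter tier)
enterprise_monthly = baseline_monthly / 15  # "15 times cheaper" (enterprise tier)

print(f"baseline:   ${baseline_monthly:,.0f} per month")
print(f"starter:    ${starter_monthly:,.0f} per month")
print(f"enterprise: ${enterprise_monthly:,.0f} per month")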

While MosaicML didn’t share the names of the companies using the new inference service, CEO Naveen Rao did note that customers are already seeing results with the offering.
“A publicly traded customer of ours in the financial compliance space is using the MosaicML inference service to deploy their custom GPT trained from scratch on MosaicML,” Rao told VentureBeat. “This customer experienced north of 10x inference savings compared to alternate providers. TCO (total cost of ownership) for their first model was less than $100,000.”