by Theleaders-Online | January 4, 2024 2:00 pm
REDWOOD CITY, Calif., Jan. 4, 2024 /PRNewswire/ -- FriendliAI, a leader in inference serving for generative AI, today announced the launch of Friendli Serverless Endpoints, a service for accessible development with generative AI models. The service removes the technical barriers of managing the underlying infrastructure, putting the power of cutting-edge generative AI models directly into the hands of developers, data scientists, and businesses of all sizes.
Friendli Serverless Endpoints
“Building the future of generative AI requires democratizing access to the technology,” says Byung-Gon Chun, CEO of FriendliAI. “With Friendli Serverless Endpoints, we’re removing the complicated infrastructure and GPU optimization hurdles that hold back innovation. Now, anyone can seamlessly integrate state-of-the-art models like Llama 2 and Stable Diffusion into their workflows at low costs and high speeds, unlocking incredible possibilities for text generation, image creation, and beyond.”
Users can seamlessly integrate open-source generative AI models into their applications with granular control at the per-token or per-step level, enabling need-specific resource usage optimizations. Friendli Serverless Endpoints comes pre-loaded with popular models like Llama 2, CodeLlama, Mistral, and Stable Diffusion.
Friendli Serverless Endpoints provides per-token billing at the lowest price on the market: $0.2 per million tokens for the Llama 2 13B model and $0.8 per million tokens for the Llama 2 70B model. It also responds to queries with 2-4x lower latency than other leading solutions that use vLLM, ensuring a smooth and responsive generative AI experience. This pricing and speed are achieved through the company’s Friendli Engine, an optimized serving engine that reduces the number of GPUs required for serving by up to 6-7x compared to traditional solutions.
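To make the per-token pricing concrete, the short Python sketch below estimates a bill from the two prices quoted in this announcement. It is an illustration only, not FriendliAI code: the model labels and the `estimate_cost` helper are hypothetical, and it assumes every token is billed at the same quoted rate, a detail this release does not specify.

```python
# Prices in USD per million tokens, taken from this announcement.
# The dictionary keys are hypothetical labels, not official model identifiers.
PRICE_PER_MILLION_TOKENS = {
    "llama-2-13b": 0.2,
    "llama-2-70b": 0.8,
}


def estimate_cost(model: str, tokens: int) -> float:
    """Return the estimated charge in USD for `tokens` tokens on `model`.

    Assumes a flat rate for all tokens (an assumption, not a documented rule).
    """
    if tokens < 0:
        raise ValueError("tokens must be non-negative")
    return PRICE_PER_MILLION_TOKENS[model] * tokens / 1_000_000


if __name__ == "__main__":
    # Compare both models for a workload of 5 million tokens.
    for name in PRICE_PER_MILLION_TOKENS:
        print(f"{name}: ${estimate_cost(name, 5_000_000):.2f}")
```

Under these assumptions, a workload of 5 million tokens would come to roughly $1 on Llama 2 13B and roughly $4 on Llama 2 70B.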
For those seeking dedicated resources and custom model compatibility, FriendliAI offers Friendli Dedicated Endpoints through cloud-based dedicated GPU instances, as well as Friendli Container through Docker. This flexibility ensures the perfect solution for a variety of generative AI ambitions.
“We’re on a mission to make open-source generative AI models fast and affordable,” says Chun. “The Friendli Engine, along with our new Friendli Serverless Endpoints, is a game-changer. We’re thrilled to welcome new users and make generative AI more accessible and economical, advancing our mission to democratize generative AI.”
Start Your Generative Journey Today: FriendliAI is committed to fostering a thriving ecosystem for generative AI innovation. Visit https://friendli.ai/try-friendli/ to sign up for Friendli Serverless Endpoints and unlock the transformative power of generative AI today.
About FriendliAI
FriendliAI is a leading provider of cutting-edge inference serving for generative AI. Our mission is to empower organizations to leverage the full potential of their generative models with ease and cost-efficiency. Learn more at friendli.ai.
Media Contact
Sujin Oh
+82-2-889-8020
press@friendli.ai
Source URL: https://theleaders-online.com/friendliai-unveils-serverless-endpoints-for-widespread-affordable-access-to-open-source-generative-ai-models/
Copyright ©2026 The Leaders Online unless otherwise noted.