Ziaul Kamal’s Post

Ziaul Kamal

Coder Enthusiast

Pulumi Templates for GenAI Stacks: Pinecone, LangChain First

To build a generative AI application, you typically need at least two components to start with: a large language model (LLM) and a vector data store. You probably need some sort of frontend component as well, such as a chatbot. Organizations jumping into the GenAI space now face an orchestration challenge: moving these components from the developer’s laptop to the production environment can be error-prone and time-consuming.

To ease deployments, Infrastructure as Code (IaC) software provider Pulumi has introduced “providers,” or templates, for two essential GenAI tools: the Pinecone vector database and the LangChain framework for building LLM-powered applications.

“We find a lot of the tools out there, like LangChain, are great for local development. But then when you want to go into production, it’s left as a DIY exercise,” said Joe Duffy, CEO and co-founder of Pulumi, in an interview with TNS. “And it’s very challenging because you want to architect for infinite scale so that as you see success with your application, you’re able to scale to meet that demand. And that’s not very easy to do.”

Specifically, Pulumi supports the serverless version of Pinecone on AWS, which was unveiled in January, while support for LangChain comes through LangServe, deployed on Amazon ECS, AWS’s container management service. The two templates join a portfolio that covers over 150 cloud and SaaS providers, including many others used in the GenAI space, such as Vercel’s Next.js for the frontend and Apache Spark. In addition to the templates themselves, Pulumi has also mapped out a set of reference architectures that use Pinecone and LangChain.

How to Build a GenAI Stack Using IaC

The idea is that an AI professional, who may not have operations experience, can define and orchestrate an ML stack with Pulumi, using Python or another language. As an IaC solution, Pulumi provides a way to define infrastructure declaratively. Unlike other IaC approaches, Pulumi lets developers build out their environment in any one of a number of general-purpose programming languages, such as Python, Go, Java and TypeScript. Pulumi’s deployment engine can then provision the defined environment and even check that the operational state stays in sync with the defined state (see the short sketch at the end of this post).

The GenAI reference architectures have been designed with best practices in mind, Duffy said. “A lot of the challenge is how to make this scalable: scalable across regions and scalable across subnets and networks. And so this blueprint is built for configurable scale.”

This is not Pulumi’s first foray into managing AI infrastructure. The company has already developed modules for AWS SageMaker and Microsoft’s Azure OpenAI service. There is also a blueprint for deploying an LLM from Hugging Face on Docker, Azure, or Runpod. Of course, the company plans to expand the roster further. “We’re seeing a lot...
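Because Pulumi programs are ordinary code, the declarative model described above can be shown in a few lines of Python. The sketch below is not Pulumi’s published Pinecone or LangServe template; it is a minimal, illustrative example using the general-purpose pulumi and pulumi_aws packages, declaring a container registry and an ECS cluster of the kind a LangServe deployment could run against. The resource names are made up for illustration.

    # Minimal illustrative Pulumi program (Python). Not the official GenAI template;
    # it only demonstrates the declare-then-provision model described in the post.
    import pulumi
    import pulumi_aws as aws

    # Declare a container registry to hold a LangServe application image.
    # The resource name "langserve-repo" is illustrative.
    repo = aws.ecr.Repository("langserve-repo")

    # Declare an ECS cluster that a LangServe service could be deployed onto.
    cluster = aws.ecs.Cluster("langserve-cluster")

    # Export identifiers so other stacks (e.g., a frontend) can reference them.
    # Running `pulumi up` provisions these resources and records their state,
    # so drift between the defined and actual environment can be detected.
    pulumi.export("repositoryUrl", repo.repository_url)
    pulumi.export("clusterArn", cluster.arn)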

Adam Burges

Sales Partner for Companies with a Proven Sales Process

8mo

Excited to see how this will simplify GenAI stack deployments!
