# LangServe
[Release Notes](https://github.com/langchain-ai/langserve/releases)
[Downloads](https://pepy.tech/project/langserve)
[Open Issues](https://github.com/langchain-ai/langserve/issues)
[Discord](https://discord.com/channels/1038097195422978059/1170024642245832774)
> [!WARNING]
> We recommend using LangGraph Platform rather than LangServe for new projects.
>
> Please see the [LangGraph Platform Migration Guide](./MIGRATION.md) for more information.
>
> We will continue to accept bug fixes for LangServe from the community; however, we
> will not be accepting new feature contributions.
## Overview
[LangServe](https://github.com/langchain-ai/langserve) helps developers
deploy `LangChain` [runnables and chains](https://python.langchain.com/docs/expression_language/)
as a REST API.
This library is integrated with [FastAPI](https://fastapi.tiangolo.com/) and
uses [pydantic](https://docs.pydantic.dev/latest/) for data validation.
In addition, it provides a client that can be used to call into runnables deployed on a
server.
A JavaScript client is available
in [LangChain.js](https://js.langchain.com/docs/ecosystem/langserve).
## Features
- Input and Output schemas automatically inferred from your LangChain object, and
enforced on every API call, with rich error messages
- API docs page with JSONSchema and Swagger (insert example link)
- Efficient `/invoke`, `/batch` and `/stream` endpoints with support for many
  concurrent requests on a single server
- `/stream_log` endpoint for streaming all (or some) intermediate steps from your
  chain/agent
- **new** as of 0.0.40, supports `/stream_events` to make it easier to stream without needing to parse the output of `/stream_log` (see the sketch after this list).
- Playground page at `/playground/` with streaming output and intermediate steps
- Built-in (optional) tracing to [LangSmith](https://www.langchain.com/langsmith), just
add your API key (see [Instructions](https://docs.smith.langchain.com/))
- All built with battle-tested open-source Python libraries like FastAPI, Pydantic,
uvloop and asyncio.
- Use the client SDK to call a LangServe server as if it was a Runnable running
locally (or call the HTTP API directly)
- [LangServe Hub](https://github.com/langchain-ai/langchain/blob/master/templates/README.md)
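As an illustration of the `/stream_events` support mentioned above, here is a minimal client-side sketch. It assumes a server like the one in the Sample Application below, serving a chain at `/joke`; the exact event payloads depend on your chain:
```python
import asyncio

from langserve import RemoteRunnable

# Assumes a LangServe server is running locally with a chain at /joke.
joke_chain = RemoteRunnable("http://localhost:8000/joke/")


async def main() -> None:
    # Each event is a dict describing a step (chain start, model stream
    # chunk, etc.), so there is no need to parse the raw /stream_log
    # wire format yourself.
    async for event in joke_chain.astream_events({"topic": "cats"}, version="v1"):
        print(event["event"], "|", event.get("name"))


asyncio.run(main())
```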
## LangGraph Compatibility
LangServe is designed primarily to deploy simple Runnables and work with well-known primitives in langchain-core.
If you need a deployment option for LangGraph, you should instead look at [LangGraph Cloud (beta)](https://langchain-ai.github.io/langgraph/cloud/), which is
better suited for deploying LangGraph applications.
## Limitations
- Client callbacks are not yet supported for events that originate on the server
- Versions of LangServe <= 0.2.0 will not generate OpenAPI docs properly when using Pydantic V2, as FastAPI does not support [mixing pydantic v1 and v2 namespaces](https://github.com/tiangolo/fastapi/issues/10360).
  See the section below for more details. Either upgrade to LangServe >= 0.3.0 or downgrade to pydantic 1.
## Security
- Vulnerability in Versions 0.0.13 - 0.0.15 -- playground endpoint allows accessing
arbitrary files on
server. [Resolved in 0.0.16](https://github.com/langchain-ai/langserve/pull/98).
## Installation
For both client and server:
```bash
pip install "langserve[all]"
```
or `pip install "langserve[client]"` for client code,
and `pip install "langserve[server]"` for server code.
## LangChain CLI
Use the `LangChain` CLI to bootstrap a `LangServe` project quickly.
To use the LangChain CLI, make sure that you have a recent version of `langchain-cli`
installed. You can install it with `pip install -U langchain-cli`.
## Setup
**Note**: We use `poetry` for dependency management. Please follow the poetry [docs](https://python-poetry.org/docs/) to learn more about it.
### 1. Create a new app using the langchain CLI command
```sh
langchain app new my-app
```
### 2. Define the runnable in add_routes. Go to server.py and edit
```python
add_routes(app, NotImplemented)
```
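For example, you might replace the `NotImplemented` placeholder with a real runnable. A minimal sketch, assuming the `langchain-openai` package added in step 3 and the API key set in step 4:
```python
from fastapi import FastAPI
from langchain_openai import ChatOpenAI
from langserve import add_routes

# The generated server.py already defines `app`; it is recreated here
# only to keep the sketch self-contained.
app = FastAPI()

# Expose the chat model at POST /openai/invoke, /batch, /stream, etc.
add_routes(
    app,
    ChatOpenAI(),
    path="/openai",
)
```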
### 3. Use `poetry` to add 3rd party packages (e.g., langchain-openai, langchain-anthropic, langchain-mistralai, etc.)
```sh
poetry add [package-name]  # e.g., `poetry add langchain-openai`
```
### 4. Set up relevant env variables. For example,
```sh
export OPENAI_API_KEY="sk-..."
```
### 5. Serve your app
```sh
poetry run langchain serve --port=8100
```
## Examples
Get your LangServe instances started quickly with the [examples](https://github.com/langchain-ai/langserve/tree/main/examples)
directory.
| Description | Links |
| :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **LLMs** Minimal example that serves OpenAI and Anthropic chat models. Uses async, supports batching and streaming. | [server](https://github.com/langchain-ai/langserve/tree/main/examples/llm/server.py), [client](https://github.com/langchain-ai/langserve/blob/main/examples/llm/client.ipynb) |
| **Retriever** Simple server that exposes a retriever as a runnable. | [server](https://github.com/langchain-ai/langserve/tree/main/examples/retrieval/server.py), [client](https://github.com/langchain-ai/langserve/tree/main/examples/retrieval/client.ipynb) |
| **Conversational Retriever** A [Conversational Retriever](https://python.langchain.com/docs/expression_language/cookbook/retrieval#conversational-retrieval-chain) exposed via LangServe | [server](https://github.com/langchain-ai/langserve/tree/main/examples/conversational_retrieval_chain/server.py), [client](https://github.com/langchain-ai/langserve/tree/main/examples/conversational_retrieval_chain/client.ipynb) |
| **Agent** without **conversation history** based on [OpenAI tools](https://python.langchain.com/docs/modules/agents/agent_types/openai_functions_agent) | [server](https://github.com/langchain-ai/langserve/tree/main/examples/agent/server.py), [client](https://github.com/langchain-ai/langserve/tree/main/examples/agent/client.ipynb) |
| **Agent** with **conversation history** based on [OpenAI tools](https://python.langchain.com/docs/modules/agents/agent_types/openai_functions_agent) | [server](https://github.com/langchain-ai/langserve/blob/main/examples/agent_with_history/server.py), [client](https://github.com/langchain-ai/langserve/blob/main/examples/agent_with_history/client.ipynb) |
| [RunnableWithMessageHistory](https://python.langchain.com/docs/expression_language/how_to/message_history) to implement chat persisted on backend, keyed off a `session_id` supplied by client. | [server](https://github.com/langchain-ai/langserve/tree/main/examples/chat_with_persistence/server.py), [client](https://github.com/langchain-ai/langserve/tree/main/examples/chat_with_persistence/client.ipynb) |
| [RunnableWithMessageHistory](https://python.langchain.com/docs/expression_language/how_to/message_history) to implement chat persisted on backend, keyed off a `conversation_id` supplied by client, and `user_id` (see Auth for implementing `user_id` properly). | [server](https://github.com/langchain-ai/langserve/tree/main/examples/chat_with_persistence_and_user/server.py), [client](https://github.com/langchain-ai/langserve/tree/main/examples/chat_with_persistence_and_user/client.ipynb) |
| [Configurable Runnable](https://python.langchain.com/docs/expression_language/how_to/configure) to create a retriever that supports run time configuration of the index name. | [server](https://github.com/langchain-ai/langserve/tree/main/examples/configurable_retrieval/server.py), [client](https://github.com/langchain-ai/langserve/tree/main/examples/configurable_retrieval/client.ipynb) |
| [Configurable Runnable](https://python.langchain.com/docs/expression_language/how_to/configure) that shows configurable fields and configurable alternatives. | [server](https://github.com/langchain-ai/langserve/tree/main/examples/configurable_chain/server.py), [client](https://github.com/langchain-ai/langserve/tree/main/examples/configurable_chain/client.ipynb) |
| **APIHandler** Shows how to use `APIHandler` instead of `add_routes`. This provides more flexibility for developers to define endpoints. Works well with all FastAPI patterns, but takes a bit more effort. | [server](https://github.com/langchain-ai/langserve/tree/main/examples/api_handler_examples/server.py) |
| **LCEL Example** Example that uses LCEL to manipulate a dictionary input. | [server](https://github.com/langchain-ai/langserve/tree/main/examples/passthrough_dict/server.py), [client](https://github.com/langchain-ai/langserve/tree/main/examples/passthrough_dict/client.ipynb) |
| **Auth** with `add_routes`: Simple authentication that can be applied across all endpoints associated with app. (Not useful on its own for implementing per user logic.) | [server](https://github.com/langchain-ai/langserve/tree/main/examples/auth/global_deps/server.py) |
| **Auth** with `add_routes`: Simple authentication mechanism based on path dependencies. (Not useful on its own for implementing per user logic.) | [server](https://github.com/langchain-ai/langserve/tree/main/examples/auth/path_dependencies/server.py) |
| **Auth** with \`add_routes\`: Implement per user logic and auth for endpoints that use per request config modifier. (**Note**: At the moment, does not integrate with OpenAPI docs.) | [server](https://github.com/langchain-ai/langserve/tree/main/examples/auth/per_req_config_modifier/server.py), [client](https://github.com/langchain-ai/langserve/tree/main/examples/auth/per_req_config_modifier/client.ipynb) |
| **Auth** with `APIHandler`: Implement per user logic and auth that shows how to search only within user owned documents. | [server](https://github.com/langchain-ai/langserve/tree/main/examples/auth/api_handler/server.py), [client](https://github.com/langchain-ai/langserve/tree/main/examples/auth/api_handler/client.ipynb) |
| **Widgets** Different widgets that can be used with playground (file upload and chat) | [server](https://github.com/langchain-ai/langserve/tree/main/examples/widgets/chat/tuples/server.py) |
| **Widgets** File upload widget used for LangServe playground. | [server](https://github.com/langchain-ai/langserve/tree/main/examples/file_processing/server.py), [client](https://github.com/langchain-ai/langserve/tree/main/examples/file_processing/client.ipynb) |
## Sample Application
### Server
Here's a server that deploys an OpenAI chat model, an Anthropic chat model, and a chain
that uses the Anthropic model to tell a joke about a topic.
```python
#!/usr/bin/env python
from fastapi import FastAPI
from langchain.prompts import ChatPromptTemplate
from langchain.chat_models import ChatAnthropic, ChatOpenAI
from langserve import add_routes

app = FastAPI(
    title="LangChain Server",
    version="1.0",
    description="A simple api server using Langchain's Runnable interfaces",
)

add_routes(
    app,
    ChatOpenAI(model="gpt-3.5-turbo-0125"),
    path="/openai",
)

add_routes(
    app,
    ChatAnthropic(model="claude-3-haiku-20240307"),
    path="/anthropic",
)

model = ChatAnthropic(model="claude-3-haiku-20240307")
prompt = ChatPromptTemplate.from_template("tell me a joke about {topic}")
add_routes(
    app,
    prompt | model,
    path="/joke",
)

if __name__ == "__main__":
    import uvicorn

    uvicorn.run(app, host="localhost", port=8000)
```
If you intend to call your endpoint from the browser, you will also need to set CORS headers.
You can use FastAPI's built-in middleware for that:
```python
from fastapi.middleware.cors import CORSMiddleware

# Set all CORS enabled origins
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
    expose_headers=["*"],
)
```
### Docs
If you've deployed the server above, you can view the generated OpenAPI docs using:
> If using LangServe <= 0.2.0 and pydantic v2, docs will not be generated for _invoke_, _batch_, _stream_,
> and _stream_log_. See the [Pydantic](#pydantic) section below for more details.
> To resolve, upgrade to LangServe 0.3.0.
```sh
curl localhost:8000/docs
```
Make sure to **add** the `/docs` suffix.
> The index page `/` is not defined by **design**, so `curl localhost:8000` or visiting the URL
> will return a 404. If you want content at `/`, define an endpoint `@app.get("/")`.
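For example, one simple option (a sketch, not required by LangServe) is to redirect the root to the docs page:
```python
from fastapi.responses import RedirectResponse


@app.get("/")
async def redirect_root_to_docs() -> RedirectResponse:
    """Redirect the bare root path to the interactive API docs."""
    return RedirectResponse("/docs")
```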
### Client
Python SDK
```python
from langchain.schema import SystemMessage, HumanMessage
from langchain.prompts import ChatPromptTemplate
from langchain.schema.runnable import RunnableMap
from langserve import RemoteRunnable

openai = RemoteRunnable("http://localhost:8000/openai/")
anthropic = RemoteRunnable("http://localhost:8000/anthropic/")
joke_chain = RemoteRunnable("http://localhost:8000/joke/")

joke_chain.invoke({"topic": "parrots"})

# or async
await joke_chain.ainvoke({"topic": "parrots"})

prompt = [
    SystemMessage(content="Act like either a cat or a parrot."),
    HumanMessage(content="Hello!"),
]

# Supports astream
async for msg in anthropic.astream(prompt):
    print(msg, end="", flush=True)

prompt = ChatPromptTemplate.from_messages(
    [("system", "Tell me a long story about {topic}")]
)

# Can define custom chains
chain = prompt | RunnableMap({
    "openai": openai,
    "anthropic": anthropic,
})

chain.batch([{"topic": "parrots"}, {"topic": "cats"}])
```
In TypeScript (requires LangChain.js version 0.0.166 or later):
```typescript
import { RemoteRunnable } from "@langchain/core/runnables/remote";

const chain = new RemoteRunnable({
  url: `http://localhost:8000/joke/`,
});
const result = await chain.invoke({
  topic: "cats",
});
```
Python using `requests`:
```python
import requests

response = requests.post(
    "http://localhost:8000/joke/invoke",
    json={'input': {'topic': 'cats'}}
)
response.json()
```
You can also use `curl`:
```sh
curl --location --request POST 'http://localhost:8000/joke/invoke' \
  --header 'Content-Type: application/json' \
  --data-raw '{
    "input": {
      "topic": "cats"
    }
  }'
```
## Endpoints
The following code:
```python
...
add_routes(
    app,
    runnable,
    path="/my_runnable",
)
```
adds these endpoints to the server:
- `POST /my_runnable/invoke` - invoke the runnable on a single input
- `POST /my_runnable/batch` - invoke the runnable on a batch of inputs
- `POST /my_runnable/stream` - invoke on a single input and stream the output
- `POST /my_runnable/stream_log` - invoke on a single input and stream the output,
  including output of intermediate steps as it's generated
- `POST /my_runnable/astream_events` - invoke on a single input and stream events as they are generated,
  including from intermediate steps.
- `GET /my_runnable/input_schema` - json schema for input to the runnable
- `GET /my_runnable/output_schema` - json schema for output of the runnable
- `GET /my_runnable/config_schema` - json schema for config of the runnable
These endpoints match
the [LangChain Expression Language interface](https://python.langchain.com/docs/expression_language/interface) --
please reference this documentation for more details.
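For example, you can fetch the schemas with plain GET requests. A minimal sketch, assuming the Sample Application above is running on localhost:8000 with a chain at `/joke`:
```python
import requests

# The schema endpoints are simple GETs that return JSON Schema documents.
input_schema = requests.get("http://localhost:8000/joke/input_schema").json()
print(input_schema)  # e.g., an object schema with a "topic" property

config_schema = requests.get("http://localhost:8000/joke/config_schema").json()
print(config_schema)
```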
## Playground
You can find a playground page for your runnable at `/my_runnable/playground/`. This
exposes a simple UI
to [configure](https://python.langchain.com/docs/expression_language/how_to/configure)
and invoke your runnable with streaming output and intermediate steps.