More components
Plugboard's Component objects can run anything you can code in Python. This includes:
- Using your own or third-party Python packages;
- Making external calls to APIs, e.g. data sources or hosted models;
- Running shell commands on your own machine, for example to execute third-party binaries that you want to integrate.
It even ships with some pre-built components in plugboard.library to help you with common tasks.
Info
Plugboard was originally built to help data scientists working on industrial process simulations. Python provides a familiar environment for integrating the different parts of a simulation, for example combining the output of a traditional process control simulation with a machine-learning model.
In this tutorial we'll build a model to process data through an LLM and showcase some different components along the way.
Using an LLM in Plugboard
We're going to build a model that loads rows of data from a CSV and then uses an LLM to extract information about the geographical place referred to in each row. We'll then query an API to get the latest weather for each location.
graph LR;
FileReader(load-text)-->LLMChat(llm);
LLMChat(llm)-->WeatherAPI(weather);
WeatherAPI(weather)---->FileWriter(save-results);
LLMChat(llm)---->FileWriter(save-results);
Loading and saving data
In previous tutorials we wrote our own components for reading/writing files. Here we are going to use the built-in FileReader and FileWriter components. These are much more useful for building practical models, as they can access a variety of file formats both locally and in cloud storage.
load_text = FileReader(name="load-text", path="input.csv", field_names=["text"])
save_output = FileWriter(
name="save-results",
path="output.csv",
field_names=["location", "temperature", "wind_speed"],
)
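For this model, input.csv is assumed to hold a single text column with one free-text description per row, along these lines (contents illustrative; depending on how FileReader maps field_names to columns, the header row may be optional):

text
"The iron tower overlooking the Seine in Paris."
"The highest mountain in Japan, southwest of Tokyo."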
Sending the data to an LLM
We're going to use the LLMChat component to access OpenAI:
llm = LLMChat(
name="llm",
system_prompt="Identify a geographical location from the input and provide its latitude and longitude",
    response_model=Location,  # (1)!
    expand_response=True,  # (2)!
)
1. LLMChat can use structured output to process the LLM response into a known format. Here we define a Pydantic model that specifies everything we're expecting back.
2. Setting expand_response=True will unpack location, latitude and longitude into separate outputs on the component.
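The Location model itself isn't shown above; here's a minimal sketch of what it might look like, assuming a plain Pydantic model whose fields match the component's outputs (the field descriptions are illustrative):

from pydantic import BaseModel, Field

class Location(BaseModel):
    """Structured output we expect back from the LLM."""

    location: str = Field(description="Name of the geographical place")
    latitude: float = Field(description="Latitude in decimal degrees")
    longitude: float = Field(description="Longitude in decimal degrees")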
Info
To run this tutorial you'll need an API key for OpenAI. Set the OPENAI_API_KEY environment variable to provide it to the model.
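If you'd rather set the key from Python during a quick local experiment than in your shell, a sketch (placeholder value; avoid hard-coding real keys):

import os

os.environ["OPENAI_API_KEY"] = "<your-api-key>"  # must be set before the model runs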
Since LLMChat is based on LlamaIndex, you can even try reconfiguring it to use a different LLM.
Querying a weather API
We can now define a component to query a weather API and get the current temperature and wind speed for a given location.
class WeatherAPI(Component):
    """Get current weather for a location."""

    io = IO(inputs=["latitude", "longitude"], outputs=["temperature", "wind_speed"])

    def __init__(self, **kwargs: _t.Unpack[ComponentArgsDict]) -> None:
        super().__init__(**kwargs)
        # Reuse a single async HTTP client across all steps
        self._client = httpx.AsyncClient()

    async def step(self) -> None:
        response = await self._client.get(
            "https://api.open-meteo.com/v1/forecast",
            params={
                "latitude": self.latitude,
                "longitude": self.longitude,
                "current": "temperature_2m,wind_speed_10m",
            },
        )
        try:
            response.raise_for_status()
        except httpx.HTTPStatusError as e:
            # Skip this step on a failed request, leaving the outputs unchanged
            print(f"Error querying weather API: {e}")
            return
        data = response.json()
        self.temperature = data["current"]["temperature_2m"]
        self.wind_speed = data["current"]["wind_speed_10m"]
weather = WeatherAPI(name="weather")
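To see the shape of the payload that step() parses, you can call the same Open-Meteo endpoint directly. A standalone sketch, with illustrative coordinates:

import asyncio
import httpx

async def main() -> None:
    async with httpx.AsyncClient() as client:
        response = await client.get(
            "https://api.open-meteo.com/v1/forecast",
            params={
                "latitude": 48.86,  # Paris, for illustration
                "longitude": 2.35,
                "current": "temperature_2m,wind_speed_10m",
            },
        )
        response.raise_for_status()
        # The fields read in step() live under the "current" key
        print(response.json()["current"])

asyncio.run(main())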
Putting it all together
As usual, we can link all our components together in a LocalProcess and run them as follows:
connect = lambda source, target: AsyncioConnector(
    spec=ConnectorSpec(source=source, target=target)
)
process = LocalProcess(
components=[load_text, llm, weather, save_output],
connectors=[
connect("load-text.text", "llm.prompt"),
connect("llm.latitude", "weather.latitude"),
connect("llm.longitude", "weather.longitude"),
connect("llm.location", "save-results.location"),
connect("weather.temperature", "save-results.temperature"),
connect("weather.wind_speed", "save-results.wind_speed"),
],
)
async with process:
await process.run()
Check out the output.csv file to see all of the collected model output.
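Each row of output.csv corresponds to one input row, with the columns set in field_names. It will look something like this (values illustrative):

location,temperature,wind_speed
Paris,14.2,11.3
Mount Fuji,3.8,22.5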