More components
Plugboard's Component objects can run anything you can code in Python. This includes:
- Using your own or third-party Python packages;
- Making external calls to APIs, e.g. data sources or hosted models;
- Running shell commands on your own machine, for example to execute third-party binaries that you want to integrate.
It even ships with some pre-built components in plugboard.library to help you with common tasks.
Info
Plugboard was originally built to help data scientists working on industrial process simulations. Python provides a familiar environment for integrating the different parts of a simulation, for example combining the output of a traditional process control simulation with a machine-learning model.
In this tutorial we'll build a model to process data through an LLM and showcase some different components along the way.
Using an LLM in Plugboard
We're going to build a model that loads rows of data from a CSV and then uses an LLM to extract information about the geographical place referred to in each row. We'll then query an API to get the latest weather for each location.
graph LR;
FileReader(load-text)-->LLMChat(llm);
LLMChat(llm)-->WeatherAPI(weather);
WeatherAPI(weather)---->FileWriter(save-results);
LLMChat(llm)---->FileWriter(save-results);
Loading and saving data
In previous tutorials we wrote our own components for reading/writing files. Here we are going to use the built-in FileReader and FileWriter components. These are much more useful for building practical models, as they can access a variety of file formats both locally and in cloud storage.
load_text = FileReader(name="load-text", path="input.csv", field_names=["text"])
save_output = FileWriter(
name="save-results",
path="output.csv",
field_names=["location", "temperature", "wind_speed"],
)
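For this model, input.csv is assumed to hold a single text column with one free-text description per row, along these lines (contents illustrative; depending on how FileReader maps field_names to columns, the header row may be optional):

text
"The iron tower overlooking the Seine in Paris."
"The highest mountain in Japan, southwest of Tokyo."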
Sending the data to an LLM
We're going to use the LLMChat component to access OpenAI:
llm = LLMChat(
name="llm",
system_prompt="Identify a geographical location from the input and provide its latitude and longitude",
    response_model=Location,  # (1)!
    expand_response=True,  # (2)!
)
1. LLMChat can use structured output to process the LLM response into a known format. Here we define a Pydantic model that specifies everything we're expecting back.
2. Setting expand_response=True will unpack location, latitude and longitude into separate outputs on the component.
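The Location model itself isn't shown above; here's a minimal sketch of what it might look like, assuming a plain Pydantic model whose fields match the component's outputs (the field descriptions are illustrative):

from pydantic import BaseModel, Field

class Location(BaseModel):
    """Structured output we expect back from the LLM."""

    location: str = Field(description="Name of the geographical place")
    latitude: float = Field(description="Latitude in decimal degrees")
    longitude: float = Field(description="Longitude in decimal degrees")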
Info
To run this tutorial you'll need an API key for OpenAI. Set the OPENAI_API_KEY environment variable to provide it to the model.
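If you'd rather set the key from Python during a quick local experiment than in your shell, a sketch (placeholder value; avoid hard-coding real keys):

import os

os.environ["OPENAI_API_KEY"] = "<your-api-key>"  # must be set before the model runs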
Since LLMChat is based on LlamaIndex, you can even try reconfiguring it to use a different LLM.
Querying a weather API
We can now define a component to query a weather API and get the current temperature and wind speed for a given location.
class WeatherAPI(Component):
    """Get current weather for a location."""

    io = IO(inputs=["latitude", "longitude"], outputs=["temperature", "wind_speed"])

    def __init__(self, **kwargs: _t.Unpack[ComponentArgsDict]) -> None:
        super().__init__(**kwargs)
        # Reuse a single async HTTP client across all steps
        self._client = httpx.AsyncClient()

    async def step(self) -> None:
        response = await self._client.get(
            "https://api.open-meteo.com/v1/forecast",
            params={
                "latitude": self.latitude,
                "longitude": self.longitude,
                "current": "temperature_2m,wind_speed_10m",
            },
        )
        try:
            response.raise_for_status()
        except httpx.HTTPStatusError as e:
            # Skip this step on a failed request, leaving the outputs unchanged
            print(f"Error querying weather API: {e}")
            return
        data = response.json()
        self.temperature = data["current"]["temperature_2m"]
        self.wind_speed = data["current"]["wind_speed_10m"]
weather = WeatherAPI(name="weather")
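To see the shape of the payload that step() parses, you can call the same Open-Meteo endpoint directly. A standalone sketch, with illustrative coordinates:

import asyncio
import httpx

async def main() -> None:
    async with httpx.AsyncClient() as client:
        response = await client.get(
            "https://api.open-meteo.com/v1/forecast",
            params={
                "latitude": 48.86,  # Paris, for illustration
                "longitude": 2.35,
                "current": "temperature_2m,wind_speed_10m",
            },
        )
        response.raise_for_status()
        # The fields read in step() live under the "current" key
        print(response.json()["current"])

asyncio.run(main())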
Putting it all together
As usual, we can link all our components together in a LocalProcess and run them as follows:
connect = lambda source, target: AsyncioConnector(
    spec=ConnectorSpec(source=source, target=target)
)
process = LocalProcess(
components=[load_text, llm, weather, save_output],
connectors=[
connect("load-text.text", "llm.prompt"),
connect("llm.latitude", "weather.latitude"),
connect("llm.longitude", "weather.longitude"),
connect("llm.location", "save-results.location"),
connect("weather.temperature", "save-results.temperature"),
connect("weather.wind_speed", "save-results.wind_speed"),
],
)
async with process:
await process.run()
Check out the output.csv file to see all of the collected model output.
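Each row of output.csv corresponds to one input row, with the columns set in field_names. It will look something like this (values illustrative):

location,temperature,wind_speed
Paris,14.2,11.3
Mount Fuji,3.8,22.5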