ADR: AI Infrastructure and Architecture

Status

In Progress

Context

Existing AI architecture (as of 26th March 2024)

AI requests are made to, and processed by, the webserver
AI features are written in PHP
AI features use OpenAI's models via Azure hosting

Issues with current approach

Handling AI requests on the webserver in PHP fit our existing infrastructure and was quick and easy to implement. As our AI features grow, however, we need a better long-term solution.

Requests can time out: AI requests can take an unavoidably long time, which doesn't naturally fit an HTTP request
PHP isn't suited to AI development: it lacks the AI libraries available in other languages

Note: For the purposes of this document an 'AI request' is any request that invokes - or could invoke - an AI model/agent. It does not include requests that support these features but do not invoke any AI model, e.g. creating an element template that will later be used by an AI request.

Decision

Language and Libraries

AI features SHOULD be written in Python
Python code MUST conform to PEP 8 style guidelines
Business logic MUST be separated from any model-specific implementation. AI development is fairly new and rapidly changing, and many models and libraries implement similar features in incompatible ways. These incompatibilities MUST be abstracted away from business logic so that the models in use can be changed in the future
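
One way to picture the separation rule is a small interface that business logic depends on, with each model or library hidden behind an adapter. The sketch below is illustrative only: the names TextModel, EchoModel and summarise_element are hypothetical and do not come from the codebase, and EchoModel is a stand-in so the example runs without any provider SDK.

```python
from dataclasses import dataclass
from typing import Protocol


class TextModel(Protocol):
    """Hypothetical interface that business logic depends on."""

    def complete(self, system: str, prompt: str) -> str:
        ...


@dataclass
class EchoModel:
    """Stand-in adapter so the sketch runs without a provider SDK.

    A real adapter (for example one wrapping an Azure-hosted OpenAI model)
    would implement the same method and keep all SDK calls inside it.
    """

    prefix: str = "echo"

    def complete(self, system: str, prompt: str) -> str:
        return f"{self.prefix}: {prompt}"


def summarise_element(model: TextModel, element_text: str) -> str:
    """Business logic: knows what to ask for, not which model answers."""
    system = "Summarise the content in one sentence."
    return model.complete(system, element_text).strip()


summary = summarise_element(EchoModel(), "Quarterly results")
```

Swapping models then means writing a new adapter; summarise_element itself does not change.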

Server Infrastructure

AI features run on a separate server to web requests to allow for independent scaling
AI requests are sent to the same domain as web/API requests and are proxied to the AI Python server
The Python server uses FastAPI

AI Requests

AI request URLs MUST begin with /ai/
AI requests SHOULD be asynchronous; they are often time consuming and risk timing out if run synchronously. They SHOULD either:
Establish a websocket connection for ongoing communication, e.g. a conversation with an AI chatbot
Acknowledge the request and close the connection before beginning related work, dispatching events with the outcome if necessary
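
The second option (acknowledge first, work afterwards, dispatch an event with the outcome) can be sketched without any framework using asyncio. This is a minimal illustration under stated assumptions: handle_ai_request, process, run_model and the event shape are hypothetical, and a plain list stands in for whatever message broker carries the events.

```python
import asyncio
import uuid

# A list stands in for the message broker that would carry outcome events.
dispatched_events: list[dict] = []
# Hold references so pending tasks are not garbage-collected.
background_tasks: set[asyncio.Task] = set()


async def run_model(prompt: str) -> str:
    """Placeholder for a slow model call."""
    await asyncio.sleep(0.01)
    return prompt.upper()


async def process(job_id: str, prompt: str) -> None:
    """Do the slow work, then dispatch an event carrying the outcome."""
    result = await run_model(prompt)
    dispatched_events.append(
        {"job_id": job_id, "status": "completed", "result": result}
    )


async def handle_ai_request(prompt: str) -> dict:
    """Acknowledge immediately; the work continues after this returns."""
    job_id = str(uuid.uuid4())
    task = asyncio.create_task(process(job_id, prompt))
    background_tasks.add(task)
    task.add_done_callback(background_tasks.discard)
    return {"job_id": job_id, "status": "accepted"}


async def main() -> tuple[dict, list[dict]]:
    ack = await handle_ai_request("hello")
    events_at_ack = list(dispatched_events)  # snapshot taken before the work finishes
    await asyncio.gather(*background_tasks)
    return ack, events_at_ack


ack, events_at_ack = asyncio.run(main())
```

The caller gets a job identifier straight away and matches it against the later event. In FastAPI, the BackgroundTasks dependency gives a similar shape by running the work after the response has been sent.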

Authentication & Authorisation

Note: This authentication/authorisation mechanism will be replaced as our API is separated from the app.

The Python server authenticates and authorises requests in the same way as the websocket server:

Since the HTTP request is made to the same domain and proxied via the webserver, it has the same headers and cookies as other app requests
The Python server uses the cookie in the request to make a request to the web server to /settings/grid/{gridIdent}, and relies on the web server authentication/authorisation
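
The delegation described above might look roughly like the sketch below, which only builds the check request and interprets a status code. Several details are assumptions rather than facts from this document: the helper names, the example base URL, the use of GET, and the rule that any 2xx response means the user is authorised.

```python
from urllib.parse import quote
from urllib.request import Request


def build_auth_check(web_base_url: str, grid_ident: str, cookie_header: str) -> Request:
    """Build the request the Python server would send to the web server.

    The incoming Cookie header is forwarded unchanged so the web server can
    apply its normal authentication and authorisation rules.
    """
    path = f"/settings/grid/{quote(grid_ident, safe='')}"
    url = web_base_url.rstrip("/") + path
    # Assumption: the check is a GET request.
    return Request(url, headers={"Cookie": cookie_header}, method="GET")


def is_authorised(status_code: int) -> bool:
    """Assumption: any 2xx response means the user may access the grid."""
    return 200 <= status_code < 300


# Hypothetical base URL and values, for illustration only.
check = build_auth_check("https://app.example.test/", "grid-123", "session=abc")
```

Sending the request, and mapping a failed check to a 401 or 403 response on the Python side, is left out of the sketch.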

AI Assistants (Chatbots)

Note: This is currently relevant to the AI demos and subject to change.

Creating Assistants

Assistants used by the app are defined in code, not in the Azure/OpenAI interface. A script, sync-assistants.py, creates or updates assistants in Azure/OpenAI to match the code.

Each assistant is represented by a class that inherits from the Assistant class and defines:
The system instructions of the assistant
The initial message the assistant will send to the user
Which tools the assistant can use (code interpreter etc.)
Any functions the assistant has access to
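
To make the create/update behaviour concrete, here is a rough sketch of how a sync step could decide what to do for each assistant defined in code. It is not the actual sync-assistants.py logic: AssistantDefinition and plan_sync are hypothetical names, and matching local and remote assistants by name is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssistantDefinition:
    """The pieces an assistant class defines, flattened for comparison."""

    name: str
    instructions: str
    initial_message: str
    tools: tuple[str, ...] = ()
    functions: tuple[str, ...] = ()


def plan_sync(
    defined: list[AssistantDefinition],
    remote: dict[str, AssistantDefinition],
) -> list[tuple[str, str]]:
    """Return (action, name) pairs for assistants that need work.

    Assumption: assistants are matched by name. Assistants whose remote
    copy already equals the code definition produce no action.
    """
    actions = []
    for definition in defined:
        existing = remote.get(definition.name)
        if existing is None:
            actions.append(("create", definition.name))
        elif existing != definition:
            actions.append(("update", definition.name))
    return actions


builder = AssistantDefinition("element-builder", "Build elements.", "Hi!")
helper = AssistantDefinition("helper", "Answer questions.", "Hello!")
stale_helper = AssistantDefinition("helper", "Old instructions.", "Hello!")

plan = plan_sync([builder, helper], {"helper": stale_helper})
```

Here plan contains a create action for element-builder and an update action for helper.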

Assistant Functions

Assistant functions are declared as methods on the assistant class and are synced as part of the sync-assistants.py script.

To tell the AI assistant that it can call the method, annotate it with @assistant_function, for example:

class ElementBuilderAssistant(Assistant):

    @assistant_function
    def create_content_area(self, title: str, content: str) -> None:
        """
        Create a content area.
        :param title: The heading of the new content area.
        :param content: The content of the new content area.
        :return:
        """
        # Do some stuff

The docstring is required and MUST be formatted as above. The sync-assistants.py script uses it to tell the AI assistant how to call the function correctly.
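
To show why the docstring format matters, the sketch below marks a method with a decorator and then derives a function description from its signature and docstring. It is an illustrative stand-in, not the real implementation: this assistant_function, describe, function_descriptions and the output shape are all assumptions.

```python
import inspect
import re
from typing import Callable


def assistant_function(func: Callable) -> Callable:
    """Mark a method as callable by the assistant (illustrative stand-in)."""
    func._is_assistant_function = True
    return func


def describe(func: Callable) -> dict:
    """Build a description from the signature and the :param fields."""
    doc = inspect.getdoc(func) or ""
    # Summary: every non-empty line that is not a ":field:" line.
    summary = " ".join(
        line.strip()
        for line in doc.splitlines()
        if line.strip() and not line.lstrip().startswith(":")
    )
    documented = dict(re.findall(r"^:param (\w+):\s*(.*)$", doc, flags=re.MULTILINE))
    parameters = {}
    for name, param in inspect.signature(func).parameters.items():
        if name == "self":
            continue
        if name not in documented:
            raise ValueError(f"{func.__name__}: no ':param {name}:' in docstring")
        parameters[name] = {
            "type": getattr(param.annotation, "__name__", str(param.annotation)),
            "description": documented[name],
        }
    return {"name": func.__name__, "description": summary, "parameters": parameters}


class Assistant:
    """Minimal stand-in for the base class."""

    @classmethod
    def function_descriptions(cls) -> list[dict]:
        return [
            describe(member)
            for _, member in inspect.getmembers(cls, inspect.isfunction)
            if getattr(member, "_is_assistant_function", False)
        ]


class ElementBuilderAssistant(Assistant):
    @assistant_function
    def create_content_area(self, title: str, content: str) -> None:
        """
        Create a content area.
        :param title: The heading of the new content area.
        :param content: The content of the new content area.
        :return:
        """


descriptions = ElementBuilderAssistant.function_descriptions()
```

In this sketch a parameter without a matching :param line raises an error, which is one way a sync step could enforce the required format.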

Questions

Should Python server have direct access to database?

For the AI demos it does not have access to the database. To get or save data it makes HTTP calls to the app or dispatches commands/events via RabbitMQ. This differs from the scheduling microservice, which does have a direct connection to the database. I lean towards the HTTP/messaging approach and not having database access.

Assistant syncing across branches

As we move into developing assistants on different staging branches and in prod, how to sync assistants and avoid clashes between branches will need further thought.

Consequences

[To be defined as implementation progresses]

Impact

High

Driver

@Marc North

Contributors

[Team]

Accepted Date

[In Progress]

Resources

Last modified by: Unknown