ADR: AI Infrastructure and Architecture
Status
In Progress
Context
Existing AI architecture (as of 26th March 2024)
- AI requests are made to, and processed by, the webserver
- AI features are written in PHP
- AI features use OpenAI's models via Azure hosting
Issues with current approach
Handling AI requests on the webserver in PHP was the initial approach because it fit our existing infrastructure and was quick and easy to implement. However, as our AI features grow, we need a better long-term solution.
- Requests can time out: AI-based requests can take an unavoidably long time, which does not naturally fit an HTTP request
- PHP isn't suited to AI development: it lacks the AI libraries available in other languages
Note: For the purposes of this document an 'AI request' is any request that invokes - or could invoke - an AI model/agent. It does not include requests that support these features but do not invoke any AI model, e.g. creating an element template that will later be used by an AI request.
Decision
Language and Libraries
- AI features SHOULD be written in Python
- Python code MUST conform to PEP 8 style guidelines
- Business logic MUST be separated from any model-specific implementation. AI development is fairly new and rapidly changing, and there are many models and libraries whose implementations of similar features are not compatible. These incompatibilities MUST be abstracted away from business logic so that the models in use can be changed in the future
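One way to meet the separation requirement above is for business logic to depend only on a small interface, with each model or library wrapped in its own adapter. The sketch below is illustrative only: ChatModel, EchoModel and suggest_title are hypothetical names, not part of the existing codebase.

```python
from typing import Protocol


class ChatModel(Protocol):
    """Hypothetical interface that business logic depends on."""

    def complete(self, system_prompt: str, user_message: str) -> str:
        ...


class EchoModel:
    """Stand-in adapter for illustration and tests; makes no network calls.

    A real adapter (e.g. one wrapping an Azure-hosted OpenAI model) would
    implement the same method and keep all vendor-specific code inside it.
    """

    def complete(self, system_prompt: str, user_message: str) -> str:
        return f"[{system_prompt}] {user_message}"


def suggest_title(model: ChatModel, content: str) -> str:
    """Business logic: knows about ChatModel only, not about any vendor."""
    reply = model.complete("Suggest a short title.", content)
    return reply.strip()


if __name__ == "__main__":
    print(suggest_title(EchoModel(), "Quarterly sales figures"))
```

Swapping models then means writing a new adapter rather than touching the business logic.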
Server Infrastructure
- AI features run on a separate server to web requests to allow for independent scaling
- AI requests are sent to the same domain as web/API requests and are proxied to the AI Python server
- The Python server uses FastAPI
AI Requests
- AI request URLs MUST begin with /ai/
- AI requests SHOULD be asynchronous; they are often time consuming and risk timing out if run synchronously. They SHOULD either:
  - Establish a websocket connection for ongoing communication, e.g. a conversation with an AI chatbot
  - Acknowledge the request and close the connection before beginning related work, dispatching events with the outcome if necessary
Authentication & Authorisation
Note: This authentication/authorisation mechanism will be replaced as our API is separated from the app.
The Python server authenticates and authorises requests in the same way as the websocket server:
- Since the HTTP request is made to the same domain and proxied via the webserver, it has the same headers and cookies as other app requests
- The Python server uses the cookie from the request to call /settings/grid/{gridIdent} on the web server, and relies on the web server's authentication/authorisation
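The delegated check described above might look something like the sketch below, using only the standard library. The function names, the base URL argument, and the rule that a 2xx response means "authorised" are assumptions for illustration; the actual status codes returned by the web server are not specified in this document.

```python
import urllib.error
import urllib.parse
import urllib.request


def build_auth_check(web_base_url: str, grid_ident: str, cookie_header: str) -> urllib.request.Request:
    """Build the request to the web server, forwarding the caller's cookie."""
    path = f"/settings/grid/{urllib.parse.quote(grid_ident, safe='')}"
    url = web_base_url.rstrip("/") + path
    return urllib.request.Request(url, headers={"Cookie": cookie_header}, method="GET")


def is_authorised(check: urllib.request.Request, timeout: float = 5.0) -> bool:
    """Assumption: any 2xx response means the web server accepted the user.

    HTTP error statuses (e.g. 401/403) are treated as "not authorised";
    network failures are left to propagate to the caller.
    """
    try:
        with urllib.request.urlopen(check, timeout=timeout) as response:
            return 200 <= response.status < 300
    except urllib.error.HTTPError:
        return False
```

Keeping request construction separate from the network call makes the forwarding logic easy to test without a running web server.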
AI Assistants (Chatbots)
Note: This is currently relevant to the AI demos and subject to change.
Creating Assistants
Assistants used by the app are defined in code, not in the Azure/OpenAI interface. A script, sync-assistants.py, creates or updates assistants in Azure/OpenAI based on the code when it is run.
Each assistant is represented by a class that inherits from the Assistant class and defines:
- The system instructions of the assistant
- The initial message the assistant will send to the user
- Which tools the assistant can use (code interpreter etc.)
- Any functions the assistant has access to
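As a rough illustration of that shape, an assistant definition could look like the sketch below. The real Assistant base class is not shown in this document, so the attribute names (name, instructions, initial_message, tools) and the definition() helper are assumptions; functions are omitted here and covered in the next section.

```python
class Assistant:
    """Hypothetical base class; the real one is not shown in this document."""

    name: str = ""
    instructions: str = ""       # system instructions
    initial_message: str = ""    # first message sent to the user
    tools: tuple[str, ...] = ()  # built-in tool types, e.g. "code_interpreter"

    @classmethod
    def definition(cls) -> dict:
        """Collect the fields a sync script could send to Azure/OpenAI."""
        return {
            "name": cls.name,
            "instructions": cls.instructions,
            "initial_message": cls.initial_message,
            "tools": [{"type": tool} for tool in cls.tools],
        }


class ElementBuilderAssistant(Assistant):
    name = "element-builder"
    instructions = "You help users build page elements."
    initial_message = "Hi! What would you like to build?"
    tools = ("code_interpreter",)


if __name__ == "__main__":
    print(ElementBuilderAssistant.definition())
```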
Assistant Functions
Assistant functions are declared as methods on the assistant class and are synced as part of the sync-assistants.py script.
To tell the AI assistant that it can call the method, annotate it with @assistant_function, for example:
class ElementBuilderAssistant(Assistant):
    @assistant_function
    def create_content_area(self, title: str, content: str) -> None:
        """
        Create a content area.
        :param title: The heading of the new content area.
        :param content: The content of the new content area.
        :return:
        """
        # Do some stuff
The docstring is required and MUST be formatted as above. This is used by the sync-assistants.py script to let the AI assistant know how to correctly call the function.
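To show why the docstring format matters, the sketch below derives a function-tool definition from a method's signature and docstring. It is not the actual sync-assistants.py logic, which is not included in this document: the decorator body, build_function_schema, and the type mapping are assumptions, and the output follows the general shape of OpenAI function-tool definitions.

```python
import inspect
import re

# Matches reST-style lines such as ":param title: The heading ..."
PARAM_LINE = re.compile(r"^:param\s+(\w+):\s*(.*)$")

# Assumed mapping from Python annotations to JSON Schema types.
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def assistant_function(func):
    """Hypothetical decorator: flags a method as callable by the assistant."""
    func.is_assistant_function = True
    return func


def build_function_schema(func) -> dict:
    """Derive a function-tool definition from a signature and docstring."""
    doc_lines = [line.strip() for line in (inspect.getdoc(func) or "").splitlines()]

    # Everything before the first ":field:" line is the description.
    description_lines = []
    for line in doc_lines:
        if line.startswith(":"):
            break
        description_lines.append(line)
    description = " ".join(line for line in description_lines if line)

    param_docs = {}
    for line in doc_lines:
        match = PARAM_LINE.match(line)
        if match:
            param_docs[match.group(1)] = match.group(2)

    properties, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        if name == "self":
            continue
        properties[name] = {
            "type": JSON_TYPES.get(param.annotation, "string"),
            "description": param_docs.get(name, ""),
        }
        if param.default is inspect.Parameter.empty:
            required.append(name)

    return {
        "type": "function",
        "function": {
            "name": func.__name__,
            "description": description,
            "parameters": {"type": "object", "properties": properties, "required": required},
        },
    }


class ExampleAssistant:
    @assistant_function
    def create_content_area(self, title: str, content: str) -> None:
        """
        Create a content area.
        :param title: The heading of the new content area.
        :param content: The content of the new content area.
        :return:
        """
```

A parameter missing from the docstring would reach the model with an empty description, which is one reason to enforce the format.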
Questions
Should the Python server have direct access to the database?
For the AI demos it does not have access to the database. To get or save data, it makes HTTP calls to the app or dispatches commands/events via RabbitMQ. This differs from the scheduling microservice, which does have a direct connection to the database. I think I lean towards the HTTP/messaging way of doing things and not having database access.
Assistant syncing across branches
As we move into developing assistants on different staging branches and in production, how to sync assistants and avoid clashes between branches will need further thought.
Consequences
[To be defined as implementation progresses]
Impact
High
Driver
@Marc North
Contributors
[Team]
Accepted Date
[In Progress]