Indra Clinic

Chatbot: Technical & Privacy Information

1. Technical Framework

The chatbot is a Python-based application designed to operate as a conversational web agent. Its architecture is built on several key components:

Core Application: The main application is built using the Flask web framework, serving a REST API to a web-based chat interface. It operates as a state machine, guiding users through predefined conversational flows (e.g., consent, verification, query category, data collection).

AI & Natural Language Processing: The bot integrates with a large language model (openai/gpt-4o-mini) via the Microsoft Azure OpenAI Service. The AI's behavior, personality, and operational constraints are strictly controlled by a detailed system_prompt.txt file, which serves as its core instruction set.

Clinical System Integration (EMR): The application connects securely to the Semble Electronic Medical Record system using its GraphQL API. This allows the bot to create new clinical notes (FreeTextRecord) in a patient's file after successful verification.

Email Service: The application sends transactional emails (e.g., query confirmations, transcripts) using Python's standard smtplib library. While the code is provider-agnostic, the current operational provider for this service is SendGrid.

API Operations: The httpx library is used to handle all external API calls synchronously.

Deployment: The application is deployed to Microsoft Azure App Service in the UK South region. All sensitive credentials (API keys, email passwords) are managed securely as environment variables rather than being hard-coded in the source.

2. Privacy and Data Compliance (UK-GDPR)

The chatbot's design incorporates key principles of UK-GDPR to ensure patient data is handled lawfully and securely.

Lawful Basis for Processing (Consent): The bot's primary lawful basis for processing patient data is explicit consent. Before any personal information is requested, the user must actively type "I agree" to proceed with the service. This ensures processing is transparent and consensual.

Data Residency and Sovereignty: To comply with regulations for handling UK health data, all components of the application are hosted within Microsoft Azure's UK South region. This includes the App Service running the Python code and the Azure OpenAI service processing the conversation. This ensures that all patient data, both "at rest" and "in transit" for AI processing, remains within the UK's geographic and legal jurisdiction.

Pseudonymity in Conversation: The chatbot is designed to maintain patient privacy. It never asks for the patient's full name. The user provides their email and Patient ID. The email address is used to securely link the chat session to the correct patient file within the Semble EMR system. The Patient ID serves as a secondary verification step and is included in the final report for staff review.

Data Minimisation: The bot collects only the data necessary for its specific purpose. For verification, it collects only the patient's email address and Patient ID. For appointment changes, it collects only the current and desired appointment times.

Purpose Limitation: Data collected is used for the sole purpose of creating a clinical or administrative report for Indra Clinic staff and adding it to the patient's medical record. The AI is instructed that its data is not used for model training.

Data Storage & Retention: Data is held temporarily in the bot's memory and is cleared at the end of the session. Upon completion, a permanent record is created in the Semble EMR, and a copy is also sent via email. Data retention is thereafter governed by the clinic's policies for medical records and email.

User Rights (Right of Access): The system is designed to automatically offer a full transcript of the conversation to the patient's registered email address, fulfilling their right to access a copy of the data they have provided.

3. Safety Features

Several features have been built in to mitigate risks and prioritize patient safety.

Prohibition of Medical Advice: The bot's most critical safety feature is the explicit instruction in its system prompt that it must not provide medical advice. Its role is strictly limited to information gathering for a human clinician.

Red Flag Emergency Detection: The AI's instructions contain a workflow to screen for signs of a medical emergency (e.g., severe chest pain, thoughts of self-harm). If detected, the bot halts the conversation, provides an instruction to call 999, and flags the session for reporting.

Human-in-the-Loop Design: The chatbot is not an autonomous agent. Every summary it generates is intended for review by a qualified member of the Indra Clinic team. All clinical decisions are made by human professionals.

Explicit Consent: The bot does not engage until the user has read the privacy notice and consented.

Structured Administrative Workflows: For tasks like changing an appointment, the bot follows a rigid, scripted workflow that does not involve the AI, minimizing the possibility of error.

4. Addressing Specific Compliance & Security Criticisms

This section outlines the system's position on identified compliance and security considerations.

Confidentiality

Identification: The system uses a two-factor verification method requiring both email and Patient ID. The risk of misidentification is mitigated as an unauthorised user would need access to the patient's private chat session, know their registered email, and know their Patient ID.

Transcript Distribution: The primary record is the note created in the Semble EMR. The email to the patient serves to fulfill their Right of Access under UK-GDPR and contains a confidentiality notice.

AI Data Exposure: The AI has no direct access to patient records. Only anonymous conversation text is sent to the Microsoft Azure OpenAI Service via an encrypted connection. This data is not used for model training.

Security

Third-Party API Dependence: Risk is mitigated by selecting reputable providers (Microsoft Azure for hosting and AI, Semble for EMR) with a strong commitment to security and UK healthcare compliance.

Cloud Deployment Risks: The application is hosted on Microsoft Azure, a platform with publicly documented compliance for UK healthcare and alignment with the NHS DSP Toolkit.

Credential Management: The use of environment variables is the industry best practice for separating secrets from code.

Compliance (UK-GDPR, NHS, ICO)

Lawful Basis: For existing patients, the primary lawful basis for processing is the "provision of direct care" (Article 9(2)(h)). The "I agree" step serves as a transparent agreement to this method of communication.

Right to Erasure: GDPR rights do not override the legal obligation for clinicians to maintain accurate medical records. A transcript in the Semble EMR is subject to medical record retention laws.

Data Retention: Conversation data in memory is transient. The final report is subject to the retention policies of the Semble EMR and the clinic's email system.

International Transfers: The architecture uses a multi-faceted approach. Core application hosting and AI processing occur within Microsoft Azure's UK data centres, ensuring this data remains within the UK. For transactional email, the system uses SendGrid, a US-based service. The legal basis for this international data transfer is established through SendGrid's Data Processing Addendum (DPA), which incorporates Standard Contractual Clauses (SCCs), and the certification of its parent company, Twilio Inc., under the UK-U.S. Data Privacy Framework (the "Data Bridge").