Infrastructure & Data Storage

This page provides detailed technical information about VivaEdu's infrastructure, data storage architecture, and security measures. It is intended for institutional IT teams, security officers, and data protection officers.

Summary: All VivaEdu infrastructure is hosted in the AWS UK region (eu-west-2, London) with transcription processing via Microsoft Azure UK South. No third-country data transfers occur. All storage is encrypted at rest and in transit.

Hosting Infrastructure

AWS UK (eu-west-2, London)

VivaEdu is deployed per institution. Each institution receives a dedicated environment (database, cache, storage): PostgreSQL and Redis run inside that institution's deployment, and a dedicated bucket holds object storage (audio, video, PDFs, exports).

Service             | Resource                               | Purpose
Application Hosting | AWS Lightsail (Node.js)                | Web application + background workers
Database            | PostgreSQL (on the Lightsail instance) | User data, assignments, sessions, grades
Object Storage      | AWS S3                                 | Audio, video, essays, exports
Cache & Queue       | Redis (on the Lightsail instance)      | Job queue, session state
TLS Certificates    | TLS certificate on the deployment      | HTTPS encryption with auto-renewal

Microsoft Azure UK South

Audio transcription is processed exclusively via Microsoft Azure Speech Services in the UK South region:

  • Speech-to-Text: Audio recordings are transcribed to text in real time
  • Translation: Multilingual viva responses are translated into the instructor's language
  • Processing Location: All processing occurs within the UK South region; no data leaves the UK
  • Compliance: Microsoft Azure UK South is GDPR-compliant with UK data residency guarantees

Microsoft Azure OpenAI UK South (Optional)

If enabled by an institution, VivaEdu can use Azure OpenAI (UK South) for branch follow-up routing. This takes a student's transcript and selects the appropriate teacher-authored follow-up question from a pre-defined set.

  • Purpose: Branch routing (selecting a follow-up question)
  • Input: Student transcript + teacher routing hints + candidate follow-up questions
  • Output: A chosen follow-up question from the teacher-defined candidate set
  • Processing Location: UK South region
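
The constraint that the output is always one of the teacher-defined candidates can be checked on the application side, whatever the model returns. The following is a minimal sketch in Python, not VivaEdu's actual code: the function name `select_follow_up` and the assumption that the model answers with a zero-based index are hypothetical.

```python
from typing import Optional, Sequence


def select_follow_up(model_choice: object, candidates: Sequence[str]) -> Optional[str]:
    """Return the candidate the model picked, or None if the pick is invalid.

    `model_choice` is assumed to be a zero-based index into `candidates`
    (a hypothetical response format, not a documented one).
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(model_choice, bool) or not isinstance(model_choice, int):
        return None
    if 0 <= model_choice < len(candidates):
        return candidates[model_choice]
    # Out-of-range picks are never turned into free text.
    return None
```

Because the function only ever returns an element of `candidates`, a malformed or unexpected model response cannot introduce a question the teacher did not write.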

OpenAI Text-to-Speech (Question Reading Only)

VivaEdu uses OpenAI Text-to-Speech to read out teacher-authored question text. Student audio/video submissions and student transcripts are not sent to OpenAI.

  • Purpose: Question text-to-speech playback
  • Input: Teacher-authored question text only
  • No student submissions: Student transcripts/audio/video are not sent to OpenAI

Data Storage Architecture

PostgreSQL Database

All structured data is stored in a PostgreSQL 15 database:

  • User accounts: Student and instructor profiles, authentication data
  • Classes: Course sections, enrollments
  • Assignments: Viva configurations, rubrics, due dates
  • Sessions: Student viva attempts, status tracking
  • Responses: Text transcripts of audio responses
  • Grades: Teacher evaluations, rubric scores, feedback text
  • LTI Data: LMS integration metadata, platform configurations
  • Audit Logs: Comprehensive logs of sensitive actions

Object Storage (AWS S3)

Large files are stored in S3 with structured key prefixes:

  • Audio recordings: Student viva responses (.webm and .mp3 formats)
  • Video recordings: Optional student response videos, instructor video prompts and feedback
  • Essay uploads: PDF and DOCX files submitted by students
  • Question images: Images and diagrams used in viva questions
  • Context cards: Reference materials (PDFs up to 10 pages, images)
  • Export archives: ZIP files generated for data exports
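
As an illustration of what structured key prefixes can look like, here is a small Python sketch. The prefix names, the `<prefix>/<owner_id>/<filename>` layout, and the `build_object_key` helper are all assumptions made for the example; the actual key scheme is not specified on this page.

```python
from enum import Enum


class ObjectKind(Enum):
    # Hypothetical prefixes, one per file category listed above.
    AUDIO = "audio"
    VIDEO = "video"
    ESSAY = "essays"
    QUESTION_IMAGE = "question-images"
    CONTEXT_CARD = "context-cards"
    EXPORT = "exports"


def build_object_key(kind: ObjectKind, owner_id: str, filename: str) -> str:
    """Build an object key of the form '<prefix>/<owner_id>/<filename>'."""
    for part in (owner_id, filename):
        # Reject empty parts and path separators so a key cannot escape its prefix.
        if not part or "/" in part or "\\" in part or part in (".", ".."):
            raise ValueError(f"invalid key component: {part!r}")
    return f"{kind.value}/{owner_id}/{filename}"
```

Keeping each category under its own prefix makes it straightforward to scope lifecycle rules or deletions to one kind of file.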

Data Isolation

VivaEdu implements application-level data isolation:

  • Access Control: All database queries include authorization checks based on user role and institutional relationships
  • Class Scoping: Teachers can only access classes they created or teach
  • Student Scoping: Students can only access classes they're enrolled in and their own submissions
  • LMS Context: LTI launches include context information ensuring users only see data from their institution's LMS
  • Signed URLs: S3 objects use time-limited signed URLs (15 minutes to 7 days) with strict permissions
  • Demo Isolation: Demo environments use tenant IDs with automatic 2-hour expiration and complete data deletion
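
The class and student scoping rules above can be sketched as a pair of checks. This is a simplified Python illustration with hypothetical types (`User`, `ClassRecord`) and function names. ADMIN access rules are not described on this page, so the sketch leaves them out and denies by default.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass(frozen=True)
class User:
    user_id: str
    role: str  # "STUDENT", "TEACHER", or "ADMIN"


@dataclass
class ClassRecord:
    class_id: str
    teacher_ids: Set[str] = field(default_factory=set)  # creators and teachers
    student_ids: Set[str] = field(default_factory=set)  # enrolled students


def can_access_class(user: User, cls: ClassRecord) -> bool:
    """Teachers see classes they created or teach; students see classes they are enrolled in."""
    if user.role == "TEACHER":
        return user.user_id in cls.teacher_ids
    if user.role == "STUDENT":
        return user.user_id in cls.student_ids
    return False  # any other role is denied in this sketch


def can_access_submission(user: User, cls: ClassRecord, owner_id: str) -> bool:
    """Students may only read their own submissions within a class they can access."""
    if not can_access_class(user, cls):
        return False
    if user.role == "STUDENT":
        return user.user_id == owner_id
    return True
```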

Security Measures

Encryption

  • In Transit:
    • TLS 1.2+ for all HTTPS connections
    • Enforced HTTPS redirects (no plain HTTP)
    • AWS Certificate Manager for certificate management
    • TLS-enabled Redis connections
  • At Rest:
    • S3: Server-Side Encryption with SSE-S3
    • RDS: Full disk encryption enabled
    • ElastiCache: Encryption at rest enabled
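
An institutional IT team that wants its own integrations to refuse anything below TLS 1.2 can enforce this on the client side. The sketch below uses Python's standard `ssl` module; the function name `make_strict_tls_context` is illustrative only and is not part of VivaEdu.

```python
import ssl


def make_strict_tls_context() -> ssl.SSLContext:
    """Create a client-side context that verifies certificates and refuses TLS < 1.2."""
    # create_default_context() enables certificate and hostname verification.
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context
```

A context built this way can be passed to standard-library clients, for example via the `context` argument of `urllib.request.urlopen`.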

Network Security

  • VPC Configuration: Isolated VPC with public and private subnets
  • Security Groups:
    • RDS: PostgreSQL port (5432) accessible only from Elastic Beanstalk instances
    • ElastiCache: Redis port (6379) accessible only from Elastic Beanstalk instances
    • Application: HTTP/HTTPS from Application Load Balancer only
  • Private Subnets: Database and cache instances in private subnets (no direct internet access)
  • IAM Roles: Elastic Beanstalk instances use IAM roles (not access keys) for S3, RDS, and ElastiCache access

Application Security

  • Authentication: JWT-based with secure refresh tokens, bcrypt password hashing
  • Authorization: Role-based access control (STUDENT, TEACHER, ADMIN) checked on every request
  • Input Validation: All API inputs validated using Zod schemas
  • File Upload Restrictions: MIME type validation, size limits (100MB), content type enforcement
  • Content Security Policy: CSP headers restrict embedding to trusted LMS domains only
  • Rate Limiting: Login attempts and file uploads are rate-limited
  • XSS Protection: React escaping + markdown sanitization
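
To make the file upload restrictions concrete, here is a minimal Python sketch of a MIME type and size check. The allow-list is an assumption derived from the file formats this page mentions, the 100MB limit is interpreted as 100 × 1024 × 1024 bytes, and `validate_upload` is a hypothetical name; the production application is Node.js and validates inputs with Zod schemas.

```python
from typing import List

# 100MB limit from the list above, interpreted here in binary megabytes (an assumption).
MAX_UPLOAD_BYTES = 100 * 1024 * 1024

# Hypothetical allow-list covering the formats this page mentions.
ALLOWED_MIME_TYPES = frozenset({
    "audio/webm",
    "audio/mpeg",  # .mp3
    "video/webm",
    "application/pdf",
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document",  # .docx
})


def validate_upload(mime_type: str, size_bytes: int) -> List[str]:
    """Return a list of validation errors; an empty list means the upload is accepted."""
    errors: List[str] = []
    # Ignore parameters such as "; codecs=opus" and compare case-insensitively.
    base_type = mime_type.split(";", 1)[0].strip().lower()
    if base_type not in ALLOWED_MIME_TYPES:
        errors.append(f"unsupported MIME type: {base_type or '(empty)'}")
    if size_bytes <= 0:
        errors.append("file is empty")
    elif size_bytes > MAX_UPLOAD_BYTES:
        errors.append("file exceeds the 100MB limit")
    return errors
```

A declared MIME type can be spoofed by the client, which is why a check like this is usually paired with server-side content type enforcement.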

AWS Certifications

AWS UK infrastructure is covered by the following certifications, attestations, and agreements:

  • ISO 27001: Information security management systems
  • ISO 27017: Cloud-specific information security controls
  • ISO 27018: Protection of personally identifiable information in public clouds
  • SOC 2 Type II: Service organization control reports
  • GDPR: AWS Data Processing Agreement and GDPR-compliant data handling

Data Processing Flows

Student Viva Submission

  1. Student records audio/video in browser (WebRTC)
  2. Browser uploads to AWS S3 via signed URL
  3. Background worker picks up transcription job from Redis queue
  4. Audio sent to Azure Speech UK South for transcription
  5. Transcript stored in PostgreSQL (RDS)
  6. Student session marked as complete
  7. Instructor notified (if enabled)
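
Steps 3 to 6 of the flow above can be sketched as a simple worker loop. This Python illustration uses an in-memory `deque` in place of the Redis queue, plain dictionaries in place of PostgreSQL, and a caller-supplied `transcribe` function in place of the Azure Speech call. The "failed" status and all names are assumptions made for the example.

```python
from collections import deque
from typing import Callable, Deque, Dict


def run_transcription_worker(
    queue: Deque[str],
    transcribe: Callable[[str], str],
    transcripts: Dict[str, str],
    statuses: Dict[str, str],
) -> int:
    """Drain the queue, storing a transcript and a status per audio key.

    Returns the number of jobs processed.
    """
    processed = 0
    while queue:
        audio_key = queue.popleft()  # step 3: pick up the next job
        try:
            transcripts[audio_key] = transcribe(audio_key)  # steps 4 and 5
            statuses[audio_key] = "complete"  # step 6
        except Exception:
            # One failed transcription must not stop the remaining jobs.
            statuses[audio_key] = "failed"
        processed += 1
    return processed
```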

Teacher Review

  1. Teacher accesses review queue via web interface
  2. Application fetches session data from PostgreSQL
  3. Audio/video files served via time-limited signed S3 URLs (15 minutes)
  4. Teacher submits grades and feedback
  5. Grade data saved to PostgreSQL
  6. Optional: grades pushed back to LMS via LTI AGS
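
The idea behind a time-limited signed URL (step 3) can be shown with a generic HMAC scheme. This Python sketch is a conceptual illustration only: it is not the AWS Signature Version 4 algorithm that S3 presigned URLs actually use, and the `sign_url` and `verify_url` helpers are hypothetical.

```python
import hashlib
import hmac
import time
from typing import Optional


def sign_url(path: str, secret: bytes, ttl_seconds: int, now: Optional[int] = None) -> str:
    """Append an expiry timestamp and an HMAC-SHA256 signature to `path`."""
    expires = (int(time.time()) if now is None else now) + ttl_seconds
    payload = f"{path}?expires={expires}"
    signature = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}&sig={signature}"


def verify_url(url: str, secret: bytes, now: Optional[int] = None) -> bool:
    """Accept the URL only if the signature matches and it has not expired."""
    payload, sep, signature = url.rpartition("&sig=")
    if not sep:
        return False
    expected = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return False
    _, sep, expires_text = payload.rpartition("?expires=")
    if not sep or not expires_text.isdigit():
        return False
    current = int(time.time()) if now is None else now
    return current <= int(expires_text)
```

With `ttl_seconds=900` this mirrors the 15-minute window used for review playback: after expiry, or if the path is altered, verification fails.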

Data Retention Worker

  1. Daily scheduled job runs via background worker
  2. Finds assignments whose due date passed more than 90 days ago
  3. Deletes audio and video files from S3
  4. Finds assignments whose due date passed more than 180 days ago
  5. Deletes transcripts and full session data from PostgreSQL
  6. Grade records retained per institutional policy
  7. All operations logged to audit system
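
The two retention thresholds can be expressed as a small date calculation. This is a Python sketch under stated assumptions: the function and action names are hypothetical, and the boundary is treated as "strictly more than 90 (or 180) days after the due date", which the page does not specify.

```python
from datetime import date, timedelta
from typing import List

MEDIA_RETENTION_DAYS = 90     # audio and video files in S3
SESSION_RETENTION_DAYS = 180  # transcripts and session data in PostgreSQL


def retention_actions(due_date: date, today: date) -> List[str]:
    """Return the deletion steps that apply to an assignment on `today`."""
    actions: List[str] = []
    if today > due_date + timedelta(days=MEDIA_RETENTION_DAYS):
        actions.append("delete_media")
    if today > due_date + timedelta(days=SESSION_RETENTION_DAYS):
        actions.append("delete_session_data")
    # Grade records are never listed here: they follow institutional policy.
    return actions
```

Because the function depends only on two dates, a daily job can call it repeatedly without side effects; running it twice on the same day yields the same actions.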

Backup & Disaster Recovery

  • Instance snapshots: Recommended daily Lightsail snapshots for institutional deployments, with retention agreed per institution
  • S3 durability: High durability with redundancy across multiple availability zones (AWS service feature)
  • RPO/RTO: Defined per deployment based on snapshot configuration and tested restore procedures

Monitoring & Logging

Application Logging

  • Application logs: Operational logs are used for troubleshooting and incident response
  • Audit Logs: Sensitive actions are logged to PostgreSQL and are accessible to administrators
  • Worker Logs: Background job execution and errors logged
  • Access logs: Web server/application access logging as configured for the deployment

Metrics & Health Monitoring

  • Instance monitoring: CPU, memory, disk, network monitoring as configured for the deployment
  • Database monitoring: Database health checks and performance observation as configured

Scalability

Current Configuration

  • Web Tier: Auto-scaling based on CPU and request load
  • Worker Tier: Separate environment scales independently
  • Database: db.t4g.micro (can be upgraded vertically)
  • Cache: cache.t4g.micro (can be upgraded vertically)

Scaling Approach

  • Horizontal Scaling: Add more web/worker instances as load increases
  • Vertical Scaling: Upgrade RDS and ElastiCache instance sizes
  • Future Enhancements: RDS read replicas, ElastiCache cluster mode, multi-region deployment

Compliance & Auditing

  • GDPR Compliance: All data processing documented, data processing agreements in place
  • Data Residency: UK-only data storage and processing for UK institutions
  • Audit Trails: Comprehensive logging of data access, exports, deletions
  • Subject Access Requests: Data export functionality for GDPR compliance
  • Right to Erasure: Account deletion with data purge capabilities

Important Note

This documentation reflects the current technical architecture. VivaEdu reserves the right to make infrastructure changes as needed for performance, security, or compliance reasons. Material changes will be communicated to institutional partners with advance notice.
