// System Design Documentation
Platform Architecture Overview
Service Topology · Domains · Lifecycle · Kubernetes Infrastructure
01 — Client Layer
📱 MobileApp
iOS / Android
🖥 AdminWeb
Browser
HTTPS /api/learning/app
HTTPS /api/auth
02 — NGINX Ingress Controller
🔀 NGINX Ingress
TLS Termination · Load Balancing · Routing
Route to Services
03 — API Gateway
🔀 Gateway
Request Router + Rate Limiter
⚡ Redis
Rate Limit Cache
JWT Validate
Forward Request
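The gateway's Redis-backed rate limiting can be sketched as a fixed-window counter. This is illustrative only: a ConcurrentHashMap stands in for Redis (which would typically use INCR + EXPIRE per key), and the limit and window values are example numbers, not the platform's actual configuration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Fixed-window rate limiter. In production the counter lives in Redis
// (INCR + EXPIRE, key like "rl:{clientId}:{window}"); a map stands in here.
class RateLimiter {
    private final int limit;          // max requests per window
    private final long windowMillis;  // window length in milliseconds
    private final Map<String, AtomicInteger> counters = new ConcurrentHashMap<>();

    RateLimiter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    // Key combines client id and the current window index, mimicking a
    // Redis key that expires after one window.
    boolean allow(String clientId, long nowMillis) {
        String key = clientId + ":" + (nowMillis / windowMillis);
        int count = counters.computeIfAbsent(key, k -> new AtomicInteger())
                            .incrementAndGet();
        return count <= limit;
    }
}
```

Requests over the limit would be answered with HTTP 429 before they reach any application service.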
04 — Application Services
🔐 AuthService
Authentication
🎓 LearningService
Core Business Logic
⚙️ AdminService
Admin Panel APIs
Read / Write
Read / Write
Cache / Lock
Push
Objects
05 — Infrastructure & External Services
🗄 AuthDB
PostgreSQL
🗄 LearningDB
PostgreSQL HA
⚡ Redis HA
Cache / Lock / Idempotency
💾 Longhorn
Distributed Storage
🔔 FCM
Push Notifications
🗂 S3 Storage
Object Store
Legend: NGINX Ingress · Application Service · Infrastructure · Client / Storage · External Service
📱 App APIs
ExamFlow
Rewards
Leaderboard
Content
Billing
⚙️ Admin APIs
Ranking
Rewards
Jobs
LearningService
CORE BUSINESS LOGIC
Central service handling all learning platform operations — exams, rankings, rewards, notifications, content delivery, and billing.
🎮 Game / Exam Flow
🏆 Ranking + Publish
🎁 Rewards + Claims
🔔 Notifications
📊 Leaderboard
📚 Content / Progress
💳 Billing / Subscriptions
⚡ Cache / Lock Layer
Redis Usage
Submit Idempotency
Ranking Lock
Throttling
Leaderboard Cache
Master Data Cache
🗄 Data Stores
LearningDB (PG)
Redis
S3 Object Store
🔌 External
FCM / APNS
⏱ Scheduling
Rank Exam @ 2AM
Admin Manual Trigger
Broadcast Optimization
Mobile App
1. Start attempt → API
2. Receive attemptToken + question stream
3. Submit response per question (choiceId, elapsedMs)
4. Complete attempt → API
5. Open push notification → deeplink API call
6. Receive final screen data
Learning Service
1. Receive start → idempotency lock in Redis
2. Create IN_PROGRESS attempt in DB
3. Return attemptToken + stream questions
4. Receive complete → throttle + idempotency checks
5. Persist completion + PENDING result in DB
6. Rank exam (2AM scheduled or admin trigger)
7. Compute rank + economics + prize awards
8. Publish results → send FCM notifications
Learning DB
1. Write: Create IN_PROGRESS attempt
2. Write: Persist response + timing per question
3. Write: Persist attempt completion + PENDING result
4. Write: Compute rank + economics + prize awards
5. Write: Publish results
6. Read: Fetch target data on deeplink action
Redis
1. Start idempotency lock on attempt start
2. Submit idempotency check on complete
3. Throttle check on attempt complete
4. Ranking lock during rank computation
5. Rate-limit checks at Gateway layer
6. Leaderboard + question-pool + master data cache
FCM / APNS
1. Receive push payload from LearningService
2. Dispatch winner/result notifications
3. Route to APNS for iOS devices
4. Deliver notification to client device
5. Broadcast: no per-user persistence
6. Dispatch audit maintained separately
Redis Usage
Submit/start idempotency locks
Ranking lock during computation
Request throttling at API level
Leaderboard real-time cache
Question pool + master data caching
Gateway rate-limit checks
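The submit/start idempotency locks above follow Redis "SET key NX EX ttl" semantics: the first caller acquires the key and proceeds, duplicates inside the TTL are rejected. A minimal in-memory sketch (the map and TTL handling stand in for Redis; the key format is illustrative):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Idempotency guard mirroring Redis SETNX-with-TTL semantics for
// attempt start / complete requests.
class IdempotencyGuard {
    private final Map<String, Long> locks = new ConcurrentHashMap<>();
    private final long ttlMillis;

    IdempotencyGuard(long ttlMillis) { this.ttlMillis = ttlMillis; }

    // Returns true only for the first call with this key inside the TTL;
    // retries and duplicate submits see false and are ignored.
    synchronized boolean tryAcquire(String key, long nowMillis) {
        Long expiresAt = locks.get(key);
        if (expiresAt != null && expiresAt > nowMillis) return false; // duplicate
        locks.put(key, nowMillis + ttlMillis);
        return true;
    }
}
```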
Reward Status Flow
ASSIGNED → REQUESTED → APPROVED / REJECTED → FULFILLED
CANCELLED: for invalidated / withdrawn flows
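The status flow can be encoded as an explicit transition table, so invalid jumps (e.g. ASSIGNED straight to FULFILLED) are rejected at the service layer. Which states may move to CANCELLED is an assumption here; the sketch allows it from any pre-fulfilment state.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

// Reward status flow: ASSIGNED → REQUESTED → APPROVED / REJECTED → FULFILLED,
// with CANCELLED covering the invalidated / withdrawn path.
enum RewardStatus { ASSIGNED, REQUESTED, APPROVED, REJECTED, FULFILLED, CANCELLED }

class RewardTransitions {
    private static final Map<RewardStatus, Set<RewardStatus>> ALLOWED =
        new EnumMap<>(RewardStatus.class);
    static {
        ALLOWED.put(RewardStatus.ASSIGNED,
            EnumSet.of(RewardStatus.REQUESTED, RewardStatus.CANCELLED));
        ALLOWED.put(RewardStatus.REQUESTED,
            EnumSet.of(RewardStatus.APPROVED, RewardStatus.REJECTED, RewardStatus.CANCELLED));
        ALLOWED.put(RewardStatus.APPROVED,
            EnumSet.of(RewardStatus.FULFILLED, RewardStatus.CANCELLED));
        // REJECTED, FULFILLED, CANCELLED are terminal: no outgoing transitions.
    }

    static boolean canTransition(RewardStatus from, RewardStatus to) {
        return ALLOWED.getOrDefault(from, EnumSet.noneOf(RewardStatus.class))
                      .contains(to);
    }
}
```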
Coupon Security Model
Coupon code not exposed in result metadata
Coupon resolved only at claim flow
Surfaced only to eligible learner
Resolved server-side — never in API response until claimed
Broadcast Notification Optimization
Per-user persistence avoided for global broadcasts
Dispatch audit maintained separately
Reduces DB write amplification at scale
FCM handles fan-out to devices directly
Ranking Computation
Scheduled daily at 2AM (automatic)
Admin manual trigger available outside schedule
Redis ranking lock prevents concurrent computation
Computes rank + economics + prize awards atomically
Publishes results → triggers FCM notifications
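The rank step itself can be sketched as a sort by score with competition ranking (equal scores share a rank; the next distinct score skips ahead). The tie-handling rule is an assumption, and the economics/prize steps are omitted.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Assigns competition ranks from a map of learnerId → score.
class Ranker {
    static Map<String, Integer> rank(Map<String, Integer> scores) {
        List<Map.Entry<String, Integer>> sorted = new ArrayList<>(scores.entrySet());
        sorted.sort((a, b) -> b.getValue() - a.getValue()); // highest score first
        Map<String, Integer> ranks = new LinkedHashMap<>();
        int position = 0, rank = 0;
        Integer prev = null;
        for (Map.Entry<String, Integer> e : sorted) {
            position++;
            if (!e.getValue().equals(prev)) { rank = position; prev = e.getValue(); }
            ranks.put(e.getKey(), rank); // ties keep the earlier rank
        }
        return ranks;
    }
}
```

In the real service this runs only while the Redis ranking lock is held, so the 2AM job and an admin trigger can never compute concurrently.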
Request Security Flow
All requests carry JWT token
Gateway forwards to AuthService for introspection if needed
Redis rate-limit check at Gateway layer
Validated requests forwarded to LearningService
Idempotency enforced at service layer via Redis
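The ordering of those checks at the gateway can be made explicit: authentication is rejected before rate limiting, and only fully validated traffic is forwarded. Both checks are stubbed with predicates here; real implementations call AuthService and Redis.

```java
import java.util.function.Predicate;

// Gateway check order per the flow above: JWT first, then rate limit,
// then forward downstream.
class GatewayPipeline {
    private final Predicate<String> jwtValid;     // token → valid?
    private final Predicate<String> withinLimit;  // clientId → under limit?

    GatewayPipeline(Predicate<String> jwtValid, Predicate<String> withinLimit) {
        this.jwtValid = jwtValid;
        this.withinLimit = withinLimit;
    }

    // Returns the HTTP status the gateway would respond with,
    // or 0 to signal "forward to LearningService".
    int handle(String token, String clientId) {
        if (!jwtValid.test(token)) return 401;    // unauthenticated: reject first
        if (!withinLimit.test(clientId)) return 429; // over the Redis rate limit
        return 0;                                  // forward downstream
    }
}
```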
⎈ Deployment Scenarios
🔷 Single Worker Node
Dev / Staging
EKS Cluster
⚙️ Control Plane (Managed AWS)
kube-apiserver
etcd
scheduler
controller-manager
↕ kubelet
🖥 Worker Node × 1
nginx-ingress
gateway-svc
auth-svc
learning-svc
admin-svc
redis
postgres
longhorn-mgr
cert-manager
All pods run on the same node — no redundancy
Node failure = complete service outage
Longhorn stores PVs locally on single disk
Redis / PG single instance, no replication
Use case: dev, testing, cost-optimized staging
🟢 Multiple Worker Nodes
Production HA
EKS Cluster — HA
⚙️ Control Plane (Managed AWS Multi-AZ)
kube-apiserver
etcd ×3
scheduler
controller-manager
↕ kubelet per node
🖥 Worker Node 1
nginx-ingress
gateway-svc
auth-svc
redis-primary
longhorn-node
🖥 Worker Node 2
learning-svc
admin-svc
postgres-primary
redis-replica
longhorn-node
🖥 Worker Node 3
postgres-standby
redis-sentinel
learning-svc
longhorn-node
cert-manager
Pod anti-affinity spreads replicas across nodes
Node failure — workloads reschedule to healthy nodes
Longhorn replicates data across all 3 nodes (replica=3)
Redis Sentinel auto-failover — promotes replica on primary failure
Postgres HA — streaming replication + automatic failover
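The anti-affinity rule above could look like the following Deployment fragment; the image name is illustrative, and `requiredDuringScheduling` is one possible strictness choice (a `preferred` rule would allow co-location under node pressure).

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: learning-svc
  namespace: learneezi
spec:
  replicas: 2
  selector:
    matchLabels:
      app: learning-svc
  template:
    metadata:
      labels:
        app: learning-svc
    spec:
      affinity:
        podAntiAffinity:
          # Hard rule: no two learning-svc pods on the same node.
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: learning-svc
              topologyKey: kubernetes.io/hostname
      containers:
        - name: learning-svc
          image: learneezi/learning-svc:latest  # illustrative image name
```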
🔧 Infrastructure Components
🔀 NGINX Ingress Controller
TLS termination (cert-manager + Let's Encrypt)
Routes external HTTPS → internal ClusterIP services
AWS ELB → NGINX → Services
SSL redirect + force HTTPS enforced
Namespace: ingress-nginx
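A sketch of the Ingress resource that wires this together; the hostname, issuer name, and TLS secret name are assumptions, not actual values from this deployment.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: learneezi
  namespace: learneezi
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod   # assumed issuer name
    nginx.ingress.kubernetes.io/ssl-redirect: "true"   # force HTTPS
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - api.learneezi.ai          # illustrative hostname
      secretName: learneezi-tls     # cert-manager fills this secret
  rules:
    - host: api.learneezi.ai
      http:
        paths:
          - path: /api
            pathType: Prefix
            backend:
              service:
                name: gateway-svc   # routes API traffic to the gateway
                port:
                  number: 80
```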
💾 Longhorn Storage
Distributed block storage for Kubernetes PVs
Replication factor: 3 (across worker nodes)
Used by: PostgreSQL, Redis persistent volumes
Automatic volume snapshots + S3 backup
Web UI available via ingress
⚡ Redis HA (Sentinel)
1 Primary + 2 Replicas + 3 Sentinels
Sentinel monitors primary health continuously
Auto-promotes replica on primary failure
Spring Boot / apps connect via Sentinel endpoint
PV backed by Longhorn
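Applications reach the Sentinel endpoint through client configuration rather than the primary's address. A Spring Boot `application.yml` sketch (Spring Boot 3 property names; the master name and service addresses are illustrative):

```yaml
spring:
  data:
    redis:
      sentinel:
        master: mymaster            # monitored master set name
        nodes:                      # Sentinel endpoints, not the primary itself
          - redis-sentinel-0.redis:26379
          - redis-sentinel-1.redis:26379
          - redis-sentinel-2.redis:26379
```

The client asks Sentinel for the current primary on connect, so a failover needs no application config change.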
🗄 PostgreSQL HA
Primary + Standby with streaming replication
Automatic failover via Patroni / operator
Separate instances: AuthDB + LearningDB
Automated S3 backups (scheduled)
PV backed by Longhorn (EBS on AWS)
🎓 Learning Module
LearningService — core exam + ranking + rewards
Deployment: 2+ replicas for HA
HPA: scales on CPU/memory thresholds
Connects to LearningDB + Redis + FCM + S3
Namespace: learneezi
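The HPA behavior described above might be declared as follows; the replica ceiling and 70% CPU target are example values, not the platform's actual thresholds.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: learning-svc
  namespace: learneezi
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: learning-svc
  minReplicas: 2        # floor matches the HA baseline of 2+ replicas
  maxReplicas: 6        # illustrative ceiling
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale out above 70% average CPU
```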
🔀 Gateway Module
Request routing + JWT pass-through validation
Rate limiting via Redis
Forwards to AuthService / LearningService
Deployment: 2+ replicas
Sits behind NGINX Ingress
🔐 Authentication Module
AuthService — JWT issuance + validation
Reads/writes to AuthDB (PostgreSQL)
Stateless — horizontally scalable
Deployment: 2+ replicas
Token introspection on demand
⚙️ Admin Module
AdminService — admin panel APIs
Reward approval / rejection / fulfillment
Manual ranking trigger + exam publishing
Restricted access — internal routes only
Deployment: 1–2 replicas
✅ High Availability Summary
💾 Storage HA
Longhorn replicates every volume across 3 nodes. No single disk failure can cause data loss. Automatic failover for PVs.
⚡ Redis HA
Sentinel-based failover. Primary failure detected within seconds. Replica promoted automatically. Zero manual intervention.
🗄 PostgreSQL HA
Streaming replication to standby. Automated failover promotes standby to primary. S3 backups for point-in-time recovery.
🔀 Ingress HA
NGINX controller runs as DaemonSet or multiple replicas. AWS ELB distributes traffic. TLS auto-renewed by cert-manager.
🎓 Service HA
All application services deploy with 2+ replicas. Pod anti-affinity rules prevent co-location. HPA scales on load automatically.
☁️ Node HA (Multi-AZ)
Worker nodes spread across AWS Availability Zones. AZ failure only takes down a subset. Kubernetes reschedules within minutes.