Architecting scalable Kubernetes applications


  • The database will be the bottleneck in this design. Databases can scale, though, so once requirements mean billions of rows are affected on a regular basis it may be time to scale the database; you may want to address this sooner rather than later.

Use cases

  • more than one oidc provider (multi-tenant)
  • one entity per tenant
    • identify by email
  • all dates & times are in UTC
  • if using SQL, add an INDEX on columns that are frequently used in lookups
  • implements one server, e.g. ‘settings’ refers to one server; if more than one server is needed, deploy a second server via k8s, similar to how you would deploy a WordPress instance for each website rather than a single WordPress instance to host all websites
  • (optional) running an application on more than one Kubernetes cluster, such as an on-prem k8s cluster as well as a cloud k8s cluster, is mentioned in this article though not in detail. It can be done using a single endpoint, such as an F5 load balancer, which is set up to load-balance between the deployments in the two or more k8s clusters. The database servers and messaging queue servers would then need to be set up to replicate safely between those clusters, or they could live outside of the k8s clusters (perhaps provided by a third party), making them available to all of them.


  • entity
    • id (primary_key)
    • tenant (oidc provider: e.g. keycloak, gmail, ldap, etc…)
    • account (unique identifier across all tenants, e.g. email)
  • entity_persona (optional)
    • id (primary_key)
    • entityId
    • source (pc, mobile app, api access)
    • other (e.g. it may be useful to have a different profile pic based on source, or if there are multiple clients per user with api access such as stock trading bots with different trading algorithms owned by the same user)
  • auth (in response to an oidc login, or api token request)
    • id (primary_key)
    • entityId
    • granted (timestamp the token was generated)
    • seconds (lifetime of the token in seconds, matching the bearer token expiry; for an api token, either ‘0’ to indicate the token never expires or an expiry value as for a bearer token)
    • state (‘active’, ‘revoked’)
    • token
    • persona (if desired)
  • auth_claims (in response to an oidc login)
    • id (primary_key)
    • authId
    • name
    • value
  • auth_groups (in response to an oidc login)
    • id (primary_key)
    • authId
    • name
  • settings (server settings, requires ‘admin’ access to set, but can be read by all)
    • id (primary_key)
    • name
    • value
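
The data model above can be sketched in SQL. What follows is a minimal sketch, not a definitive schema: it uses SQLite through Python's standard library, covers only the entity and auth tables (the others follow the same pattern), and the column types, the index name auth_entity_idx and the helper token_is_active are assumptions made for illustration. Timestamps are stored as ISO 8601 strings in UTC, and a value of 0 for seconds is treated as "never expires".

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Assumed column types; the article only names the columns.
SCHEMA = """
CREATE TABLE entity (
    id      INTEGER PRIMARY KEY,
    tenant  TEXT NOT NULL,           -- oidc provider
    account TEXT NOT NULL UNIQUE     -- e.g. email, unique across all tenants
);
CREATE TABLE auth (
    id       INTEGER PRIMARY KEY,
    entityId INTEGER NOT NULL REFERENCES entity(id),
    granted  TEXT NOT NULL,          -- ISO 8601 timestamp in UTC
    seconds  INTEGER NOT NULL,       -- 0 means the token never expires
    state    TEXT NOT NULL CHECK (state IN ('active', 'revoked')),
    token    TEXT NOT NULL UNIQUE    -- UNIQUE also gives an index for lookups
);
CREATE INDEX auth_entity_idx ON auth(entityId);
"""


def token_is_active(conn, token, now=None):
    """Return True if the token exists, is not revoked and has not expired."""
    now = now or datetime.now(timezone.utc)
    row = conn.execute(
        "SELECT granted, seconds, state FROM auth WHERE token = ?", (token,)
    ).fetchone()
    if row is None or row[2] != "active":
        return False
    granted, seconds = datetime.fromisoformat(row[0]), row[1]
    if seconds == 0:
        return True
    return now < granted + timedelta(seconds=seconds)
```

Passing the current time in as a parameter keeps the expiry check easy to test; a real server would likely run the same check inside whatever database layer it already uses.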



All endpoints expect an apiKey token except /login (optional) and /token.  To acquire an apiKey token:

  1. Perform a GET /token or POST /token, which starts an oidc workflow. Upon success you will receive an apiKey token.

Optional:  If desired, all endpoints can be configured to accept either a bearer token or an apiKey token. This requires an extra check on each endpoint access.
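
The access rule above might look roughly like this in code. This is only a sketch under assumptions the article does not state: the apiKey token is assumed to arrive in an X-Api-Key header and the bearer token in a standard Authorization: Bearer header, and the function name authorize and the two sets of valid tokens are hypothetical stand-ins for a lookup against the auth table.

```python
# Paths the article lists as not requiring an apiKey token.
PUBLIC_PATHS = {"/login", "/token"}


def authorize(path, headers, valid_api_keys, valid_bearer_tokens, allow_bearer=False):
    """Decide whether a request may proceed.

    headers is a plain dict; header names are assumptions for this sketch.
    """
    if path in PUBLIC_PATHS:
        return True
    api_key = headers.get("X-Api-Key")
    if api_key is not None:
        return api_key in valid_api_keys
    if allow_bearer:
        # The "extra check": fall back to a bearer token when no apiKey is sent.
        scheme, _, value = headers.get("Authorization", "").partition(" ")
        return scheme.lower() == "bearer" and value in valid_bearer_tokens
    return False
```

With allow_bearer left off, only the apiKey path is evaluated, which is the cheaper default described above.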


  • implement Swagger auto-generation via code comments to make the api easy for developers to consume

REST API

All REST APIs are served on port 80 without TLS. TLS is instead provided by a Kubernetes Ingress or, optionally, a Kubernetes-based mutual TLS (mTLS) solution.

  • /login [GET] (begins the oidc login process; upon success you get a bearer token & possibly a refresh token) (optional)
  • /token [GET / POST / PUT / DELETE] (begins the oidc login process and, upon success, returns an apiKey token; also used to revoke/delete an existing apiKey token)
  • /callback (after oidc workflow completes this is the callback url used by the oidc provider)
  • /settings [GET / POST / PUT / DELETE] (server global-level settings, requires admin access to add/update/delete)
  • /ws (websocket for event listening, so that clients do not need to poll for updates)
  • /… additional app specific endpoints

Kubernetes Scaling


  • Use an Ingress or LoadBalancer to allow one or more “gateway/api” servers to handle a REST API call
  • Within the REST API implementation use a messaging server such as RabbitMQ to add the request to a messaging queue, then if appropriate either return immediately with a response that the request has been submitted, or wait for a response from an eventing queue indicating the work has been completed and return an appropriate REST API response.
  • Use one or more “implementation/api” servers which watch messaging queues for work, perform the needed work & provide an indication that the work has been completed.  Implement a simple algorithm which allows an “implementation/api” server to “check out” work, such that if the server which checked out the work crashes, the work will be picked up by another “implementation/api” server.
  • Denial of Service (DoS) attack protection is to be provided by a third party
  • api rate limiting may be desired and can be implemented at the REST API level
  • Optional: Allow “gateway/api” servers & “implementation/api” servers to live on more than one Kubernetes cluster (see the optional item in ‘Use cases’ at the top of this article)
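
For the rate limiting mentioned in the list above, one common option is a token bucket. The article does not prescribe an algorithm, so treat this as one possible sketch: the class name TokenBucket and its parameters are made up for illustration, and in practice you would probably keep one bucket per apiKey token.

```python
import time


class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock            # injectable so the limiter can be tested
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self):
        """Return True and consume one token if the request may proceed."""
        now = self.clock()
        elapsed = now - self.updated
        # Refill for the time elapsed, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A “gateway/api” server could call allow() before queuing a request and answer with HTTP 429 when it returns False. Note that a bucket held in memory only limits the one server it lives on; with several gateway servers the counters would need to be shared.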


“implementation/api” server(s)

(one or more servers, may be automatically scaled based on resource demand)

  • read from message queue(s)
  • perform work & interact with database in a thread-safe way
  • upon completion submit message to appropriate message queue(s)
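
The “check out” idea for “implementation/api” servers can be illustrated with a lease: a server holds a job only for a limited time, and if it neither completes the job nor is still alive when the lease runs out, the job goes back to the queue. The sketch below is an in-memory stand-in written only to show the logic; the class LeaseQueue and its method names are hypothetical and it is not a replacement for a real messaging server.

```python
import time
import uuid


class LeaseQueue:
    def __init__(self, lease_seconds, clock=time.monotonic):
        self.lease_seconds = lease_seconds
        self.clock = clock
        self.pending = []   # job ids waiting for a worker, oldest first
        self.jobs = {}      # job id -> payload
        self.leases = {}    # job id -> (worker name, lease deadline)

    def submit(self, payload):
        job_id = str(uuid.uuid4())
        self.jobs[job_id] = payload
        self.pending.append(job_id)
        return job_id

    def checkout(self, worker):
        """Hand the next job to `worker`, or return None if there is no work."""
        now = self.clock()
        # Jobs whose lease has expired (e.g. the worker crashed) become pending again.
        for job_id, (_, deadline) in list(self.leases.items()):
            if deadline <= now:
                del self.leases[job_id]
                self.pending.append(job_id)
        if not self.pending:
            return None
        job_id = self.pending.pop(0)
        self.leases[job_id] = (worker, now + self.lease_seconds)
        return job_id, self.jobs[job_id]

    def complete(self, worker, job_id):
        """Mark a job done; fails if the worker no longer holds the lease."""
        lease = self.leases.get(job_id)
        if lease is None or lease[0] != worker:
            return False
        del self.leases[job_id]
        del self.jobs[job_id]
        return True
```

With RabbitMQ much of this comes for free when consumers use manual acknowledgements: a message that was delivered but never acknowledged is requeued once the consumer's connection or channel closes. Either way a job may be delivered more than once, so the work itself should be safe to repeat.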

“gateway/api” server(s)

(one or more servers, may be automatically scaled based on resource demand)

  • implements REST API methods
  • submit(s) message(s) to appropriate message queue(s)
  • if needed, watches message queue(s) for a response to request(s)
  • returns appropriate response to REST API methods
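
The two response styles of a “gateway/api” server (return immediately, or wait for the result) can be sketched with a correlation id that ties a reply to its request. Here Python's in-process queue module stands in for the messaging server, and the names work_queue, worker and handle_request, as well as the choice of status codes, are assumptions for this sketch rather than part of the design above.

```python
import queue
import threading
import uuid

work_queue = queue.Queue()     # stands in for the work message queue
replies = {}                   # correlation id -> queue holding the one reply
replies_lock = threading.Lock()


def worker():
    """Toy "implementation/api" server: upper-cases the body and replies."""
    while True:
        msg = work_queue.get()
        if msg is None:        # sentinel used to stop the worker
            break
        result = {"id": msg["id"], "body": msg["body"].upper()}
        with replies_lock:
            reply_q = replies.get(msg["id"])
        if reply_q is not None:   # nobody waits for fire-and-forget requests
            reply_q.put(result)


def handle_request(body, wait=True, timeout=5.0):
    """Toy REST handler: returns (status code, response body)."""
    correlation_id = str(uuid.uuid4())
    reply_q = queue.Queue(maxsize=1)
    if wait:
        with replies_lock:
            replies[correlation_id] = reply_q
    work_queue.put({"id": correlation_id, "body": body})
    if not wait:
        return 202, {"id": correlation_id}      # accepted, work is queued
    try:
        return 200, reply_q.get(timeout=timeout)
    except queue.Empty:
        return 504, {"id": correlation_id}      # no worker answered in time
    finally:
        with replies_lock:
            replies.pop(correlation_id, None)
```

To try it, start worker in a background thread and call handle_request("hello"); with wait=False the caller gets the id back right away and could learn about completion later, for example over /ws.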


