Architecting scalable Kubernetes applications


  • the database used will be the bottleneck in this design, however, databases can scale, so as requirements lead to billions of rows being effected on a regular basis it may be time to scale, of course you may want to address this sooner than later

Use cases

  • more than one oidc provider (multi-tenant)
  • one entity per tenant
    • identify by email
  • all dates & times are in UTC
  • if using sql use INDEX where appropriate
  • implements one server, e.g. ‘settings’ refers to one server, if more than one server is needed deploy a second server via k8s, similar to how you would deploy a wordpress instance for each website rather than a single wordpress instance to host all websites
  • (optional) implementing an application to run on more than one kubernetes cluster such as an onprem k8s instance as well as in a cloud k8s instance is mentioned though not in detail in this article; it can be done using a single endpoint such as f5 which is setup to loadbalance between the implementations in the two or more different k8s instances, additionally the database servers involved and messaging queue servers would need to be setup in such a way as to safely replicate between the two or more different k8s instances; or the database servers & messaging queue servers could live outside of the k8s implementations (perhaps provided by a third party) making them available to all the k8s implementations


  • entity
    • id (primary_key)
    • tenant (oidc provider: e.g. keycloak, gmail, ldap, etc…)
    • account (unique identifier across all tenants, e.g. email)
  • entity_persona (optional)
    • id (primary_key)
    • entityId
    • source (pc, mobile app, api access)
    • other (e.g. it may be useful to have a different profile pic based on source, or if there are multiple clients per user with api access such as stock trading bots with different trading algorithms owned by the same user)
  • auth (in response to an oidc login, or api token request)
    • id (primary_key)
    • entityId
    • granted (timestamp the token was generated)
    • seconds (value related to bearer token expiry, or in the case of an api token, value of ‘0’ to indicate the token never expires or similar expiry values)
    • state (‘active’, ‘revoked’)
    • token
    • persona (if desired)
  • auth_claims (in response to an oidc login)
    • id (primary_key)
    • authId
    • name
    • value
  • auth_groups (in response to an oidc login)
    • id (primary_key)
    • authId
    • name
  • settings (server settings, requires ‘admin’ access to set, but can be read by all)
    • id (primary_key)
    • name
    • value



All endpoints expect an apiKey token except /login (optional) and /token.  To acquire an apiKey token:

  1. Perform a GET /token or POST /token which will result in an oidc workflow, upon success you will receive an apiKey token.

Optional:  If desired, all endpoints can be configured to handle either a bearer token or an apiKey token, however, as would be expected, this will require an extra check with each endpoint access.


  • implement swagger auto-generation via code comments in order to make consuming the api easy for developers

Rest Api

All REST APIs implemented using port 80 without TLS, instead TLS is provided via a Kubernetes Ingress or Kubernetes-based Mutual TLS (mTLS) solution (optional).

  • /login [GET] (begins the oidc process to login, upon success you get a bearer token & possibly a refresh token) (optional)
  • /token [GET / POST / PUT / DELETE] (begins the oidc process to login, upon success you get an apiKey token, or revoke/delete an existing apiKey token)
  • /callback (after oidc workflow completes this is the callback url used by the oidc provider)
  • /settings [GET / POST / PUT / DELETE] (server global-level settings, requires admin access to add/update/delete)
  • /ws (websocket for event listening, so that clients do not need to poll for updates)
  • /… additional app specific endpoints

Kubernetes Scaling


  • Use an Ingress or LoadBalancer to allow one or more “gateway/api” servers to handle a REST API call
  • Within the REST API implementation use a messaging server such as RabbitMQ to add the request to a messaging queue, then if appropriate either return immediately with a response that the request has been submitted, or wait for a response from an eventing queue indicating the work has been completed and return an appropriate REST API response.
  • Use one or more “implementation/api” servers which watch messaging queues for work to complete, perform the needed work & provide an indication that the work has been completed.  Implement an algorithm which allows “implementation/api” servers to “checkout” work to complete in a simple way such that if the “implementation/api” server which “checkedout” the work crashes, that the work will be picked up by another “implementation/api” server.
  • Denial Of Service (DOS) attack protection to be provided by third-party
  • api rate limit access may be desired and implemented at the REST API level
  • Optional: Allow “gateway/api” servers & “implementation/api” servers to live on more than one kubernetes cluster (see optional section in ‘Use cases’ at this top of this article)


“implementation/api” server(s)

(one or more servers, may be automatically scaled based on resource demand)

  • read from message queue(s)
  • perform work & interact with database in a thread-safe way
  • upon completion submit message to appropriate message queue(s)

“gateway/api” server(s)

(one or more servers, may be automatically scaled based on resource demand)

  • implements REST API methods
  • submit(s) message(s) to appropriate message queue(s)
  • if needed, watches message queue(s) for a response to request(s)
  • returns appropriate response to REST API methods


(bash) script to fully clean up a kubernetes app

When experimenting with something like ceph, installing it, making changes, uninstalling and reinstalling … you will find that more advanced apps tend to implement finalizers making a full uninstall rather challenging.  More complex apps tend to have an uninstaller script for just this reason.  When lacking such a script though, here is a generic script which can take care of a lot, or all of the clean up work:

** Note, you are entering a danger zone. **


if [ "$1" == "" ]
  echo "Syntax:"
  echo ""
  echo "$0 <searchstr>"

  exit 1

CRDS=`kubectl get crd --no-headers | grep $1 | cut -d ' ' -f 1`

array_crds=( $CRDS )
for crd in "${array_crds[@]}"
  echo ""
  echo $crd
  RESOURCES=`kubectl get $crd --no-headers | grep $1 | cut -d ' ' -f 1`
  array_resources=( $RESOURCES )
  for next in "${array_resources[@]}"
    echo $next
    kubectl patch $crd $next -p '{"metadata":{"finalizers":null}}' --type=merge
    kubectl delete $crd $next

(gitops) argocd phoenix configuration: clusterapi with vcluster provider

Standardized git repo layouts helps to keep deployments consistent and clean:

- /appofapps/clusters/application.yaml
- /apps
  - /argocd-seed/
  - /argocd/applicationset.yaml
  - /clusterapi/applicationset.yaml
  - /daytwo/applicationset.yaml
- /projects
  - /addons.yaml
  - /developer.yaml
  - /devsecops.yaml

- /apps
  - /adcs-issuer-system/applicationset.yaml
  - /adcs-issuer-system/base/Chart.yaml
  - /cert-manager/applicationset.yaml
  - /external-dns/applicationset.yaml
  - /external-dns-root/applicationset.yaml
  - /fluent-bit/applicationset.yaml
  - /kasten/applicationset.yaml
  - /nginx-ingress/applicationset.yaml
  - /metrics-server/applicationset.yaml
  - /pinniped-concierge/applicationset.yaml
  - /prometheus/applicationset.yaml

- /clusters
  - /vc-non.yaml
  - /vc-prod.yaml

- /appofapps
  - /namespaces/application.yaml
- /apps
  - /example/applicationset.yaml
  - /example/base/Chart.yaml
- /namespaces
  - /example/namespace.yaml
  - /example/resourcequota.yaml
  - /example/servicemesh.yaml

- /appofapps
  - /namespaces/application.yaml
- /apps
  - /example/applicationset.yaml
  - /example/base/Chart.yaml
- /namespaces
  - /example/namespace.yaml
  - /example/resourcequota.yaml
  - /example/servicemesh.yaml

daytwo automates several steps needed when first deploying clusters:

  • register cluster with argocd, also adds annotation allowing applications to target by cluster name
  • copy labels from cluster yaml to argocd secret, useful for deploying addons
  • generates pinniped kubeconfig, allows for initial access without needing admin kubeconfig
  • registers as a kasten secondary cluster, (if kasten is being used)

Scripts / pipelines are needed to:

  • provision / decommission a cluster
    • adjust cluster resources
  • add / remove a namespace
    • adjust namespace resource quota
    • grant developers access to namespaces

(idea) restarting a pod via webapi

At the end of a pipeline it can be nice to restart a pod.  In production this occurs via gitops automatically, but in dev it can be nice to not have to wait for git to sync.  Though there are multiple ways to do this common ways are:

  1. just put the cluster kubeconfig in a secret in the pipeline and use that to restart the pod (a little too powerful)
  2. create a service account, acquire its token and place in pipeline (safest, but takes some work for each app)

What about a solution similar to reloader?  Reloader watches for changes in configmaps and secrets which have a certain annotation and if a they change reloader restarts things.  We could just use reloader & make a change to a configmap in order to trigger the reload.

However, what about creating a controller which listens for a webapi call asking for a deployment to be restarted.  Then a pipeline could call the appropriate url to get things restarted.  By deploying via argocd using an applicationset and using a url convention based on the clustername then all development clusters could be enabled to used this method in their pipelines, consumers would only need to annotate their deployments/statefulset/etc …

flutter webapp, securely calling a backend

Just thinking outloud,

Since a flutter webapp is all running in the client browser, it is not possible to access a backend which requires credentials in some commonly used methods safely.

  • Loading credentials via environment variables, in the way containers commonly do, isn’t safe because the .env file containing the environment variables can be browsed directly.
  • Even if you are able to somehow get the credentials into the app, if they are credentials you don’t want the user to know, they can be exposed via dev tools … as everything is living in the client browser.

So how to connect to a web service backend from flutter?

You have to use an in-between backend, here are some options:

  • Implement a webapi which has methods created just for the flutter app
  • Implement a webapi with the intention of just passing the request along the backend and adding a header with the needed token, but also checking the request to be sure its only the type of request we want to allow.
  • Create an ingress passthrough which adds an appropriate token header and then calls the backend, careful though, does the token give the user too much access?

Note, this in-between webapi must be reachable from the client web browser, so it most likely must be protected, OIDC is a good option. Using the same OIDC parameters on both the flutter webapp and the in-between webapi will let an OIDC token gathered up via the webapp be passed along to the webapi without an additional login.

daytwo is almost ready for beta testing (argocd-daytwo)

Ability to use FQDN to access workload clusters, rather than ip address

Doing my part to try and help gitops take over the world.

By targeting applications to install on a kubernetes cluster by fqdn, rather than ip address, it becomes possible to delete a cluster and stand up a new cluster, and watch as all the applications reinstall automatically to the new cluster via gitops using something like argocd.

Feature request submitted (vcenter clusterapi provider):

If implemented, this would enable all consumers of vmware vcenter / tanzu to better implement gitops.

Tempting to drop everything and be the one to implement the code and submit a pull request, just to get the credit.  Would feel good to know you’ve helped so many people.

Creating webapps w/ flutter & deploying to kubernetes

In the past I’ve written guis in many languages, but found my passion more in backend server apis.  Along this line REST, swagger, OIDC, and websockets have been a dream come true.

Then one day I discovered flutter, created by google, it is cross platform compiling natively on both Android and iPhone.

Flutter changed everything for me, it made me love creating guis, it felt like the first real solution to creating a gui with everything else prior being trail blazer projects helping to figure out everything needed to one day lead to the creation of flutter.

Turns out it also can be used to create windows & linux apps as well, with some exceptions.  On a mobile device you get a free database with your application, so if you deploy on windows / linux you have to solve the database on your own.

Today I noticed that flutter can also be used to create a webapp.  I’m thinking to create a webapp, place it inside a container and deploy it to kubernetes along with a database into the same namespace, making for a similar situation to how mobile apps get a database.  So much potential, this is exciting.

kubernetes daytwo controllers

Thinking again about daytwo operations (controllers watching for cluster events):


  • watches for cluster.yaml (tanzu, clusterapi, etc…) and registers clusters with argocd automatically once they are in ‘ready’ state
  • syncs ‘addons’ labels from cluster.yaml to argocd cluster secrets to auto install addons, including pinniped-concierge, and pinniped-www


  • generates a pinniped kubeconfig if new cluster
  • adds to git repo if new cluster, removes from git repo if cluster decommissioned
  • regenerates configmap used by pinniped-www


  • watch for service associated with cluster to appear & annotate with fqdn, goal: to add a dns entry for each cluster kubeapi

Or perhaps, if desired, the event could trigger automation somewhere else:


  • callback to jenkins
  • callback to awx
  • callback to vmware-aria
  • etc …