kargo for stage promotion

GitOps is missing something: it doesn’t define how to promote between stages such as dev -> test -> prod. Luckily Kargo exists to fill that gap.  Kargo just works, lets pipelines be greatly simplified, and it looks beautiful too.

Screenshot from a home project:

They’ll learn kubernetes on the fly

If you think that making kubernetes available and giving your developers access will result in apps being deployed to kubernetes, that might be something you need to rethink.

When folks are thinking about introducing kubernetes into the environment, instead of asking “should we go with a cloud provider or onprem?”, think about who will be deploying things into the kubernetes environment. Will developers build containers which will be deployed into the environment? If so, will they deploy those containers into kubernetes themselves, or will another team do so on their behalf?

It is best if developers have awareness of kubernetes, if that’s the environment being used; similar to how, if you have always deployed to windows but are now going to be deploying to linux, it would be good for the developers to have some awareness of linux.

Kubernetes tends to be a dream environment for developers to deploy to. They’ll love it: they will easily be able to perform deployment tasks themselves and create automation around deployments, and infrastructure as code and gitops will become a modern reality. But they need to get up to speed with kubernetes for a while before they will start to see how great it is. In the meantime they may hesitate, instead choosing to use tools they are already familiar with.

A good class will help onboard folks to kubernetes, but a bad class might make employees want to run away. (a good intro class is the CKAD class on Udemy by Mumshad Mannambeth; with a new email it should be < $30)

Kubernetes is not difficult, in the same way that writing code is not difficult, but it is a skill set. You wouldn’t hire someone without programming experience and just expect them to pick it up on the fly, would you?

Be careful that you don’t end up in the situation where your kubernetes administrators are the only folks who know how to deploy to kubernetes.

go http.Server example: clean exit with pod delete

	// wait group to ensure a clean exit
	var wg sync.WaitGroup

	log.Println("start listener")

	// wrap entire mux with logging, bearer token, and apiKey middleware
	wrappedMux := NewEnsureAuth(mux)

	// run via a goroutine
	wg.Add(1)
	go func() {
		defer wg.Done()

		// create a 'server' so we can later use 'shutdown'
		srv := &http.Server{
			Addr:    ":8080",
			Handler: wrappedMux,
		}

		// handle SIGINT / SIGTERM (kubernetes sends SIGTERM when the pod is deleted)
		go func() {
			sigint := make(chan os.Signal, 1) // buffered, as required by signal.Notify
			signal.Notify(sigint, os.Interrupt, syscall.SIGTERM)
			<-sigint

			log.Println("catch SIGTERM")

			// give in-flight requests up to 20 seconds to finish during shutdown
			sigctx, sigcancel := context.WithTimeout(context.Background(), 20*time.Second)
			defer func() {
				// extra handling here
				sigcancel()
			}()

			// stop listening, finish calls in progress, then exit
			if err := srv.Shutdown(sigctx); err != nil {
				log.Fatalf("- shutdown handler: %+v", err)
			}
			log.Print("- shutdown handler exited")
		}()

		// begin handling connections
		if err := srv.ListenAndServe(); err != nil && err != http.ErrServerClosed {
			log.Fatalf("listen: %s\n", err)
		}
	}()

	wg.Wait()

Architecting scalable Kubernetes applications

Bottleneck

  • the database used will be the bottleneck in this design; however, databases can scale, so as requirements lead to billions of rows being affected on a regular basis it may be time to scale the database. Of course you may want to address this sooner rather than later

Use cases

  • more than one oidc provider (multi-tenant)
  • one entity per tenant
    • identify by email
  • all dates & times are in UTC
  • if using sql use INDEX where appropriate
  • implements one server, e.g. ‘settings’ refers to one server; if more than one server is needed, deploy a second server via k8s, similar to how you would deploy a wordpress instance for each website rather than a single wordpress instance to host all websites
  • (optional) running the application on more than one kubernetes cluster, such as an onprem k8s instance as well as a cloud k8s instance, is mentioned though not covered in detail in this article
    • it can be done using a single endpoint, such as an f5, set up to load balance between the implementations in the two or more different k8s instances
    • additionally, the database servers and messaging queue servers involved would need to be set up to safely replicate between the two or more different k8s instances; or the database servers & messaging queue servers could live outside of the k8s implementations (perhaps provided by a third party), making them available to all the k8s implementations

Tables

  • entity
    • id (primary_key)
    • tenant (oidc provider: e.g. keycloak, gmail, ldap, etc…)
    • account (unique identifier across all tenants, e.g. email)
  • entity_persona (optional)
    • id (primary_key)
    • entityId
    • source (pc, mobile app, api access)
    • other (e.g. it may be useful to have a different profile pic based on source, or if there are multiple clients per user with api access such as stock trading bots with different trading algorithms owned by the same user)
  • auth (in response to an oidc login, or api token request)
    • id (primary_key)
    • entityId
    • granted (timestamp the token was generated)
    • seconds (value related to bearer token expiry, or in the case of an api token, value of ‘0’ to indicate the token never expires or similar expiry values)
    • state (‘active’, ‘revoked’)
    • token
    • persona (if desired)
  • auth_claims (in response to an oidc login)
    • id (primary_key)
    • authId
    • name
    • value
  • auth_groups (in response to an oidc login)
    • id (primary_key)
    • authId
    • name
  • settings (server settings, requires ‘admin’ access to set, but can be read by all)
    • id (primary_key)
    • name
    • value
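
For reference, here is a minimal sketch of how these tables might map to Go structs; the field names and types are assumptions based on the list above, not a finished schema (entity_persona, auth_claims and auth_groups would follow the same pattern):

// sketch only: field names/types are assumptions, not a schema definition
package model

import "time"

type Entity struct {
	ID      int64  // primary key
	Tenant  string // oidc provider: e.g. keycloak, gmail, ldap, etc...
	Account string // unique identifier across all tenants, e.g. email
}

type Auth struct {
	ID       int64
	EntityID int64
	Granted  time.Time // timestamp the token was generated
	Seconds  int64     // bearer token expiry; 0 = an api token which never expires
	State    string    // 'active' or 'revoked'
	Token    string
	Persona  string // optional
}

type Setting struct {
	ID    int64 // primary key
	Name  string
	Value string
}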

REST API

Overview

All endpoints expect an apiKey token except /login (optional) and /token.  To acquire an apiKey token:

  1. Perform a GET /token or POST /token, which will trigger an oidc workflow; upon success you will receive an apiKey token.

Optional: If desired, all endpoints can be configured to accept either a bearer token or an apiKey token; however, as would be expected, this requires an extra check on each endpoint access.

Swagger

  • implement swagger auto-generation via code comments in order to make consuming the api easy for developers

Endpoints

All REST APIs are implemented on port 80 without TLS; TLS is instead provided via a Kubernetes Ingress or a Kubernetes-based Mutual TLS (mTLS) solution (optional).

  • /login [GET] (begins the oidc process to login, upon success you get a bearer token & possibly a refresh token) (optional)
  • /token [GET / POST / PUT / DELETE] (begins the oidc process to login, upon success you get an apiKey token, or revoke/delete an existing apiKey token)
  • /callback (after oidc workflow completes this is the callback url used by the oidc provider)
  • /settings [GET / POST / PUT / DELETE] (server global-level settings, requires admin access to add/update/delete)
  • /ws (websocket for event listening, so that clients do not need to poll for updates)
  • /… additional app specific endpoints
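
A minimal sketch of how these endpoints might be registered in Go, in the spirit of the http.Server example earlier in this article; the handler bodies are placeholders and the auth middleware is reduced to a stub so the sketch compiles:

package main

import (
	"log"
	"net/http"
)

// placeholder handler; real handlers would implement the oidc / settings / ws logic
func todo(w http.ResponseWriter, r *http.Request) {
	http.Error(w, "not implemented", http.StatusNotImplemented)
}

func main() {
	mux := http.NewServeMux()

	// endpoints reachable without an apiKey token
	mux.HandleFunc("/login", todo)    // optional, begins the oidc login flow
	mux.HandleFunc("/token", todo)    // GET/POST/PUT/DELETE for apiKey tokens
	mux.HandleFunc("/callback", todo) // callback url used by the oidc provider

	// endpoints protected by the auth middleware
	mux.HandleFunc("/settings", todo) // admin access required for add/update/delete
	mux.HandleFunc("/ws", todo)       // websocket used for event notifications

	// stand-in for the NewEnsureAuth-style middleware: token checks would go here
	wrapped := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		mux.ServeHTTP(w, r)
	})

	// plain http on the container port; TLS is terminated at the kubernetes ingress
	log.Fatal(http.ListenAndServe(":8080", wrapped))
}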

Kubernetes Scaling

Overview

  • Use an Ingress or LoadBalancer to allow one or more “gateway/api” servers to handle a REST API call
  • Within the REST API implementation, use a messaging server such as RabbitMQ to add the request to a messaging queue; then, if appropriate, either return immediately with a response indicating the request has been submitted, or wait for a response from an eventing queue indicating the work has been completed and return an appropriate REST API response (see the gateway sketch below).
  • Use one or more “implementation/api” servers which watch messaging queues for work to complete, perform the needed work, and provide an indication that the work has been completed.  Implement an algorithm which allows “implementation/api” servers to “checkout” work in a simple way, such that if the “implementation/api” server which “checked out” the work crashes, the work will be picked up by another “implementation/api” server (see the worker sketch below).
  • Denial Of Service (DOS) attack protection to be provided by third-party
  • api rate limiting may be desired and can be implemented at the REST API level
  • Optional: Allow “gateway/api” servers & “implementation/api” servers to live on more than one kubernetes cluster (see the optional item in ‘Use cases’ at the top of this article)

Workflow

“implementation/api” server(s)

(one or more servers, may be automatically scaled based on resource demand)

  • read from message queue(s)
  • perform work & interact with database in a thread-safe way
  • upon completion submit message to appropriate message queue(s)
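
A sketch of what such a worker could look like; it assumes RabbitMQ and the github.com/rabbitmq/amqp091-go client, and the queue name “work” is a placeholder. The “checkout” behaviour comes from manual acknowledgements: a message stays unacked while a worker holds it, so if that worker crashes RabbitMQ redelivers the message to another worker:

package main

import (
	"log"

	amqp "github.com/rabbitmq/amqp091-go"
)

func main() {
	conn, err := amqp.Dial("amqp://guest:guest@rabbitmq:5672/")
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	ch, err := conn.Channel()
	if err != nil {
		log.Fatal(err)
	}
	defer ch.Close()

	// durable queue shared by all implementation/api servers
	if _, err := ch.QueueDeclare("work", true, false, false, false, nil); err != nil {
		log.Fatal(err)
	}

	// hand this worker only one unacked ("checked out") message at a time
	if err := ch.Qos(1, 0, false); err != nil {
		log.Fatal(err)
	}

	// autoAck=false: the message is only removed from the queue once we ack it
	msgs, err := ch.Consume("work", "", false, false, false, false, nil)
	if err != nil {
		log.Fatal(err)
	}

	for d := range msgs {
		// perform the work & interact with the database in a thread-safe way
		log.Printf("processing: %s", d.Body)

		// indicate completion (e.g. publish to an eventing queue), then ack
		d.Ack(false)
	}
}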

“gateway/api” server(s)

(one or more servers, may be automatically scaled based on resource demand)

  • implements REST API methods
  • submit(s) message(s) to appropriate message queue(s)
  • if needed, watches message queue(s) for a response to request(s)
  • returns appropriate response to REST API methods
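
And a sketch of the “return immediately” variant of a gateway handler, again assuming amqp091-go; the /submit route and queue name are placeholders, and the wait-for-completion variant would additionally consume a reply queue keyed by a correlation id:

package main

import (
	"io"
	"log"
	"net/http"

	amqp "github.com/rabbitmq/amqp091-go"
)

// submitHandler drops the request body onto the "work" queue and answers 202 Accepted
func submitHandler(ch *amqp.Channel) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		body, err := io.ReadAll(r.Body)
		if err != nil {
			http.Error(w, "bad request", http.StatusBadRequest)
			return
		}

		// hand the work to the implementation/api servers via the queue
		err = ch.PublishWithContext(r.Context(), "", "work", false, false,
			amqp.Publishing{
				ContentType:  "application/json",
				DeliveryMode: amqp.Persistent, // survive a broker restart
				Body:         body,
			})
		if err != nil {
			http.Error(w, "queue unavailable", http.StatusServiceUnavailable)
			return
		}

		// the request has been submitted; clients can poll or listen on /ws
		w.WriteHeader(http.StatusAccepted)
	}
}

func main() {
	conn, err := amqp.Dial("amqp://guest:guest@rabbitmq:5672/")
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	ch, err := conn.Channel()
	if err != nil {
		log.Fatal(err)
	}
	if _, err := ch.QueueDeclare("work", true, false, false, false, nil); err != nil {
		log.Fatal(err)
	}

	http.Handle("/submit", submitHandler(ch))
	log.Fatal(http.ListenAndServe(":8080", nil))
}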

Recommendations

https://gateway-api.sigs.k8s.io/

(idea) ssh menu and ssh menu webapi

Single linux system:
When you are using linux and need to ssh, instead of typing ssh you type ‘s’, which displays a menu of past ssh connections you can connect to, sorted by name or by most recently used, with the ability to adjust the sort, edit, delete, etc.

Multiple linux systems:
I’m not the first to have this idea; there are a few sshmenu programs out there. But what if we were to create a webapi (and host it via kubernetes of course), with OIDC enabled? Then on any linux system when you type ‘s’ it would ask for the sshmenu server (or use a default server from a config file), pop up a web browser for you to login, obtain an apikey, and use that apikey from that point on to display your history of ssh connections on all your linux systems (keeping the sshmenu in sync on all of them). Could have a mobile app and web browser app as well.

Enterprise ready:
Using group claims it would be possible to have admin users and regular users of course, but a user could also be added to a group and then be able to see all of that group’s shared ssh connections, so a team could share out all their common ssh connections.

(gitops) argocd phoenix configuration: clusterapi with vcluster provider

Standardized git repo layouts help to keep deployments consistent and clean:

k-argocd
- /appofapps/clusters/application.yaml
- /apps
  - /argocd-seed/update.sh
  - /argocd/applicationset.yaml
  - /clusterapi/applicationset.yaml
  - /daytwo/applicationset.yaml
- /projects
  - /addons.yaml
  - /developer.yaml
  - /devsecops.yaml

k-argocd-addons
- /apps
  - /adcs-issuer-system/applicationset.yaml
  - /adcs-issuer-system/base/Chart.yaml
  - /cert-manager/applicationset.yaml
  - /external-dns/applicationset.yaml
  - /external-dns-root/applicationset.yaml
  - /fluent-bit/applicationset.yaml
  - /kasten/applicationset.yaml
  - /nginx-ingress/applicationset.yaml
  - /metrics-server/applicationset.yaml
  - /pinniped-concierge/applicationset.yaml
  - /prometheus/applicationset.yaml

k-argocd-clusters
- /clusters
  - /vc-non.yaml
  - /vc-prod.yaml

k-vc-non
- /appofapps
  - /namespaces/application.yaml
- /apps
  - /example/applicationset.yaml
  - /example/base/Chart.yaml
- /namespaces
  - /example/namespace.yaml
  - /example/resourcequota.yaml
  - /example/servicemesh.yaml

k-vc-prod
- /appofapps
  - /namespaces/application.yaml
- /apps
  - /example/applicationset.yaml
  - /example/base/Chart.yaml
- /namespaces
  - /example/namespace.yaml
  - /example/resourcequota.yaml
  - /example/servicemesh.yaml

daytwo automates several steps needed when first deploying clusters:

  • registers the cluster with argocd, and adds an annotation allowing applications to target the cluster by name
  • copies labels from the cluster yaml to the argocd secret, useful for deploying addons
  • generates a pinniped kubeconfig, allowing initial access without needing the admin kubeconfig
  • registers the cluster as a kasten secondary cluster (if kasten is being used)

Scripts / pipelines are needed to:

  • provision / decommission a cluster
    • adjust cluster resources
  • add / remove a namespace
    • adjust namespace resource quota
    • grant developers access to namespaces

(idea) restarting a pod via webapi

At the end of a pipeline it can be nice to restart a pod.  In production this occurs automatically via gitops, but in dev it can be nice to not have to wait for git to sync.  Though there are multiple ways to do this, common ones are:

  1. just put the cluster kubeconfig in a secret in the pipeline and use that to restart the pod (a little too powerful)
  2. create a service account, acquire its token and place in pipeline (safest, but takes some work for each app)

What about a solution similar to reloader?  Reloader watches for changes in configmaps and secrets which have a certain annotation, and if they change reloader restarts things.  We could just use reloader & make a change to a configmap in order to trigger the reload.

However, what about creating a controller which listens for a webapi call asking for a deployment to be restarted?  Then a pipeline could call the appropriate url to get things restarted.  By deploying via argocd using an applicationset, and using a url convention based on the cluster name, all development clusters could be enabled to use this method in their pipelines; consumers would only need to annotate their deployments/statefulsets/etc …
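
As a sketch of the restart itself, here is roughly what such a controller could do once it receives the webapi call, using client-go to patch the deployment the same way kubectl rollout restart does; the namespace and deployment names are placeholders:

package main

import (
	"context"
	"encoding/json"
	"log"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// restartDeployment bumps the restartedAt annotation on the pod template,
// which triggers a normal rolling restart of the deployment
func restartDeployment(ctx context.Context, cs *kubernetes.Clientset, namespace, name string) error {
	patch := map[string]interface{}{
		"spec": map[string]interface{}{
			"template": map[string]interface{}{
				"metadata": map[string]interface{}{
					"annotations": map[string]string{
						"kubectl.kubernetes.io/restartedAt": time.Now().Format(time.RFC3339),
					},
				},
			},
		},
	}
	data, err := json.Marshal(patch)
	if err != nil {
		return err
	}
	_, err = cs.AppsV1().Deployments(namespace).Patch(
		ctx, name, types.StrategicMergePatchType, data, metav1.PatchOptions{})
	return err
}

func main() {
	// running inside the cluster, so use the controller's own service account
	cfg, err := rest.InClusterConfig()
	if err != nil {
		log.Fatal(err)
	}
	cs, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}

	// placeholder values; the real controller would take these from the webapi call
	// and check for the opt-in annotation on the deployment first
	if err := restartDeployment(context.Background(), cs, "dev", "example"); err != nil {
		log.Fatal(err)
	}
	log.Println("restart requested")
}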

(bash) way to implement an idle processing loop

Bash is what it is, often a quick solution to get something done; once things start to become too advanced you should probably be writing in a higher level language.

With that said, here’s a way to use bash and the /proc filesystem to implement an idle processing loop:

#!/bin/bash

# Start tail command in a separate process and track PID so we can take action later
# (send output to this shell's stdout)
tail -f /var/log/* > /proc/$$/fd/1 &
PID_TAIL=$!

# Idle processing loop
while true; do

  # Perform work here; when some condition of interest is met, clean up and exit
  if <test condition>; then
  
    # Stop the tail process
    disown $PID_TAIL;
    kill $PID_TAIL;

    # Exit idle processing loop
    break;
  fi

  # avoid maxing the cpu
  sleep 1

done

The trick here is that ‘$$’ is the current shell’s PID, and /proc/$$/fd/1 is the file descriptor which represents the shell’s stdout. By redirecting the tail command to this device and running it as a background process with ‘&’ we still see the tail output on our console. Yet the bash script is actually in a loop, able to do whatever it wants while the tail is running. When the script sees some condition it cares about it can stop the tail process and exit, or just keep working until someone presses Ctrl-c.

For more information see: https://www.xmodulo.com/tcp-udp-socket-bash-shell.html

flutter webapp, securely calling a backend

Just thinking out loud,

Since a flutter webapp runs entirely in the client browser, it is not possible to safely access a backend which requires credentials using some commonly used methods.

  • Loading credentials via environment variables, the way containers commonly do, isn’t safe because the .env file containing the environment variables can be browsed directly. https://github.com/java-james/flutter_dotenv/issues/74
  • Even if you are able to somehow get the credentials into the app, if they are credentials you don’t want the user to know, they can be exposed via dev tools … as everything is living in the client browser.

So how to connect to a web service backend from flutter?

You have to use an in-between backend, here are some options:

  • Implement a webapi which has methods created just for the flutter app
  • Implement a webapi with the intention of just passing the request along to the backend and adding a header with the needed token, while also checking the request to be sure it’s only the type of request we want to allow.
  • Create an ingress passthrough which adds an appropriate token header and then calls the backend. Careful though: does the token give the user too much access?

Note, this in-between webapi must be reachable from the client web browser, so it most likely must be protected; OIDC is a good option. Using the same OIDC parameters on both the flutter webapp and the in-between webapi will let an OIDC token gathered by the webapp be passed along to the webapi without an additional login.
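
As a sketch of the second option above, a tiny Go pass-through could add the token header server-side and forward only the requests we want to allow; the backend url, env var, and /api/quotes path below are placeholders, and protecting the proxy itself with OIDC is left out:

package main

import (
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
	"os"
	"strings"
)

func main() {
	// placeholder backend; the credential lives server-side only, never in the browser
	backend, err := url.Parse("https://backend.example.com")
	if err != nil {
		log.Fatal(err)
	}
	token := os.Getenv("BACKEND_TOKEN")

	proxy := httputil.NewSingleHostReverseProxy(backend)
	director := proxy.Director
	proxy.Director = func(r *http.Request) {
		director(r)
		// add the credential the flutter webapp must never see
		r.Header.Set("Authorization", "Bearer "+token)
	}

	http.HandleFunc("/api/", func(w http.ResponseWriter, r *http.Request) {
		// only pass along the type of request we want to allow (placeholder rule)
		if r.Method != http.MethodGet || !strings.HasPrefix(r.URL.Path, "/api/quotes") {
			http.Error(w, "forbidden", http.StatusForbidden)
			return
		}
		proxy.ServeHTTP(w, r)
	})

	log.Fatal(http.ListenAndServe(":8080", nil))
}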

daytwo is almost ready for beta testing (argocd-daytwo)