kargo for stage promotion

GitOps is missing something: it doesn’t define how to promote between stages such as dev -> test -> prod. Luckily, Kargo exists to fill the gap.  Kargo just works, greatly simplifies pipelines, and looks beautiful too.

Screenshot from a home project:

They’ll learn kubernetes on the fly

If you think that making Kubernetes available and giving your developers access will result in apps deployed to Kubernetes, you might need to rethink that.

When folks are thinking about introducing Kubernetes into the environment, instead of asking “should we go with a cloud provider” or “onprem”, think about who will be deploying things into the Kubernetes environment. Will developers build containers which will be deployed into the environment? If so, will they deploy those containers into Kubernetes or will another team do so on their behalf?

It is best if developers have some awareness of Kubernetes when that is the environment being used. Similarly, if you have always deployed to Windows but are now going to deploy to Linux, it would be good for the developers to have some awareness of Linux.

Kubernetes tends to be a dream environment for developers to deploy to. They’ll love it: they will easily be able to perform deployments themselves and create automation around them, and infrastructure as code and GitOps will become an everyday reality. But they need time to get up to speed with Kubernetes before they start to see how great it is. In the meantime they may hesitate, choosing instead to use tools they are already familiar with.

A good class will help onboard folks to Kubernetes, but a bad class might make employees want to run away. (A good intro class is the CKAD class on Udemy by Mumshad Mannambeth; with a new email it should be < $30.)

Kubernetes is not difficult, in the same way that writing code is not difficult, but it is a skill set. You wouldn’t hire someone without programming experience and just expect them to pick it up on the fly, would you?

Be careful that you don’t end up in a situation where your Kubernetes administrators are the only folks who know how to deploy to Kubernetes.

go http.Server example: clean exit with pod delete

	// imports used below: context, log, net/http, os, os/signal, sync, syscall, time

	// wait group to ensure a clean exit
	var wg sync.WaitGroup

	log.Println("start listener")

	// wrap entire mux with logging, bearer token, and apiKey middleware
	wrappedMux := NewEnsureAuth(mux)

	// run via a goroutine
	wg.Add(1)
	go func() {
		defer wg.Done()

		// create a 'server' so we can later use 'shutdown'
		srv := &http.Server{
			Addr:    ":8080",
			Handler: wrappedMux,
		}

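		// handle SIGINT / SIGTERM (Kubernetes sends SIGTERM when the pod is deleted)
		wg.Add(1) // keep wg.Wait() blocked until Shutdown has finished
		go func() {
			defer wg.Done()

			// signal.Notify does not block when sending, so the channel must be buffered
			sigint := make(chan os.Signal, 1)
			signal.Notify(sigint, os.Interrupt, syscall.SIGTERM)
			<-sigint

			log.Println("caught SIGINT / SIGTERM")

			// cancel routine during shutdown, if needed
			sigctx, sigcancel := context.WithTimeout(context.Background(), 20*time.Second)
			defer func() {
				// extra handling here
				sigcancel()
			}()

			// stop listening, finish calls in progress, then exit
			if err := srv.Shutdown(sigctx); err != nil {
				// log.Fatalf would exit before the deferred calls above run
				log.Printf("- shutdown handler: %+v", err)
			}
			log.Print("- shutdown handler exited")
		}()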

		// begin handling connections
		if err := srv.ListenAndServe(); err != nil && err != http.ErrServerClosed {
			log.Fatalf("listen: %s\n", err)
		}
	}()

	wg.Wait()

Architecting scalable Kubernetes applications

Bottleneck

  • the database will be the bottleneck in this design. Databases can scale, however, so once requirements lead to billions of rows being affected on a regular basis it may be time to scale; of course, you may want to address this sooner rather than later

Use cases

  • more than one oidc provider (multi-tenant)
  • one entity per tenant
    • identify by email
  • all dates & times are in UTC
  • if using SQL, use an INDEX where appropriate
  • implements one server, e.g. ‘settings’ refers to one server; if more than one server is needed, deploy a second server via k8s, similar to how you would deploy a WordPress instance for each website rather than a single WordPress instance to host all websites
  • (optional) running an application on more than one Kubernetes cluster, such as an onprem k8s instance as well as a cloud k8s instance, is mentioned in this article though not in detail. It can be done using a single endpoint, such as an F5, which is set up to load balance between the implementations in the two or more k8s instances. The database servers and messaging queue servers involved would additionally need to be set up to replicate safely between the k8s instances; alternatively, they could live outside of the k8s implementations (perhaps provided by a third party), making them available to all of them

Tables

  • entity
    • id (primary_key)
    • tenant (oidc provider: e.g. keycloak, gmail, ldap, etc…)
    • account (unique identifier across all tenants, e.g. email)
  • entity_persona (optional)
    • id (primary_key)
    • entityId
    • source (pc, mobile app, api access)
    • other (e.g. it may be useful to have a different profile pic based on source, or if there are multiple clients per user with api access such as stock trading bots with different trading algorithms owned by the same user)
  • auth (in response to an oidc login, or api token request)
    • id (primary_key)
    • entityId
    • granted (timestamp the token was generated)
    • seconds (token lifetime in seconds: the bearer token expiry, or, for an api token, either ‘0’ to indicate the token never expires or a similar expiry value)
    • state (‘active’, ‘revoked’)
    • token
    • persona (if desired)
  • auth_claims (in response to an oidc login)
    • id (primary_key)
    • authId
    • name
    • value
  • auth_groups (in response to an oidc login)
    • id (primary_key)
    • authId
    • name
  • settings (server settings, requires ‘admin’ access to set, but can be read by all)
    • id (primary_key)
    • name
    • value

REST API

Overview

All endpoints expect an apiKey token except /login (optional) and /token.  To acquire an apiKey token:

  1. Perform a GET /token or POST /token, which starts an oidc workflow; upon success you will receive an apiKey token.

Optional:  If desired, all endpoints can be configured to accept either a bearer token or an apiKey token; as would be expected, this requires an extra check on each endpoint access.

Swagger

  • implement swagger auto-generation via code comments in order to make consuming the api easy for developers

REST API

All REST APIs are served on port 80 without TLS; TLS is instead provided by a Kubernetes Ingress or, optionally, a Kubernetes-based Mutual TLS (mTLS) solution.

  • /login [GET] (begins the oidc process to login, upon success you get a bearer token & possibly a refresh token) (optional)
  • /token [GET / POST / PUT / DELETE] (begins the oidc process to login, upon success you get an apiKey token, or revoke/delete an existing apiKey token)
  • /callback (after oidc workflow completes this is the callback url used by the oidc provider)
  • /settings [GET / POST / PUT / DELETE] (server global-level settings, requires admin access to add/update/delete)
  • /ws (websocket for event listening, so that clients do not need to poll for updates)
  • /… additional app specific endpoints

Kubernetes Scaling

Overview

  • Use an Ingress or LoadBalancer to allow one or more “gateway/api” servers to handle a REST API call
  • Within the REST API implementation use a messaging server such as RabbitMQ to add the request to a messaging queue, then if appropriate either return immediately with a response that the request has been submitted, or wait for a response from an eventing queue indicating the work has been completed and return an appropriate REST API response.
  • Use one or more “implementation/api” servers which watch messaging queues for work to complete, perform the needed work & provide an indication that the work has been completed.  Implement a simple algorithm which allows “implementation/api” servers to “check out” work, such that if the server which checked out the work crashes, the work will be picked up by another “implementation/api” server.
  • Denial of Service (DoS) attack protection is to be provided by a third party
  • API rate limiting may be desired and can be implemented at the REST API level
  • Optional: Allow “gateway/api” servers & “implementation/api” servers to live on more than one Kubernetes cluster (see the optional item in ‘Use cases’ at the top of this article)

Workflow

“implementation/api” server(s)

(one or more servers, may be automatically scaled based on resource demand)

  • read from message queue(s)
  • perform work & interact with database in a thread-safe way
  • upon completion submit message to appropriate message queue(s)

“gateway/api” server(s)

(one or more servers, may be automatically scaled based on resource demand)

  • implements REST API methods
  • submit(s) message(s) to appropriate message queue(s)
  • if needed, watches message queue(s) for a response to request(s)
  • returns appropriate response to REST API methods

Recommendations

https://gateway-api.sigs.k8s.io/

(idea) ssh menu and ssh menu webapi

Single linux system:
When you are using Linux and need to ssh, instead of typing ssh you type ‘s’, which displays a menu of past ssh connections you can connect to, sorted by name or by most recently used, with the ability to adjust the sort, edit, delete, etc.

Multiple linux systems:
I’m not the first to have this idea; there are a few sshmenu programs out there. But what if we were to create a webapi (hosted via Kubernetes, of course) with OIDC enabled? Then on any Linux system, when you type ‘s’ it asks for the sshmenu server (or uses a default server from a config file) and pops up a web browser for you to log in and obtain an apikey. That apikey is used from that point on to display your history of ssh connections on all your Linux systems, keeping the sshmenu in sync on all of them. There could be a mobile app and a web browser app as well.

Enterprise ready:
Using group claims, it would of course be possible to have admin users and regular users. A user could also be added to a group and would then be able to see all of that group’s shared ssh connections, so a team could share all of its common ssh connections.

(bash) script to fully clean up a kubernetes app

When experimenting with something like Ceph (installing it, making changes, uninstalling and reinstalling ...), you will find that more advanced apps tend to implement finalizers, making a full uninstall rather challenging.  More complex apps tend to have an uninstaller script for just this reason.  When such a script is lacking, though, here is a generic script which can take care of much, or all, of the cleanup work:

** Note, you are entering a danger zone. **

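#!/bin/bash
# Remove finalizers from, then delete, custom resources whose CRD name and
# resource name both match <searchstr>.

if [ -z "$1" ]
then
  echo "Syntax:"
  echo ""
  echo "$0 <searchstr>"

  exit 1
fi

# CRDs whose name matches the search string
CRDS=$(kubectl get crd --no-headers | grep -- "$1" | cut -d ' ' -f 1)

array_crds=( $CRDS )
for crd in "${array_crds[@]}"
do
  echo ""
  echo "$crd"
  # matching resources of this type (for namespaced types, current namespace only)
  RESOURCES=$(kubectl get "$crd" --no-headers | grep -- "$1" | cut -d ' ' -f 1)
  array_resources=( $RESOURCES )
  for next in "${array_resources[@]}"
  do
    echo "$next"
    # clear the finalizers first so the delete does not hang waiting on a controller
    kubectl patch "$crd" "$next" -p '{"metadata":{"finalizers":null}}' --type=merge
    kubectl delete "$crd" "$next"
  done
done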

(gitops) argocd phoenix configuration: clusterapi with vcluster provider

Standardized git repo layouts help to keep deployments consistent and clean:

k-argocd
- /appofapps/clusters/application.yaml
- /apps
  - /argocd-seed/update.sh
  - /argocd/applicationset.yaml
  - /clusterapi/applicationset.yaml
  - /daytwo/applicationset.yaml
- /projects
  - /addons.yaml
  - /developer.yaml
  - /devsecops.yaml

k-argocd-addons
- /apps
  - /adcs-issuer-system/applicationset.yaml
  - /adcs-issuer-system/base/Chart.yaml
  - /cert-manager/applicationset.yaml
  - /external-dns/applicationset.yaml
  - /external-dns-root/applicationset.yaml
  - /fluent-bit/applicationset.yaml
  - /kasten/applicationset.yaml
  - /nginx-ingress/applicationset.yaml
  - /metrics-server/applicationset.yaml
  - /pinniped-concierge/applicationset.yaml
  - /prometheus/applicationset.yaml

k-argocd-clusters
- /clusters
  - /vc-non.yaml
  - /vc-prod.yaml

k-vc-non
- /appofapps
  - /namespaces/application.yaml
- /apps
  - /example/applicationset.yaml
  - /example/base/Chart.yaml
- /namespaces
  - /example/namespace.yaml
  - /example/resourcequota.yaml
  - /example/servicemesh.yaml

k-vc-prod
- /appofapps
  - /namespaces/application.yaml
- /apps
  - /example/applicationset.yaml
  - /example/base/Chart.yaml
- /namespaces
  - /example/namespace.yaml
  - /example/resourcequota.yaml
  - /example/servicemesh.yaml

daytwo automates several steps needed when first deploying clusters:

  • registers the cluster with argocd, and adds an annotation allowing applications to target it by cluster name
  • copies labels from the cluster yaml to the argocd secret, useful for deploying addons
  • generates a pinniped kubeconfig, which allows for initial access without needing the admin kubeconfig
  • registers the cluster as a kasten secondary cluster (if kasten is being used)

Scripts / pipelines are needed to:

  • provision / decommission a cluster
    • adjust cluster resources
  • add / remove a namespace
    • adjust namespace resource quota
    • grant developers access to namespaces