!JupyterHub allows users to interact with a computing environment through a webpage. As most devices have access to a web browser, !JupyterHub makes it easy to provide and standardize the computing environment of a group of people (e.g., a class of students or an analytics team). The most common use of !JupyterHub is to launch !JupyterLab via !KubeSpawner once a user authenticates.
Moreover, one can create configurations for multiple user environments and let users select from them when they log in to !JupyterHub. This is done by creating multiple profiles, each of which is attached to a set of configuration options that override !JupyterHub's default configuration (specified in a Helm chart). This can be used to let users choose among several Docker images, select the hardware on which their jobs run, or configure the default interface (e.g., !JupyterLab vs. Jupyter Notebook).
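For illustration, a minimal sketch of such a profile list in the Helm chart config (the field names follow the chart's singleuser.profileList option; the second profile, its image, and the resource values are hypothetical placeholders):
singleuser:
  profileList:
    - display_name: "Default environment"
      description: "The standard conda image"
      default: true
    - display_name: "Large / GPU environment"      # hypothetical second profile
      description: "Runs with a larger memory limit on GPU nodes"
      kubespawner_override:
        image: conda-gpu:1.2                       # hypothetical image name and tag
        mem_limit: "16G"
        extra_resource_limits:
          nvidia.com/gpu: "1"
After logging in, users are shown a selection page, and each profile's kubespawner_override is applied on top of the chart defaults.
Step-by-step guide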
!JupyterHub for Kubernetes is provided in the form of a Helm chart. Helm is a Kubernetes package manager, similar to yum or apt, so the first step was to install Helm. Helm consists of a server (tiller) and a client (helm). Tiller is normally deployed into the Kubernetes cluster and runs as a pod, but for a one-off installation of !JupyterHub I decided not to install it into Kubernetes and instead ran it from the console on one of the Kubernetes cluster minion nodes. I followed the instructions published at https://zero-to-jupyterhub.readthedocs.io/en/latest/setup-helm.html but only ran
helm init --client-only
and did NOT run helm init --service-account tiller --wait.
I then started tiller locally from the terminal:
/home/esportz/linux-amd64/tiller
and then
export HELM_HOST=localhost:44134
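With tiller running locally and HELM_HOST exported, a quick sanity check (assuming the helm client and the local tiller come from the same 2.x release) is:
helm version
It should print both a Client and a Server version; if the Server line is missing, the client is not reaching the local tiller.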
It appears that in later versions of Helm (>= 3.0) the tiller component has been removed and the helm binary is all that is needed to install charts (no need to run helm init, etc.).
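For reference, a rough Helm 3 equivalent (untested here; the chart repository URL is the standard zero-to-jupyterhub one) would be:
helm repo add jupyterhub https://jupyterhub.github.io/helm-chart/
helm repo update
helm upgrade --install jhub jupyterhub/jupyterhub --version=0.8.2 -f jhub-config-ldap.yaml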
The other prerequisite for running !JupyterHub is Kubernetes dynamic volume provisioning. I chose the nfs-client provisioner and installed it as described at https://github.com/kubernetes-incubator/external-storage/tree/master/nfs-client
Basically I modified the external-storage-master/nfs-client/deploy/deployment.yaml file:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: nfs-client-provisioner
---
kind: Deployment
apiVersion: extensions/v1beta1
metadata:
  name: nfs-client-provisioner
spec:
  replicas: 1
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: nfs-client-provisioner
    spec:
      serviceAccountName: nfs-client-provisioner
      containers:
        - name: nfs-client-provisioner
          image: quay.io/external_storage/nfs-client-provisioner:latest
          volumeMounts:
            - name: nfs-client-root
              mountPath: /persistentvolumes
          env:
            - name: PROVISIONER_NAME
              value: netapp/nfs-client
            - name: NFS_SERVER
              value: nfs_server
            - name: NFS_PATH
              value: /export/volumes
      volumes:
        - name: nfs-client-root
          nfs:
            server: nfs_server
            path: /export/volumes
Then set the !StorageClass as default in external-storage-master/nfs-client/deploy/class.yaml like this:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: managed-nfs-storage
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: netapp/nfs-client # or choose another name, must match the deployment's env PROVISIONER_NAME
parameters:
  archiveOnDelete: "false"
allowVolumeExpansion: true
Create RBAC:
kubectl create -f external-storage-master/nfs-client/deploy/rbac.yaml
Then applied the class and the deployment:
kubectl apply -f external-storage-master/nfs-client/deploy/class.yaml
kubectl apply -f external-storage-master/nfs-client/deploy/deployment.yaml
Then I tested the provisioner as described at the above URL.
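For reference, that test amounts to roughly the following (the test-claim.yaml and test-pod.yaml manifests ship in the same deploy directory of that repository; names may differ between versions):
kubectl get storageclass   # managed-nfs-storage should be marked (default)
kubectl create -f external-storage-master/nfs-client/deploy/test-claim.yaml
kubectl create -f external-storage-master/nfs-client/deploy/test-pod.yaml
kubectl get pvc            # the test claim should reach status Bound
kubectl logs deployment/nfs-client-provisioner   # check here if the claim stays Pending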
Lastly, I downloaded the Helm chart jupyterhub-0.8.2.tgz and followed the instructions at https://zero-to-jupyterhub.readthedocs.io/en/latest/setup-jupyterhub.html to install !JupyterHub.
I created jhub-config-ldap.yaml:
proxy:
  secretToken: "<secret-token>"
  service:
    type: ClusterIP
  https:
    enabled: true
    type: manual
    manual:
      # key: |
      #   <data from cert/conda.example.com.key>
      # cert: |
      #   <data from cert/conda.example.com.crt>
prePuller:
  pause:
    image:
      name: docker.com/google_containers/pause
scheduling:
  userScheduler:
    image:
      name: docker.com/google_containers/kube-scheduler-amd64
singleuser:
  defaultUrl: "/lab"
  image:
    name: conda
    tag: '1.2'
  events: false
hub:
  extraConfig:
    jupyterlab: |
      c.Spawner.cmd = ['jupyter-labhub']
auth:
  type: ldap
  ldap:
    server:
      address: adc.example.com
    dn:
      lookup: True
      search:
        filter: '({login_attr}={login})'
        user: 'ctera-ldap-svc'
        password: '<password-from-the-vault>'
      templates:
        - 'CN={username},OU=ITO,OU=Users,OU=NY,OU=US,OU=Offices,DC=example,DC=com'
      user:
        searchBase: 'ou=Offices,dc=example,dc=com'
        escape: False
        attribute: 'sAMAccountName'
        dnAttribute: 'cn'
    allowedGroups:
      - 'cn=admins,ou=users,ou=Security Groups,ou=Groups,dc=example,dc=com'
ingress:
  enabled: true
  annotations:
    ingress.kubernetes.io/ssl-passthrough: "true"
  hosts:
    - conda.example.com
We must include DN templates for all users! Otherwise, a user whose template is not included will get an "invalid password" error. The problem with https://github.com/jupyterhub/ldapauthenticator/blob/master/ldapauthenticator/ldapauthenticator.py is that it forms the DN by concatenating the CN with either the searchBase or a template and then binds with that DN and the password. Instead it should bind with just the sAMAccountName, or, if it really wants to use a DN, the pure form is available in the distinguishedName attribute.
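For illustration, if user accounts live under more than one OU, the templates list needs one entry per DN pattern. Only the first entry below comes from the config above; the other OUs are hypothetical placeholders:
      templates:
        - 'CN={username},OU=ITO,OU=Users,OU=NY,OU=US,OU=Offices,DC=example,DC=com'
        # hypothetical additional OUs:
        - 'CN={username},OU=Engineering,OU=Users,OU=NY,OU=US,OU=Offices,DC=example,DC=com'
        - 'CN={username},OU=Finance,OU=Users,OU=LDN,OU=UK,OU=Offices,DC=example,DC=com'
The authenticator tries these patterns when building the bind DN, so a user whose OU is not covered by any template cannot authenticate, which is what produces the misleading "invalid password" error mentioned above.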
Then ran
linux-amd64/helm upgrade --install jhub jupyterhub-0.8.2.tgz -f jhub-config-ldap.yaml
Release "jhub" does not exist. Installing it now.
NAME: jhub
LAST DEPLOYED: Tue Aug 27 23:05:06 2019
NAMESPACE: default
STATUS: DEPLOYED

RESOURCES:
==> v1/ConfigMap
NAME        DATA  AGE
hub-config  1     21s

==> v1/Deployment
NAME   READY  UP-TO-DATE  AVAILABLE  AGE
hub    0/1    1           0          21s
proxy  0/1    1           0          21s

==> v1/PersistentVolumeClaim
NAME        STATUS  VOLUME                                    CAPACITY  ACCESS MODES  STORAGECLASS         AGE
hub-db-dir  Bound   pvc-8608fc30-7478-42f9-8f05-d6907896138c  1Gi       RWO           managed-nfs-storage  21s

==> v1/Pod(related)
NAME                    READY  STATUS             RESTARTS  AGE
hub-65b79f57b6-8lnk7    0/1    ContainerCreating  0         21s
proxy-79f45b4bb4-4b6zr  0/1    ContainerCreating  0         21s

==> v1/Role
NAME  AGE
hub   21s

==> v1/RoleBinding
NAME  AGE
hub   21s

==> v1/Secret
NAME        TYPE    DATA  AGE
hub-secret  Opaque  2     21s

==> v1/Service
NAME          TYPE          CLUSTER-IP      EXTERNAL-IP  PORT(S)                     AGE
hub           ClusterIP     10.100.254.40   <none>       8081/TCP                    21s
proxy-api     ClusterIP     10.108.146.3    <none>       8001/TCP                    21s
proxy-public  LoadBalancer  10.102.222.146  <pending>    80:32522/TCP,443:32022/TCP  21s

==> v1/ServiceAccount
NAME  SECRETS  AGE
hub   1        21s

==> v1/StatefulSet
NAME              READY  AGE
user-placeholder  0/0    21s

==> v1beta1/PodDisruptionBudget
NAME              MIN AVAILABLE  MAX UNAVAILABLE  ALLOWED DISRUPTIONS  AGE
hub               1              N/A              0                    21s
proxy             1              N/A              0                    21s
user-placeholder  0              N/A              0                    21s
user-scheduler    1              N/A              0                    21s

NOTES:
Thank you for installing JupyterHub!

Your release is named jhub and installed into the namespace default.

You can find if the hub and proxy is ready by doing:

 kubectl --namespace=default get pod

and watching for both those pods to be in status 'Ready'.

You can find the public IP of the JupyterHub by doing:

 kubectl --namespace=default get svc proxy-public

It might take a few minutes for it to appear!

Note that this is still an alpha release! If you have questions, feel free to
  1. Read the guide at https://z2jh.jupyter.org
  2. Chat with us at https://gitter.im/jupyterhub/jupyterhub
  3. File issues at https://github.com/jupyterhub/zero-to-jupyterhub-k8s/issues
Since our cluster did not have public IPs and ingress had not been enabled yet, I had to configure haproxy to point at the HTTP NodePort 32522:
frontend sni
    acl conda req.ssl_sni -i conda.example.com
    use_backend conda if conda

backend conda
    timeout server 86400000
    server localhost 127.0.0.1:3443

frontend conda-https
    timeout client 86400000
    bind :3443 ssl crt /opt/haproxy/etc/conda.example.com.crt
    default_backend conda-servers

backend conda-servers
    timeout server 86400000
    balance source
    server ds-kube-minion-1.example.com 10.161.124.23:32522 check
    server ds-kube-minion-2.example.com 10.161.124.24:32522 check
The above haproxy config is no longer necessary now that k8s ingress is in place; haproxy now just forwards TLS to the ingress:
frontend sni
    timeout client 86400000
    bind :443
    tcp-request inspect-delay 5s
    tcp-request content accept if { req_ssl_hello_type 1 }
    default_backend kubernetes-ingress

backend kubernetes-ingress
    timeout server 86400000
    server ds-kube-minion-1.lat.internal 10.161.124.23:443 check-ssl verify none

Important!
The !JupyterHub proxy deployment now handles the SSL traffic itself: requests reach it via the k8s ingress (conda-ssl-ingress, with ssl-passthrough), and the TLS secret (proxy-manual-tls) is mounted into the proxy pod via volumeMounts (tls-secret). The k8s service name is proxy-public.
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "4"
  creationTimestamp: "2019-08-28T03:05:07Z"
  generation: 4
  labels:
    app: jupyterhub
    chart: jupyterhub-0.8.2
    component: proxy
    heritage: Tiller
    release: jhub
  name: proxy
  namespace: default
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: jupyterhub
      component: proxy
      release: jhub
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      annotations:
        checksum/hub-secret: fc1435556e4b18bf057d25296940f1dbc49e803339faaddb0fbd94e34a7a1b88
        checksum/proxy-secret: 01ba4719c80b6fe911b091a7c05124b64eeece964e09c058ef8f9805daca546b
      creationTimestamp: null
      labels:
        app: jupyterhub
        component: proxy
        hub.jupyter.org/network-access-hub: "true"
        hub.jupyter.org/network-access-singleuser: "true"
        release: jhub
    spec:
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - preference:
              matchExpressions:
              - key: hub.jupyter.org/node-purpose
                operator: In
                values:
                - core
            weight: 100
      containers:
      - command:
        - configurable-http-proxy
        - --ip=0.0.0.0
        - --api-ip=0.0.0.0
        - --api-port=8001
        - --default-target=http://$(HUB_SERVICE_HOST):$(HUB_SERVICE_PORT)
        - --error-target=http://$(HUB_SERVICE_HOST):$(HUB_SERVICE_PORT)/hub/error
        - --port=8443
        - --redirect-port=8000
        - --ssl-key=/etc/chp/tls/tls.key
        - --ssl-cert=/etc/chp/tls/tls.crt
        env:
        - name: CONFIGPROXY_AUTH_TOKEN
          valueFrom:
            secretKeyRef:
              key: proxy.token
              name: hub-secret
        image: jupyterhub/configurable-http-proxy:4.1.0
        imagePullPolicy: IfNotPresent
        name: chp
        ports:
        - containerPort: 8443
          name: proxy-https
          protocol: TCP
        - containerPort: 8000
          name: proxy-public
          protocol: TCP
        - containerPort: 8001
          name: api
          protocol: TCP
        resources:
          requests:
            cpu: 200m
            memory: 512Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /etc/chp/tls
          name: tls-secret
          readOnly: true
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 60
      volumes:
      - name: tls-secret
        secret:
          defaultMode: 420
          secretName: proxy-manual-tls
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: jupyterhub
    chart: jupyterhub-0.8.2
    component: proxy-public
    heritage: Tiller
    release: jhub
  name: proxy-public
  namespace: default
spec:
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: 8000
  - name: https
    port: 443
    protocol: TCP
    targetPort: 8443
  selector:
    component: proxy
    release: jhub
---
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: conda-ssl-ingress
  annotations:
    ingress.kubernetes.io/ssl-passthrough: "true"
  namespace: default
spec:
  rules:
  - host: conda.example.com
    http:
      paths:
      - backend:
          serviceName: proxy-public
          servicePort: 443
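A quick smoke test of the whole path (assuming conda.example.com resolves to the haproxy front end):
kubectl get ingress conda-ssl-ingress
kubectl --namespace=default get svc proxy-public
curl -kI https://conda.example.com/hub/login   # expect an HTTP response from the hub's login page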