!JupyterHub allows users to interact with a computing environment through a webpage. Since most devices have access to a web browser, !JupyterHub makes it easy to provide and standardize the computing environment for a group of people (e.g., a class of students or an analytics team). The most common use of !JupyterHub is to launch !JupyterLab via !KubeSpawner once a user authenticates.
Moreover, one can create configurations for multiple user environments and let users select from them when they log in to !JupyterHub. This is done by creating multiple profiles, each of which is attached to a set of configuration options that override !JupyterHub's default configuration (specified in the Helm chart). This can be used to let users choose among several Docker images, to select the hardware on which their jobs should run, or to configure the default interface (!JupyterLab vs. Jupyter Notebook); a minimal profile sketch follows.
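As a sketch of what such profiles look like in the chart's config (assuming the chart's singleuser.profileList option; the image names, tags and resource numbers below are purely hypothetical), the values file could contain something like:

singleuser:
  profileList:
    - display_name: "Standard environment"
      description: "Default conda image"
    - display_name: "Large-memory environment"
      description: "Same image, more RAM"
      kubespawner_override:
        mem_guarantee: "8G"
        mem_limit: "16G"
    - display_name: "GPU environment"
      description: "Hypothetical GPU image, scheduled on GPU nodes"
      kubespawner_override:
        image: conda-gpu:1.2
        extra_resource_limits:
          nvidia.com/gpu: "1"

Each entry shows up as an option on the spawn page, and kubespawner_override is applied on top of the defaults from the singleuser section.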
Step-by-step guide

!JupyterHub for Kubernetes is distributed as a Helm chart. Helm is a Kubernetes package manager, similar to yum or apt, so the first step was to install Helm. Helm consists of a server (tiller) and a client (helm). Tiller, once deployed in a Kubernetes cluster, runs as a pod, but for a one-off installation of !JupyterHub I decided not to install it into Kubernetes and instead ran it from the console on one of the Kubernetes cluster minion nodes. I followed the instructions published at https://zero-to-jupyterhub.readthedocs.io/en/latest/setup-helm.html but only ran:
helm init --client-only
and did NOT run helm init --service-account tiller --wait.
Instead, I started tiller locally from the terminal:
/home/esportz/linux-amd64/tiller
and then pointed the helm client at it:
export HELM_HOST=localhost:44134
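To sanity-check that the client could reach the locally running tiller, the following should report both a Client and a Server version (exact output will vary):

helm version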
It appears that in later versions of Helm (>= 3.0) the tiller component has been removed, and the helm binary alone is all that is needed to install charts (no need to run helm init, etc.).
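For reference, with Helm 3 the tiller steps above disappear entirely; assuming the standard JupyterHub chart repository, the install would reduce to roughly:

helm repo add jupyterhub https://jupyterhub.github.io/helm-chart/
helm repo update
helm upgrade --install jhub jupyterhub/jupyterhub --version 0.8.2 -f jhub-config-ldap.yaml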
The other prerequisite for running !JupyterHub is Kubernetes dynamic volume provisioning. I chose the nfs-client provisioner and installed it as described at https://github.com/kubernetes-incubator/external-storage/tree/master/nfs-client
Basically I modified the external-storage-master/nfs-client/deploy/deployment.yaml file:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: nfs-client-provisioner
---
kind: Deployment
apiVersion: extensions/v1beta1
metadata:
  name: nfs-client-provisioner
spec:
  replicas: 1
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: nfs-client-provisioner
    spec:
      serviceAccountName: nfs-client-provisioner
      containers:
        - name: nfs-client-provisioner
          image: quay.io/external_storage/nfs-client-provisioner:latest
          volumeMounts:
            - name: nfs-client-root
              mountPath: /persistentvolumes
          env:
            - name: PROVISIONER_NAME
              value: netapp/nfs-client
            - name: NFS_SERVER
              value: nfs_server
            - name: NFS_PATH
              value: /export/volumes
      volumes:
        - name: nfs-client-root
          nfs:
            server: nfs_server
            path: /export/volumes
Then I set the !StorageClass as the default in external-storage-master/nfs-client/deploy/class.yaml, like this:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: managed-nfs-storage
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: netapp/nfs-client # or choose another name; must match the deployment's PROVISIONER_NAME env
parameters:
  archiveOnDelete: "false"
allowVolumeExpansion: true
Created the RBAC objects:
kubectl create -f external-storage-master/nfs-client/deploy/rbac.yaml
Then
kubectl apply -f external-storage-master/nfs-client/deploy/class.yaml
kubectl apply -f external-storage-master/nfs-client/deploy/deployment.yaml
Then I tested the provisioner as described at the URL above; an equivalent quick check is sketched below.
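As a rough equivalent of the test manifests in that repository (names here are illustrative), a throwaway PVC against the new default !StorageClass should go to Bound and produce a directory under /export/volumes on the NFS server:

# test-claim.yaml -- throwaway claim to verify dynamic provisioning
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: test-claim
spec:
  storageClassName: managed-nfs-storage
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Mi

kubectl apply -f test-claim.yaml
kubectl get pvc test-claim        # expect STATUS Bound
kubectl delete -f test-claim.yaml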
Lastly, I downloaded the Helm chart jupyterhub-0.8.2.tgz and followed the instructions at https://zero-to-jupyterhub.readthedocs.io/en/latest/setup-jupyterhub.html to install !JupyterHub.
Created jhub-config-ldap.yaml
proxy:
  secretToken: "<secret-token>"
  service:
    type: ClusterIP
  https:
    enabled: true
    type: manual
    manual:
      # key: |
      #   <data from cert/conda.example.com.key>
      # cert: |
      #   <data from cert/conda.example.com.crt>
prePuller:
  pause:
    image:
      name: docker.com/google_containers/pause
scheduling:
  userScheduler:
    image:
      name: docker.com/google_containers/kube-scheduler-amd64
singleuser:
  defaultUrl: "/lab"
  image:
    name: conda
    tag: '1.2'
  events: false
hub:
  extraConfig:
    jupyterlab: |
      c.Spawner.cmd = ['jupyter-labhub']
auth:
  type: ldap
  ldap:
    server:
      address: adc.example.com
    dn:
      lookup: True
      search:
        filter: '({login_attr}={login})'
        user: 'ctera-ldap-svc'
        password: '<password-from-the-vault>'
      templates:
        - 'CN={username},OU=ITO,OU=Users,OU=NY,OU=US,OU=Offices,DC=example,DC=com'
      user:
        searchBase: 'ou=Offices,dc=example,dc=com'
        escape: False
        attribute: 'sAMAccountName'
        dnAttribute: 'cn'
    allowedGroups:
      - 'cn=admins,ou=users,ou=Security Groups,ou=Groups,dc=example,dc=com'
ingress:
  enabled: true
  annotations:
    ingress.kubernetes.io/ssl-passthrough: "true"
  hosts:
    - conda.example.com
We must include templates covering all users! Otherwise, a user whose DN does not match any of the listed templates will get an "invalid password" error. The problem with https://github.com/jupyterhub/ldapauthenticator/blob/master/ldapauthenticator/ldapauthenticator.py is that it forms a DN by concatenating the CN with either the searchBase or a template, and then binds with that DN and the user's password. Instead, it should bind with just the sAMAccountName, or, if it really wants to use a DN, take the one available in pure form in the distinguishedName attribute.
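To illustrate the difference, here is a sketch (not part of the deployment) using the ldap3 Python library, a hypothetical user jdoe, and a hypothetical NetBIOS domain EXAMPLE:

from ldap3 import Server, Connection

server = Server('adc.example.com', use_ssl=True)

# What ldapauthenticator effectively does: build a DN from a template
# and bind with it -- this only works if the user's CN really sits
# under one of the OUs listed in the templates.
dn = 'CN=jdoe,OU=ITO,OU=Users,OU=NY,OU=US,OU=Offices,DC=example,DC=com'
print(Connection(server, user=dn, password='secret').bind())

# Against Active Directory it is more robust to bind with the
# sAMAccountName (DOMAIN\user) and, if a DN is needed later, read the
# distinguishedName attribute from the user's entry.
print(Connection(server, user='EXAMPLE\\jdoe', password='secret').bind())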
Then I ran:
linux-amd64/helm upgrade --install jhub jupyterhub-0.8.2.tgz -f jhub-config-ldap.yaml
Release "jhub" does not exist. Installing it now.
NAME:   jhub
LAST DEPLOYED: Tue Aug 27 23:05:06 2019
NAMESPACE: default
STATUS: DEPLOYED

RESOURCES:
==> v1/ConfigMap
NAME        DATA  AGE
hub-config  1     21s

==> v1/Deployment
NAME   READY  UP-TO-DATE  AVAILABLE  AGE
hub    0/1    1           0          21s
proxy  0/1    1           0          21s

==> v1/PersistentVolumeClaim
NAME        STATUS  VOLUME                                    CAPACITY  ACCESS MODES  STORAGECLASS         AGE
hub-db-dir  Bound   pvc-8608fc30-7478-42f9-8f05-d6907896138c  1Gi       RWO           managed-nfs-storage  21s

==> v1/Pod(related)
NAME                    READY  STATUS             RESTARTS  AGE
hub-65b79f57b6-8lnk7    0/1    ContainerCreating  0         21s
proxy-79f45b4bb4-4b6zr  0/1    ContainerCreating  0         21s

==> v1/Role
NAME  AGE
hub   21s

==> v1/RoleBinding
NAME  AGE
hub   21s

==> v1/Secret
NAME        TYPE    DATA  AGE
hub-secret  Opaque  2     21s

==> v1/Service
NAME          TYPE          CLUSTER-IP      EXTERNAL-IP  PORT(S)                     AGE
hub           ClusterIP     10.100.254.40   <none>       8081/TCP                    21s
proxy-api     ClusterIP     10.108.146.3    <none>       8001/TCP                    21s
proxy-public  LoadBalancer  10.102.222.146  <pending>    80:32522/TCP,443:32022/TCP  21s

==> v1/ServiceAccount
NAME  SECRETS  AGE
hub   1        21s

==> v1/StatefulSet
NAME              READY  AGE
user-placeholder  0/0    21s

==> v1beta1/PodDisruptionBudget
NAME              MIN AVAILABLE  MAX UNAVAILABLE  ALLOWED DISRUPTIONS  AGE
hub               1              N/A              0                    21s
proxy             1              N/A              0                    21s
user-placeholder  0              N/A              0                    21s
user-scheduler    1              N/A              0                    21s

NOTES:
Thank you for installing JupyterHub!

Your release is named jhub and installed into the namespace default.

You can find if the hub and proxy is ready by doing:

 kubectl --namespace=default get pod

and watching for both those pods to be in status 'Ready'.

You can find the public IP of the JupyterHub by doing:

 kubectl --namespace=default get svc proxy-public

It might take a few minutes for it to appear!

Note that this is still an alpha release! If you have questions, feel free to
 1. Read the guide at https://z2jh.jupyter.org
 2. Chat with us at https://gitter.im/jupyterhub/jupyterhub
 3. File issues at https://github.com/jupyterhub/zero-to-jupyterhub-k8s/issues
Since our cluster did not have public IPs and ingress had not been enabled yet, I had to configure haproxy to point to the HTTP NodePort 32522:
frontend sni
    acl conda req.ssl_sni -i conda.example.com
    use_backend conda if conda

backend conda
    timeout server 86400000
    server localhost 127.0.0.1:3443

frontend conda-https
    timeout client 86400000
    bind :3443 ssl crt /opt/haproxy/etc/conda.example.com.crt
    default_backend conda-servers

backend conda-servers
    timeout server 86400000
    balance source
    server ds-kube-minion-1.example.com 10.161.124.23:32522 check
    server ds-kube-minion-2.example.com 10.161.124.24:32522 check
The haproxy config above is no longer necessary because of the k8s ingress:
frontend sni
    timeout client 86400000
    bind :443
    tcp-request inspect-delay 5s
    tcp-request content accept if { req_ssl_hello_type 1 }
    default_backend kubernetes-ingress

backend kubernetes-ingress
    timeout server 86400000
    server ds-kube-minion-1.lat.internal 10.161.124.23:443 check-ssl verify none
Important!
The !JupyterHub proxy deployment now handles the SSL traffic itself: it is reached via the k8s ingress (conda-ssl-ingress) with SSL passthrough, and the TLS secret (proxy-manual-tls) is mounted into the proxy pod via a volumeMount (tls-secret). The k8s service name is proxy-public.
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "4"
  creationTimestamp: "2019-08-28T03:05:07Z"
  generation: 4
  labels:
    app: jupyterhub
    chart: jupyterhub-0.8.2
    component: proxy
    heritage: Tiller
    release: jhub
  name: proxy
  namespace: default
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: jupyterhub
      component: proxy
      release: jhub
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      annotations:
        checksum/hub-secret: fc1435556e4b18bf057d25296940f1dbc49e803339faaddb0fbd94e34a7a1b88
        checksum/proxy-secret: 01ba4719c80b6fe911b091a7c05124b64eeece964e09c058ef8f9805daca546b
      creationTimestamp: null
      labels:
        app: jupyterhub
        component: proxy
        hub.jupyter.org/network-access-hub: "true"
        hub.jupyter.org/network-access-singleuser: "true"
        release: jhub
    spec:
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - preference:
              matchExpressions:
              - key: hub.jupyter.org/node-purpose
                operator: In
                values:
                - core
            weight: 100
      containers:
      - command:
        - configurable-http-proxy
        - --ip=0.0.0.0
        - --api-ip=0.0.0.0
        - --api-port=8001
        - --default-target=http://$(HUB_SERVICE_HOST):$(HUB_SERVICE_PORT)
        - --error-target=http://$(HUB_SERVICE_HOST):$(HUB_SERVICE_PORT)/hub/error
        - --port=8443
        - --redirect-port=8000
        - --ssl-key=/etc/chp/tls/tls.key
        - --ssl-cert=/etc/chp/tls/tls.crt
        env:
        - name: CONFIGPROXY_AUTH_TOKEN
          valueFrom:
            secretKeyRef:
              key: proxy.token
              name: hub-secret
        image: jupyterhub/configurable-http-proxy:4.1.0
        imagePullPolicy: IfNotPresent
        name: chp
        ports:
        - containerPort: 8443
          name: proxy-https
          protocol: TCP
        - containerPort: 8000
          name: proxy-public
          protocol: TCP
        - containerPort: 8001
          name: api
          protocol: TCP
        resources:
          requests:
            cpu: 200m
            memory: 512Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /etc/chp/tls
          name: tls-secret
          readOnly: true
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 60
      volumes:
      - name: tls-secret
        secret:
          defaultMode: 420
          secretName: proxy-manual-tls
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: jupyterhub
    chart: jupyterhub-0.8.2
    component: proxy-public
    heritage: Tiller
    release: jhub
  name: proxy-public
  namespace: default
spec:
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: 8000
  - name: https
    port: 443
    protocol: TCP
    targetPort: 8443
  selector:
    component: proxy
    release: jhub
---
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: conda-ssl-ingress
  annotations:
    ingress.kubernetes.io/ssl-passthrough: "true"
  namespace: default
spec:
  rules:
  - host: conda.example.com
    http:
      paths:
      - backend:
          serviceName: proxy-public
          servicePort: 443
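For completeness: the chart normally creates proxy-manual-tls from the key/cert supplied under proxy.https.manual in the config file, but if the secret ever needs to be inspected or recreated by hand (assuming the cert/key files referenced in the commented-out config above), commands along these lines would do it:

kubectl get secret proxy-manual-tls -o yaml
kubectl create secret tls proxy-manual-tls --cert=cert/conda.example.com.crt --key=cert/conda.example.com.key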