IBM Cloud
IBMCodeEngineCluster: Cluster running on IBM Code Engine.
Overview
Authentication
To authenticate with IBM Cloud you must first generate an API key.
Then store the key in your Dask configuration under cloudprovider.ibm.api_key, either by
adding it to your YAML configuration or by exporting an environment variable.
# ~/.config/dask/cloudprovider.yaml
cloudprovider:
  ibm:
    api_key: "your_api_key"
$ export DASK_CLOUDPROVIDER__IBM__API_KEY="your_api_key"
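The environment variable name and the YAML key above refer to the same setting. As a rough illustration of the naming convention (not Dask's own implementation), the sketch below converts between the two forms: the DASK_ prefix is dropped, the name is lowercased, and each double underscore becomes one level of nesting. The helper names are hypothetical.

```python
def env_var_to_config_key(name: str) -> str:
    """Illustrate how a DASK_* variable name maps to a dotted config key."""
    prefix = "DASK_"
    if not name.startswith(prefix):
        raise ValueError(f"{name!r} is not a Dask configuration variable")
    # Drop the prefix, lowercase, and treat double underscores as nesting.
    return name[len(prefix):].lower().replace("__", ".")


def config_key_to_env_var(key: str) -> str:
    """Inverse mapping, handy when exporting a setting from a shell."""
    return "DASK_" + key.upper().replace(".", "__")


print(env_var_to_config_key("DASK_CLOUDPROVIDER__IBM__API_KEY"))
# prints: cloudprovider.ibm.api_key
```

Single underscores are left alone, which is why api_key stays one key rather than becoming two nested levels.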
Project ID
To use Dask Cloudprovider with IBM Cloud you must also configure your Project ID, which is shown at the top of the IBM Cloud dashboard.
Add your Project ID to your Dask config file.
# ~/.config/dask/cloudprovider.yaml
cloudprovider:
  ibm:
    project_id: "your_project_id"
Or set it via an environment variable.
$ export DASK_CLOUDPROVIDER__IBM__PROJECT_ID="your_project_id"
Code Engine
- class dask_cloudprovider.ibm.IBMCodeEngineCluster(image: str = None, region: str = None, project_id: str = None, scheduler_cpu: str = None, scheduler_mem: str = None, scheduler_disk: str = None, scheduler_timeout: int = None, scheduler_command: str = None, worker_cpu: str = None, worker_mem: str = None, worker_disk: str = None, worker_threads: int = 1, worker_command: str = None, docker_server: str = None, docker_username: str = None, docker_password: str = None, debug: bool = False, **kwargs)
Cluster running on IBM Code Engine.
This cluster manager builds a Dask cluster running on IBM Code Engine.
When configuring your cluster, you may find it useful to refer to the IBM Cloud documentation for available options.
https://cloud.ibm.com/docs/codeengine
- Parameters
- image: str
The Docker image to run on all instances. This image must have a valid Python environment and have
dask installed in order for the dask-scheduler and dask-worker commands to be available.
- region: str
The IBM Cloud region to launch your cluster in.
See: https://cloud.ibm.com/docs/codeengine?topic=codeengine-regions
- project_id: str
Your IBM Cloud project ID. This must be set either here or in your Dask config.
- scheduler_cpu: str
The amount of CPU to allocate to the scheduler.
See: https://cloud.ibm.com/docs/codeengine?topic=codeengine-mem-cpu-combo
- scheduler_mem: str
The amount of memory to allocate to the scheduler.
See: https://cloud.ibm.com/docs/codeengine?topic=codeengine-mem-cpu-combo
- scheduler_disk: str
The amount of ephemeral storage to allocate to the scheduler. This value must be lower than scheduler_mem.
- scheduler_timeout: int
The timeout for the scheduler in seconds.
- scheduler_command: str
The command to run the scheduler. This should be a string that is passed to the
dask-scheduler command. The default is dask-scheduler --protocol ws.
- worker_cpu: str
The amount of CPU to allocate to each worker.
See: https://cloud.ibm.com/docs/codeengine?topic=codeengine-mem-cpu-combo
- worker_mem: str
The amount of memory to allocate to each worker.
See: https://cloud.ibm.com/docs/codeengine?topic=codeengine-mem-cpu-combo
- worker_disk: str
The amount of ephemeral storage to allocate to each worker. This value must be lower than worker_mem.
- worker_threads: int
The number of threads to use on each worker.
- worker_command: str
The command to run the worker. This should be a string that is passed to the
dask-worker command. The default is python -m distributed.cli.dask_spec.
- docker_server: str
The Docker registry server (e.g., “docker.io”, “gcr.io”). Required if using private Docker images.
- docker_username: str
The username for authenticating with the Docker registry. Required if using private Docker images.
- docker_password: str
The password or access token for authenticating with the Docker registry. Required if using private Docker images.
- debug: bool, optional
More information will be printed when constructing clusters to enable debugging.
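The parameters above can be collected into a single set of keyword arguments. The following is a minimal sketch, not a recommended configuration: the values mirror the configuration printed in the Examples section of this page, and the size helper only checks the documented rule that each disk value must be lower than the matching memory value. Treating "M" and "G" as decimal multipliers is an assumption made for this example, and the function names are hypothetical.

```python
# Decimal multipliers for the size suffixes used in this example (assumption).
_UNITS = {"M": 10**6, "G": 10**9}


def size_in_bytes(value: str) -> int:
    """Convert a size string such as "400M" or "1G" to bytes."""
    number, suffix = value[:-1], value[-1].upper()
    if suffix not in _UNITS:
        raise ValueError(f"unsupported size suffix in {value!r}")
    return int(float(number) * _UNITS[suffix])


def cluster_kwargs() -> dict:
    """Keyword arguments mirroring the configuration printed in the examples."""
    kwargs = dict(
        image="daskdev/dask:latest",
        region="eu-de",
        scheduler_cpu="0.25",
        scheduler_mem="1G",
        scheduler_disk="400M",
        scheduler_timeout=600,
        worker_cpu="2",
        worker_mem="4G",
        worker_disk="400M",
        worker_threads=1,
    )
    # Ephemeral storage must be lower than memory for both roles.
    for role in ("scheduler", "worker"):
        disk, mem = kwargs[f"{role}_disk"], kwargs[f"{role}_mem"]
        if size_in_bytes(disk) >= size_in_bytes(mem):
            raise ValueError(f"{role}_disk must be lower than {role}_mem")
    return kwargs


def launch_cluster():
    """Create the cluster; needs dask-cloudprovider and valid IBM credentials."""
    from dask_cloudprovider.ibm import IBMCodeEngineCluster

    return IBMCodeEngineCluster(n_workers=1, **cluster_kwargs())
```

Only launch_cluster() contacts IBM Cloud; the other two functions run locally, so the settings can be checked before any resources are created.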
Notes
Credentials
To use the IBM Cloud API you need an API key, which you can create in the IBM Cloud console.
The recommended approach is to provide the API key through an environment variable so that it can be used by workers. For example:
$ export DASK_CLOUDPROVIDER__IBM__API_KEY=xxxxx
Docker Registry Authentication
If you need to use private Docker images, you can configure Docker registry credentials using the docker_server, docker_username, and docker_password parameters. These credentials will be used to create a Kubernetes secret for image pulling in Code Engine.
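One way to keep registry credentials out of source code is to read them from the environment and pass them on as keyword arguments. This is a sketch under stated assumptions: the environment variable names REGISTRY_SERVER, REGISTRY_USERNAME and REGISTRY_PASSWORD and the helper name are hypothetical, and only the three docker_* parameter names come from this page.

```python
import os


def registry_kwargs(env=None) -> dict:
    """Build docker_* keyword arguments from environment variables.

    Returns an empty dict when no registry variable is set (public image)
    and raises if only some of them are provided.
    """
    env = os.environ if env is None else env
    # Hypothetical variable names; choose whatever fits your deployment.
    names = {
        "docker_server": "REGISTRY_SERVER",
        "docker_username": "REGISTRY_USERNAME",
        "docker_password": "REGISTRY_PASSWORD",
    }
    values = {param: env.get(var) for param, var in names.items()}
    if not any(values.values()):
        return {}
    missing = [names[param] for param, value in values.items() if not value]
    if missing:
        raise ValueError("missing registry settings: " + ", ".join(missing))
    return values


# Intended use (requires dask-cloudprovider and IBM credentials):
#   IBMCodeEngineCluster(image="<your private image>", **registry_kwargs())
```

Failing early on a partial set of credentials avoids launching a cluster that cannot pull its image.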
Certificates
This backend relies on a Let’s Encrypt certificate (ISRG Root X1) to secure the WebSocket connection between the client and the scheduler. More information can be found here: https://letsencrypt.org/certificates/
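If the client fails to connect, it can help to check whether the local trust store includes ISRG Root X1. The sketch below uses only the standard library ssl module; the helper names are hypothetical. Note that get_ca_certs() does not list certificates stored in a capath directory until they have been used, so a False result is a hint rather than proof that the root is missing.

```python
import ssl


def common_names(ca_certs) -> set:
    """Collect subject common names from ssl-style certificate dictionaries."""
    names = set()
    for cert in ca_certs:
        # "subject" is a tuple of RDNs, each a tuple of (key, value) pairs.
        for rdn in cert.get("subject", ()):
            for key, value in rdn:
                if key == "commonName":
                    names.add(value)
    return names


def has_isrg_root_x1(ca_certs=None) -> bool:
    """Report whether ISRG Root X1 appears among the given CA certificates.

    When no list is given, the certificates loaded into Python's default
    SSL context are inspected.
    """
    if ca_certs is None:
        ca_certs = ssl.create_default_context().get_ca_certs()
    return "ISRG Root X1" in common_names(ca_certs)
```

Calling has_isrg_root_x1() without arguments inspects the machine it runs on, so the result depends on the local Python and operating system configuration.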
Examples
Create the cluster.
>>> from dask_cloudprovider.ibm import IBMCodeEngineCluster
>>> cluster = IBMCodeEngineCluster(n_workers=1)
Launching cluster with the following configuration:
  Source Image: daskdev/dask:latest
  Region: eu-de
  Project id: f21626f6-54f7-4065-a038-75c8b9a0d2e0
  Scheduler CPU: 0.25
  Scheduler Memory: 1G
  Scheduler Disk: 400M
  Scheduler Timeout: 600
  Worker CPU: 2
  Worker Memory: 4G
  Worker Disk: 400M
Creating scheduler dask-xxxxxxxx-scheduler
Waiting for scheduler to run at dask-xxxxxxxx-scheduler.xxxxxxxxxxxx.xx-xx.codeengine.appdomain.cloud:443
Scheduler is running
Creating worker instance dask-xxxxxxxx-worker-xxxxxxxx

>>> from dask.distributed import Client
>>> client = Client(cluster)
Do some work.
>>> import dask.array as da
>>> arr = da.random.random((1000, 1000), chunks=(100, 100))
>>> arr.mean().compute()
0.5001550986751964
Close the cluster.
>>> cluster.close()
Deleting Instance: dask-xxxxxxxx-worker-xxxxxxxx
Deleting Instance: dask-xxxxxxxx-scheduler
You can also do this all in one go with context managers to ensure the cluster is created and cleaned up.
>>> with IBMCodeEngineCluster(n_workers=1) as cluster:
...     with Client(cluster) as client:
...         print(da.random.random((1000, 1000), chunks=(100, 100)).mean().compute())
Launching cluster with the following configuration:
  Source Image: daskdev/dask:latest
  Region: eu-de
  Project id: f21626f6-54f7-4065-a038-75c8b9a0d2e0
  Scheduler CPU: 0.25
  Scheduler Memory: 1G
  Scheduler Disk: 400M
  Scheduler Timeout: 600
  Worker CPU: 2
  Worker Memory: 4G
  Worker Disk: 400M
  Worker Threads: 1
Creating scheduler dask-xxxxxxxx-scheduler
Waiting for scheduler to run at dask-xxxxxxxx-scheduler.xxxxxxxxxxxx.xx-xx.codeengine.appdomain.cloud:443
Scheduler is running
Creating worker instance dask-xxxxxxxx-worker-xxxxxxxx
0.5000812282861661
Deleting Instance: dask-xxxxxxxx-worker-xxxxxxxx
Deleting Instance: dask-xxxxxxxx-scheduler
- Attributes
- asynchronous
Are we running in the event loop?
- auto_shutdown
- bootstrap
- called_from_running_loop
- command
- dashboard_link
- docker_image
- gpu_instance
- loop
- name
- observed
- plan
- requested
- scheduler_address
- scheduler_class
- worker_class
Methods
adapt([Adaptive, minimum, maximum, ...])    Turn on adaptivity
call_async(f, *args, **kwargs)              Run a blocking function in a thread as a coroutine.
from_name(name)                             Create an instance of this class to represent an existing cluster by name.
get_client()                                Return client for the cluster
get_logs([cluster, scheduler, workers])     Return logs for the cluster, scheduler and workers
get_tags()                                  Generate tags to be applied to all resources.
new_worker_spec()                           Return name and spec for the next worker
scale([n, memory, cores])                   Scale cluster to n workers
scale_up([n, memory, cores])                Scale cluster to n workers
sync(func, *args[, asynchronous, ...])      Call func with args synchronously or asynchronously depending on the calling context
wait_for_workers(n_workers[, timeout])      Blocking call to wait for n workers before continuing
close
get_cloud_init
logs
render_cloud_init
render_process_cloud_init
scale_down
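The methods and attributes listed above can be combined to resize a running cluster. The helper below is a small sketch that relies only on scale, wait_for_workers and scheduler_address as named in this reference. The helper name and the 300-second default timeout are assumptions chosen for illustration.

```python
def scale_and_wait(cluster, n_workers: int, timeout: float = 300):
    """Request n_workers workers and block until they have connected.

    `cluster` is expected to be an IBMCodeEngineCluster (or any object
    exposing the same scale / wait_for_workers interface).
    """
    # Ask the cluster manager for the desired number of workers.
    cluster.scale(n_workers)
    # Block until that many workers are connected or the timeout expires.
    cluster.wait_for_workers(n_workers, timeout=timeout)
    return cluster.scheduler_address
```

Waiting explicitly before submitting work avoids the first tasks being scheduled while workers are still starting.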