A tool that simplifies training DRL agents for cloud resource management by bridging CloudSim Plus with Gymnasium

tgasla/rl-cloudsimplus

Installation

1. Install Docker

https://docs.docker.com/get-docker/

Warning

If you install Docker Desktop for macOS, make sure you allocate enough memory to your containers by going to Settings... > Resources and increasing the memory limit.

2. Install Docker Compose

https://docs.docker.com/compose/install/

3. Install Java OpenJDK 21

  • For Debian-based distros, install the openjdk-21-jdk and openjdk-21-jre packages:
sudo apt install openjdk-21-jdk openjdk-21-jre
  • For macOS, use Homebrew:
brew install openjdk@21

4. Set the JAVA_HOME environment variable to the right path

Important

The exact path varies by distribution and architecture.

  • For Linux
export JAVA_HOME=/usr/lib/jvm/java-21-openjdk-<arch>
  • For macOS
    • Follow the instructions brew prints after installing openjdk.

Note

Setting JAVA_HOME this way makes Java 21 the default Java version in your shell.
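As a sketch of the step above (the Linux path below is illustrative; check /usr/lib/jvm for the actual directory name on your system), JAVA_HOME can be set like this:

```shell
# Sketch: set JAVA_HOME for the current shell session.
if [ "$(uname)" = "Darwin" ]; then
  # macOS: /usr/libexec/java_home resolves the installed JDK 21
  export JAVA_HOME="$(/usr/libexec/java_home -v 21)"
else
  # Linux: directory name varies by distro and architecture
  export JAVA_HOME="/usr/lib/jvm/java-21-openjdk-amd64"
fi
echo "JAVA_HOME=$JAVA_HOME"
```

Add the export line to your shell profile (e.g. ~/.bashrc or ~/.zshrc) to make it persistent.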

Building the TensorBoard, Gateway, and RL Manager images

make build

Note

It is often useful to rebuild images one at a time, especially when a change affects only one part of the application. For example, after changing the gateway code, rebuild only the gateway image before running the application:

make build-gateway

Starting the TensorBoard dashboard

The project consists of three Docker images. The gateway and manager images contain the main application and are the Docker Compose services needed for every experiment we want to run. The TensorBoard image is the UI endpoint and helps us keep track of an experiment's progress. Because we do not want to shut down the visualization dashboard every time we stop an experiment, the TensorBoard image is not a Docker Compose service; it can be started as a standalone Docker container with the following command:

make run-tensorboard

Note

TensorBoard's default port has been overridden, so it uses port 80. If other processes are running on port 80 and you wish to change the port that TensorBoard uses, you can do so by editing the Makefile. You can check that the TensorBoard dashboard is running by visiting http://localhost.
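A quick way to check reachability from the terminal (assuming the dashboard is mapped to port 80 as noted above; curl must be installed):

```shell
# Probe the TensorBoard dashboard; -f makes curl fail on HTTP errors
if curl -fsS http://localhost/ >/dev/null 2>&1; then
  echo "TensorBoard is up"
else
  echo "TensorBoard is not reachable"
fi
```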

Editing the experiment configuration file

To run an experiment, first rename the file config-template.yml to config.yml and edit it to create the experiment scenario of your choice.

The configuration file is divided into three sections:

  • global: Global settings that apply to all experiments

  • common: Shared parameters used by all experiments

  • experiment_{id}: Specific parameters for individual experiments (e.g., experiment_1, experiment_2, etc.)

The global section controls high-level settings related to logging, GPU usage, and process output.

Key                  | Type                             | Description
attached             | [true|false]                     | Whether the terminal should attach to the experiment output.
gpu                  | [true|false]                     | Whether to use a GPU during experiments.
java_log_level       | [TRACE|DEBUG|INFO|WARNING|ERROR] | Logging verbosity level for Java components.
java_log_destination | [none|stdout|file|stdout-file]   | Where Java logs are written: none (no logging), stdout (printed to the terminal), file (written to a file), stdout-file (both).
junit_output_show    | [true|false]                     | Whether to print JUnit test results to stdout. Useful for debugging test failures.
  • Parameters shared by all experiments are specified under the common section; parameters unique to an experiment are defined under its experiment_{id} section.
    • If a parameter is specified in both the common and experiment sections, the experiment-level value takes effect and the common one is ignored.
  • To run multiple experiments in parallel, add as many experiment sections as you want, specifying the corresponding parameters for each experiment.
  • Each experiment must have a unique experiment ID, and each section must be named experiment_{id}. IDs should start at 1 and increment by 1.

There are three experiment modes: train, transfer, and test. When transfer or test mode is specified, the experiment must also define a train_model_dir key naming the directory of the trained agent model to use.
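As an illustrative sketch of the layout described above (the mode key name and the train_model_dir path format are assumptions; config-template.yml is the authoritative schema), a config.yml with two experiments might look like:

```yaml
# Illustrative only -- see config-template.yml for the real schema.
global:
  attached: true
  gpu: false
  java_log_level: INFO
  java_log_destination: stdout
  junit_output_show: false

common:
  # parameters shared by all experiments go here
  mode: train                   # key name assumed; one of train|transfer|test

experiment_1: {}                # inherits everything from common

experiment_2:
  mode: test                    # overrides the common value
  train_model_dir: experiment_1 # assumed value; required for test/transfer modes
```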

Running an experiment

After editing the configuration file, run the following command to start the experiment(s).

make run

CUDA GPU support

Experiments can also run on CUDA GPUs.

  • You need to have CUDA and the nvidia-container-toolkit installed on your system.
  • Restart the Docker daemon after installing the nvidia-container-toolkit.

Warning

If, after installing the nvidia-container-toolkit, you still cannot access the GPU inside the container, follow the steps below:

  1. Edit the /etc/nvidia-container-runtime/config.toml file, setting no-cgroups to false.
  2. Restart the docker daemon using: sudo systemctl restart docker
  3. Test by running sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi

Stopping the application

If you want to stop the application and clear all the dangling containers and volumes, run the following command:

make stop
