Commit b3a5fc0c authored by Daniele Venzano

Documentation updates

parent 947fb0af
...@@ -57,6 +57,7 @@ images: ...@@ -57,6 +57,7 @@ images:
script: script:
- docker build --pull -t zoerepo/${ZOE_TEST_IMAGE} -f Dockerfile.test . - docker build --pull -t zoerepo/${ZOE_TEST_IMAGE} -f Dockerfile.test .
- docker push zoerepo/${ZOE_TEST_IMAGE} - docker push zoerepo/${ZOE_TEST_IMAGE}
- docker rm -f nginx0-1-integration_test || true
api-test: api-test:
stage: integration-test stage: integration-test
......
...@@ -3,7 +3,7 @@ ...@@ -3,7 +3,7 @@
How to build a ZApp How to build a ZApp
=================== ===================
This tutorial will help you customize a Zoe Application starting from the `Tensorflow ZApp <https://gitlab.eurecom.fr/zoe/zapp-tensorflow>`_. At the end of the tutorial you will be able to customize existing ZApps, but you will also have understood the tools and process necessary to build new ZApps from scratch. This tutorial will help you customize a Zoe Application starting from the `PyDataSci ZApp <https://gitlab.eurecom.fr/zoe/pydatasci>`_. At the end of the tutorial you will be able to customize existing ZApps, but you will also have understood the tools and process necessary to build new ZApps from scratch.
To understand this tutorial you need: To understand this tutorial you need:
...@@ -17,7 +17,7 @@ A ZApp repository contains a number of well-known files, that are used to automa ...@@ -17,7 +17,7 @@ A ZApp repository contains a number of well-known files, that are used to automa
* ``root`` * ``root``
* ``docker/`` : directory containing Docker image sources (Dockerfiles and associated files) * ``<image name>/`` : directory containing Docker image sources (Dockerfiles and associated files)
* ``README-devel.md`` : documentation for the ZApp developer (optional) * ``README-devel.md`` : documentation for the ZApp developer (optional)
* ``README.md`` : documentation for the ZApp user * ``README.md`` : documentation for the ZApp user
* ``zapp.json`` : JSON ZApp description * ``zapp.json`` : JSON ZApp description
...@@ -29,24 +29,37 @@ A ZApp is composed by two main elements: ...@@ -29,24 +29,37 @@ A ZApp is composed by two main elements:
* a container image: the format depends on the container back-end in use, currently Docker is used for Zoe * a container image: the format depends on the container back-end in use, currently Docker is used for Zoe
* a JSON description: the magic ingredient that makes Zoe work * a JSON description: the magic ingredient that makes Zoe work
The JSON description contains all the information needed to start the containers that make up the application. Apart from some metadata, it contains a list of ``services``. Each service describes one or more (almost) identical containers. Please note that Zoe does not replicate services for fault tolerance, but to increase parallelism and performance (think in terms of additional Spark workers, for example). The JSON description contains all the information needed to start the containers that make up the application. Apart from some metadata, it contains a list of ``services``. Each service describes one or more containers. Please note that Zoe does not replicate services for fault tolerance, but to increase parallelism and performance (think in terms of additional Spark workers, for example).
The ZApp format is versioned. Zoe checks the version field first to make sure it can understand the description. This tutorial is based on version 3 of this format. The ZApp format is versioned. Zoe checks the version field first to make sure it can understand the description. This tutorial is based on version 3 of this format.
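The version check described above can be sketched in a few lines of Python. This is only an illustration, not Zoe's actual code: the key name ``version`` and the helper name ``check_version`` are assumptions, and only the version number 3 comes from the text.

```python
# Format version this tutorial is based on (from the text above).
SUPPORTED_VERSION = 3


def check_version(description):
    """Raise ValueError unless the description declares the supported version.

    The key name "version" is an assumption made for this sketch.
    """
    found = description.get("version")
    if found != SUPPORTED_VERSION:
        raise ValueError(
            f"unsupported ZApp format version {found!r}, expected {SUPPORTED_VERSION}"
        )


if __name__ == "__main__":
    check_version({"version": 3, "services": []})  # passes silently
```

Checking the version before anything else means a description in an unknown format is rejected early, before any other field is interpreted.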
The Tensorflow ZApp The PyDataSci ZApp
------------------- ------------------
Clone the `Tensorflow ZApp <https://gitlab.eurecom.fr/zoe/zapp-tensorflow>`_ repository. Clone the `PyDataSci ZApp <https://gitlab.eurecom.fr/zoe/pydatasci>`_ repository.
The ZApp uses the standard Tensorflow image released by Google. The image contains Python, the Tensorflow library and a Jupyter Notebook. The repository actually contains two ZApps, a normal one and a GPU-enabled one. Both contain the same set of libraries; the only difference is in the NVIDIA drivers and associated libraries.
The ``tf-google.json`` file contains the JSON description of the ZApp. The format of this file is described in the :ref:`zapp_format` document. The ZApps use custom images for Jupyter Notebooks containing many data science related libraries, including Tensorflow and PyTorch.
This ZApp has a single service and is a very good example of how to use a pre-existing image on the Docker Hub for execution in Zoe. The ``gen_json.py`` script generates the two JSON descriptions for Zoe. The format of these files is described in the :ref:`zapp_format` document. Each description contains a single service, the one running the Notebook.
The ``image`` field points to an image name that the Zoe back-end is able to understand. Managing Docker images is outside the scope of Zoe: ideally you have in place, in your cluster, a system that distributes the images on all the nodes for fast ZApp start-up times and that keeps them updated, to make sure new versions with bug fixes are made automatically available. The ``.gitlab.yml`` file contains the GitLab CI description that we use at Eurecom to automatically deploy new versions of the ZApp in our cluster. In the description, the ``image`` field points to an image name that the Zoe back-end is able to understand. Managing Docker images is outside the scope of Zoe: ideally you have in place, in your cluster, a system that distributes the images on all the nodes for fast ZApp start-up times and that keeps them updated, to make sure new versions with bug fixes are made automatically available. The ``.gitlab.yml`` file contains the GitLab CI description that we use at Eurecom to automatically deploy new versions of the ZApp in our cluster.
The user is also able to override the ``command``: this way the Notebook is not started and the user command is executed instead, effectively transforming the ZApp into a batch one able to run non-interactive scripts. The user is also able to override the ``command``: this way the Notebook is not started and the user command is executed instead, effectively transforming the ZApp into a batch one able to run non-interactive scripts.
The ``manifest.json`` file describes the ZApp in terms of the ZApp Shop in the Zoe web interface. It contains the logo and usage instructions file names and options that are presented to the user when she wants to start the ZApp. The ``manifest.json`` file describes the ZApp in terms of the ZApp Shop in the Zoe web interface. It contains references to the logo file, documentation to show on the web page and user-visible parameters that are presented to the user when she wants to start the ZApp.
The manifest and the ZApp Shop are documented in the :ref:`install` document. The manifest and the ZApp Shop are documented in the :ref:`install` document.
Adding a library
----------------
To add a new library to the PyDataSci ZApps, you need to open the ``Dockerfile`` in each of the two directories (``pydatasci`` and ``pydatasci-gpu``). Always reproduce the changes in the two files to keep the two versions consistent.
In the Dockerfile you will find a list of libraries installed via ``pip`` and below a list of packages installed via ``apt-get install``. Add your library to the right place and save your changes.
To build your image you need to call Docker. Depending on how your Zoe deployment has been done this could be more or less automated. To do it by hand, you need to run the following command::
docker build -t pydatasci(-gpu):test pydatasci(-gpu)/
The build process needs to have the Zoe base images already built and available in the system. These can be found in `this repository <https://gitlab.eurecom.fr/zoe-apps/base-images>`_.
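The ``pydatasci(-gpu)`` notation in the command above is shorthand for two separate builds. A minimal Python sketch that spells out both commands, assuming it is run from the repository root with Docker installed; the helper name ``build_commands`` is hypothetical and the ``test`` tag is taken from the command above:

```python
import shlex

# The two image directories named in the tutorial.
IMAGE_DIRS = ["pydatasci", "pydatasci-gpu"]


def build_commands(tag="test"):
    """Return one `docker build` argument list per image directory."""
    return [
        ["docker", "build", "-t", f"{name}:{tag}", f"{name}/"]
        for name in IMAGE_DIRS
    ]


if __name__ == "__main__":
    # Only print the commands here; to actually build, pass each list to
    # subprocess.run(cmd, check=True).
    for cmd in build_commands():
        print(shlex.join(cmd))
```

Generating both commands from one list keeps the normal and GPU builds in step, in the same spirit as applying every Dockerfile change to both directories.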
...@@ -142,6 +142,8 @@ The Elastic scheduler together with the DockerEngine back-end will behave in the ...@@ -142,6 +142,8 @@ The Elastic scheduler together with the DockerEngine back-end will behave in the
* for memory, a soft limit will be set at the "min" resource level and a hard limit to the amount set in the ``max-memory-limit`` option. See Docker documentation about the exact definitions of soft and hard limits * for memory, a soft limit will be set at the "min" resource level and a hard limit to the amount set in the ``max-memory-limit`` option. See Docker documentation about the exact definitions of soft and hard limits
* for cores, they will be allocated dynamically and automatically: a service that has cores.min set to 4 will have at least 4 cores. If more are available on the node it is running on, more will be given * for cores, they will be allocated dynamically and automatically: a service that has cores.min set to 4 will have at least 4 cores. If more are available on the node it is running on, more will be given
Additionally, an optional parameter for shared memory can be specified. This does not have a minimum and maximum, but it is a simple value in bytes. The key is ``shm_size`` and is equivalent to the Docker option ``--shm-size``. It may be useful to tune for GPU applications. The parameter is understood only by the DockerEngine back-end.
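As a rough illustration of the shared memory parameter, the following Python sketch builds a resources fragment and prints it as JSON. Only the ``shm_size`` key and its unit (bytes) come from the text; the nesting under ``resources``, the ``memory`` and ``cores`` sub-keys, the placement of ``shm_size`` next to them and all the numbers are assumptions made for the example.

```python
import json

GIB = 1024 ** 3  # bytes in one gibibyte


def gpu_service_resources(shm_gib=2):
    """Return an illustrative resources fragment for a GPU service.

    The layout around "shm_size" is assumed, not taken from the Zoe sources.
    """
    return {
        "resources": {
            "memory": {"min": 8 * GIB, "max": 16 * GIB},
            "cores": {"min": 4, "max": 8},
            # A single value in bytes, with no min/max pair.
            "shm_size": shm_gib * GIB,
        }
    }


if __name__ == "__main__":
    print(json.dumps(gpu_service_resources(), indent=4))
```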
startup_order startup_order
^^^^^^^^^^^^^ ^^^^^^^^^^^^^
...@@ -170,6 +172,11 @@ array of strings ...@@ -170,6 +172,11 @@ array of strings
This entry is optional. Labels will be used by the scheduler to make placement decisions for services. Services that have labels "ssd" and "gpu" will be placed only on hosts declared with both labels. If no hosts with "ssd" and "gpu" are available, the ZApp is left in the queue. This entry is optional. Labels will be used by the scheduler to make placement decisions for services. Services that have labels "ssd" and "gpu" will be placed only on hosts declared with both labels. If no hosts with "ssd" and "gpu" are available, the ZApp is left in the queue.
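The placement rule for labels amounts to a subset test: a host qualifies only if it declares every label the service asks for. A small Python sketch of that rule follows. It is not the Zoe scheduler code, and the function name and host names are hypothetical.

```python
def eligible_hosts(service_labels, hosts):
    """Return the names of hosts that declare all of the service's labels.

    `hosts` maps a host name to the list of labels declared for it.
    """
    required = set(service_labels)
    return [name for name, labels in hosts.items() if required <= set(labels)]


if __name__ == "__main__":
    cluster = {
        "node1": ["ssd"],
        "node2": ["ssd", "gpu"],
        "node3": ["gpu", "ssd", "infiniband"],
    }
    print(eligible_hosts(["ssd", "gpu"], cluster))
```

An empty result corresponds to the case described above, where the ZApp is left in the queue.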
network
^^^^^^^
Name of the network to connect this service (containers) to. This parameter is optional and defaults to empty, meaning attach to the default network. In some cases it may be interesting to connect an entire ZApp or part of it to a different virtual network, specified via the back-end.
ports ports
^^^^^ ^^^^^
...@@ -210,6 +217,13 @@ number ...@@ -210,6 +217,13 @@ number
The port number where the service is listening for connections. The external (user-visible) port number will be chosen by the back-end. The port number where the service is listening for connections. The external (user-visible) port number will be chosen by the back-end.
proxy
^^^^^
boolean
Whether to tell the reverse proxy to dynamically generate an external URL for this endpoint. The parameter is optional and defaults to false.
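A hedged Python sketch of a port entry that carries the ``proxy`` flag. The key names ``name`` and ``port_number`` and the helper ``port_entry`` are assumptions for illustration; only the ``proxy`` key, its boolean type and its default of false come from the text.

```python
import json


def port_entry(name, port_number, proxy=False):
    """Return an illustrative port description.

    "proxy" defaults to False, matching the documented default.
    """
    return {"name": name, "port_number": port_number, "proxy": proxy}


if __name__ == "__main__":
    # Ask the reverse proxy to generate an external URL for the Notebook port.
    print(json.dumps(port_entry("notebook", 8888, proxy=True), indent=4))
```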
Example Example
------- -------
.. code-block:: json .. code-block:: json
......