Commit a2f55e82 authored by Daniele Venzano's avatar Daniele Venzano

Update the documentation to have consistent naming

parent eee91d87
@@ -13,6 +13,8 @@ Zoe can use any Docker image, but we provide some for the pre-configured applica
- Docker images: https://github.com/DistributedSystemsGroup/zoe-docker-images
We currently do not provide a Dockerfile as we did in the past. This is only temporary and soon we will have autobuilt images for each Zoe component.
Repository contents
-------------------
@@ -21,6 +23,7 @@ Repository contents
- `scripts`: Scripts used to test Zoe images outside of Zoe
- `zoe_cmd`: Command-line client
- `zoe_lib`: Client-side library; it also contains some modules needed by the observer and the scheduler processes
- `zoe_logger`: Optional Kafka producer for Docker container logs
- `zoe_observer`: The Observer process that monitors Swarm and informs the scheduler of various events
- `zoe_scheduler`: The core of Zoe, the server process that listens for client requests and creates the containers on Swarm
- `zoe_web`: The web client interface
@@ -35,4 +38,4 @@ Zoe is licensed under the terms of the Apache 2.0 license.
:target: https://requires.io/github/DistributedSystemsGroup/zoe/requirements/?branch=master
:alt: Requirements Status
.. |Travis build| image:: https://travis-ci.org/DistributedSystemsGroup/zoe.svg
:target: https://travis-ci.org/DistributedSystemsGroup/zoe
Architecture
============
The main Zoe Components are:
* zoe master: the core component that performs application scheduling and talks to Swarm
* zoe observer: listens to events from Swarm and looks for idle resources to free automatically
* zoe-web: the web client interface
* zoe: command-line client
The command line client and the web interface are the user-facing components of Zoe, while the master and the observer are the back ends.
The Zoe master is the core component of Zoe and communicates with the clients by using a REST API. It manages users, applications and executions.
Users submit *application descriptions* for execution. Inside the Master, a scheduler keeps track of available resources and execution requests, and applies a
scheduling policy to decide which requests should be satisfied as soon as possible and which ones can be deferred for later.
The master also talks to Docker Swarm to create and destroy containers and to read monitoring information used to schedule applications.
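As a toy illustration of what "deciding which requests should be satisfied as soon as possible and which ones can be deferred" means (this is not Zoe's actual algorithm; all names and numbers below are invented), a single greedy scheduling pass over pending requests could look like:

```python
# Hypothetical sketch of one scheduling pass: greedily start the requests
# that fit the currently available memory and defer the rest. Zoe's real
# policy is more sophisticated; this only illustrates the idea.

def schedule_pass(pending, available_memory):
    """Return (to_start, deferred) given (name, memory) requests and a memory budget."""
    to_start, deferred = [], []
    for name, memory in pending:
        if memory <= available_memory:
            to_start.append(name)
            available_memory -= memory
        else:
            deferred.append(name)
    return to_start, deferred

started, deferred = schedule_pass([("spark-1", 8), ("spark-2", 16), ("notebook", 2)], 12)
print(started, deferred)  # ['spark-1', 'notebook'] ['spark-2']
```

A real policy would also weigh fairness, priorities and monitoring data read from Swarm, but the start-or-defer decision structure is the same.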
Application descriptions
------------------------
Application descriptions are at the core of Zoe. They are likely to evolve in time to include more information that needs to be passed to the scheduler.
The Zoe command-line client can load a description from a JSON file, and a number of predefined application descriptions can be exported to be modified
and customized.
Any number of processes can be put in a Zoe application description. The format supports complex scenarios mixing any kind of software that
can be run in Docker containers.
Currently they are composed of a set of generic attributes that apply to the whole Zoe Application and a list of Zoe Frameworks. Each Framework is composed of Zoe Services, which describe actual Docker containers. The composition of Frameworks and Services is described by a dependency tree.
These descriptions are strictly linked to the Docker images used in the Service descriptions, as they specify environment variables and commands to be executed. We have successfully used third-party images, demonstrating the generality of Zoe's approach.
Please note that this documentation refers to the full Zoe Application description, which is not yet fully implemented in the actual code.
You can use the ``zoe.py pre-app-list`` and ``zoe.py pre-app-export`` commands to export a JSON-formatted application description to use as a template.
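As a hedged sketch of consuming such an exported template (the field names below are invented for illustration; the actual schema is still evolving, as noted above), loading a description from JSON could look like:

```python
# Hypothetical Zoe Application description: an Application contains
# Frameworks, each Framework contains Services, and each Service maps to
# one Docker container. Field names here are illustrative, not the real
# schema.
import json

template = """
{
  "name": "spark-example",
  "frameworks": [
    {"name": "spark-cluster",
     "services": [{"name": "spark-master", "docker_image": "zoerepo/spark-master"}]}
  ]
}
"""

app = json.loads(template)
print(app["name"])  # spark-example
```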
@@ -4,18 +4,23 @@ Zoe config files
================
Zoe config files have a simple format of ``<option name> = <value>``. Dash characters can be used for comments.
Each Zoe component has its own configuration file, as described in the following sections:
zoe-master.conf
---------------
* ``debug = <true|false>`` : enable or disable debug log output
* ``swarm = zk://zk1:2181,zk2:2181,zk3:2181`` : connection string to the Swarm API endpoint. It can be expressed as a plain HTTP URL, or as a ZooKeeper node list when Swarm is configured for HA.
* ``private-registry = <address:port>`` : address of the private registry containing Zoe images
* ``state-dir = /var/lib/zoe`` : Directory where all state and other data (such as execution logs) are saved.
* ``zoeadmin-password = changeme`` : Password for the zoeadmin user
* ``deployment-name = devel`` : name of this Zoe deployment. Can be used to have multiple Zoe deployments using the same Swarm (devel and prod, for example)
* ``listen-address`` : address Zoe will use to listen for incoming connections to the REST API
* ``listen-port`` : port Zoe will use to listen for incoming connections to the REST API
* ``influxdb-dbname = zoe`` : Name of the InfluxDB database to use for storing metrics
* ``influxdb-url = http://localhost:8086`` : URL of the InfluxDB service
* ``influxdb-enable = False`` : Enable metric output towards InfluxDB
* ``passlib-rounds = 60000`` : Number of hashing rounds for passwords; has a severe performance impact on each API call
* ``gelf-address = udp://1.2.3.4:1234`` : Enable Docker GELF log output to this destination
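Putting some of these options together, a minimal ``zoe-master.conf`` could look like the following (every value here is illustrative, not a recommended default)::

    debug = false
    swarm = http://swarm-manager:2375
    deployment-name = devel
    listen-address = 0.0.0.0
    listen-port = 4850
    state-dir = /var/lib/zoe
    gelf-address = udp://logs.example.com:12201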
zoe-observer.conf
@@ -23,12 +28,27 @@ zoe-observer.conf
* ``debug = <true|false>`` : enable or disable debug log output
* ``swarm = zk://zk1:2181,zk2:2181,zk3:2181`` : connection string to the Swarm API endpoint. It can be expressed as a plain HTTP URL, or as a ZooKeeper node list when Swarm is configured for HA.
* ``zoeadmin-password = changeme`` : Password for the zoeadmin user
* ``deployment-name = devel`` : name of this Zoe deployment. Can be used to have multiple Zoe deployments using the same Swarm (devel and prod, for example)
* ``master-url = http://<address:port>`` : address of the Zoe Master REST API
* ``spark-activity-timeout = <seconds>`` : number of seconds to wait before an inactive Spark cluster is automatically terminated; this is done only for guest users
* ``loop-time = 300`` : time in seconds between successive checks for idle applications that can be automatically terminated
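The idle check implied by ``spark-activity-timeout`` and ``loop-time`` can be sketched as follows (function and field names are hypothetical, not Zoe's actual code): every ``loop-time`` seconds, the Observer would look for guest clusters that have been inactive longer than the timeout.

```python
# Illustrative sketch of the Observer's idle check: find guest clusters
# whose last activity is older than the timeout. Names are invented.

def find_expired(clusters, timeout, now):
    """Return names of guest clusters idle for more than `timeout` seconds."""
    return [c["name"] for c in clusters
            if c["guest"] and now - c["last_activity"] > timeout]

clusters = [
    {"name": "spark-a", "guest": True, "last_activity": 0},
    {"name": "spark-b", "guest": True, "last_activity": 900},
    {"name": "spark-c", "guest": False, "last_activity": 0},
]
print(find_expired(clusters, timeout=600, now=1000))  # ['spark-a']
```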
zoe-web.conf
------------
* ``debug = <true|false>`` : enable or disable debug log output
* ``zoeadmin-password = changeme`` : Password for the zoeadmin user
* ``listen-address`` : address Zoe will use to listen for incoming connections to the web interface
* ``listen-port`` : port Zoe will use to listen for incoming connections to the web interface
* ``master-url = http://<address:port>`` : address of the Zoe Master REST API
zoe-logger.conf
---------------
This component is optional.
* ``debug = <true|false>`` : enable or disable debug log output
* ``deployment-name = devel`` : name of this Zoe deployment. Can be used to have multiple Zoe deployments using the same Swarm (devel and prod, for example)
* ``influxdb-dbname = zoe`` : Name of the InfluxDB database to use for storing metrics
* ``influxdb-url = http://localhost:8086`` : URL of the InfluxDB service
* ``influxdb-enable = False`` : Enable metric output towards InfluxDB
* ``kafka-broker = 1.2.3.4:9092``: Address of the Kafka broker to send logs to
@@ -39,10 +39,23 @@ Contents:
install
config_file
logging
architecture
vision
contributing
A note on terminology
---------------------
We are spending a lot of effort to use consistent naming throughout the documentation, the software, the website and all the other resources associated with Zoe. Check the :ref:`architecture` document for the details, but here is a quick reference:
* Zoe Components: the Zoe processes, the Master, the Observer, the web interface, etc.
* Zoe Applications: a composition of Zoe Frameworks; the highest-level entity in the application descriptions that the user submits to Zoe
* Zoe Frameworks: a composition of Zoe Services, used to describe re-usable pieces of Zoe Applications, such as a Spark cluster
* Zoe Services: map one-to-one to Docker containers; each describes a single service/process tree running in an isolated container
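The naming hierarchy above can be sketched as plain nested data: one Application holds Frameworks, each Framework holds Services, and each Service corresponds to exactly one container (all names below are illustrative only):

```python
# Illustrative nesting of the Zoe terminology; not an actual Zoe data
# structure. One container is created per Service.

application = {
    "name": "data-analysis",
    "frameworks": [
        {"name": "spark-cluster",
         "services": ["spark-master", "spark-worker-1", "spark-worker-2"]},
        {"name": "notebook", "services": ["jupyter"]},
    ],
}

containers = [s for f in application["frameworks"] for s in f["services"]]
print(len(containers))  # 4 containers, one per Service
```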
Contacts
========
@@ -3,12 +3,13 @@ Installing Zoe
Zoe components:
* Master
* Observer
* logger
* web client
* command-line client
Zoe is written in Python and uses the ``requirements.txt`` file to list the package dependencies needed by all components of Zoe. Not all of them are needed in all cases; for example, the ``kazoo`` library is required only if you use ZooKeeper to manage Swarm high availability.
Requirements
------------
@@ -20,6 +21,7 @@ Zoe is written in Python 3. Development happens on Python 3.4, but we test also
Optional:
* A Docker registry containing Zoe images for faster container startup times
* A logging pipeline able to receive GELF-formatted logs, or a Kafka broker
Swarm/Docker
------------
@@ -48,7 +50,7 @@ The images used by Zoe are available on the Docker Hub:
* https://hub.docker.com/r/zoerepo/
Since the Docker Hub can be slow, we strongly suggest setting up a private registry. The ``build_images.sh`` script in the
`zoe-docker-images <https://github.com/DistributedSystemsGroup/zoe-docker-images>`_ repository can help you populate the registry
bypassing the Hub.
Container logs
==============
By default Zoe does not concern itself with the output of container processes. The logs can be retrieved with the usual Docker command ``docker logs`` while a container is alive, and afterwards they are lost forever.
Using the ``gelf-address`` option of the Zoe Master process, Zoe can configure Docker to send the container output to an external destination in GELF format. GELF is the richest format supported by Docker and can be ingested by a number of tools such as Graylog and Logstash. When that option is set, all containers created by Zoe will send their output (standard output and standard error) to the specified destination.
Docker is instructed to add all Zoe-defined tags to the GELF messages, so that they can be aggregated by Zoe Application, Zoe user, etc.
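With ``gelf-address`` set, the effect on each container is roughly equivalent to starting it by hand with Docker's GELF log driver (the flags below are shown for illustration; Zoe configures this through the Swarm API, not the CLI)::

    docker run --log-driver gelf \
               --log-opt gelf-address=udp://1.2.3.4:1234 \
               zoerepo/spark-worker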
Zoe also provides a Zoe Logger process, in case you prefer to use Kafka in your log pipeline. Each container's output is sent to its own topic, which Kafka retains for seven days by default. With Kafka you can also monitor the container output in real time, for example to debug your container images running in Zoe. In this case GELF is converted to a syslog-like format for easier handling.
The logger process is very small and simple; you can modify it to convert logs to any format and send them to any destination you prefer.
@@ -7,7 +7,7 @@ The fundamental idea of Zoe is that a user who wants to run data analytics applicat
of RAM a Spark Executor should use, how many cores are available in the system or even how many worker nodes should be used to meet an execution deadline.
Moreover we feel that there is a lack of solutions in the field of private clouds, where resources are not infinite and data layers (data-sets) may be shared between
different users. All the current Open Source solutions we are aware of target the public cloud use case and try, more or less, to mimic what Amazon and other big
names are doing in their data-centers.
Zoe strives to satisfy the following requirements:
@@ -17,7 +17,11 @@ Zoe strives to satisfy the following requirements:
* short (a few seconds) reaction times to user requests or other system events
* smart queuing and scheduling of applications when resources are critical
Kubernetes, OpenStack Sahara, Mesos and YARN are the projects that, each in its own way, try to solve at least part of our needs.
Kubernetes (Borg)
-----------------
Kubernetes is a very complex system, both to deploy and to use. It takes some of its architectural principles from Google Borg and targets datacenters with vast amounts of resources. We feel that, while Kubernetes can certainly run analytic services in containers, it does so at a very high complexity cost for smaller setups. Moreover, in our opinion, certain scheduler choices in how preemption is managed do not apply well to environments with a limited set of users and compute resources, causing less-than-optimal resource usage.
OpenStack Sahara
----------------