Getting started --------------- Architecture overview ~~~~~~~~~~~~~~~~~~~~~ Gnocchi consists of several services: a HTTP REST API (see :doc:`rest`), an optional statsd-compatible daemon (see :doc:`statsd`), and an asynchronous processing daemon (named `gnocchi-metricd`). Data is received via the HTTP REST API or statsd daemon. `gnocchi-metricd` performs operations (statistics computing, |metric| cleanup, etc...) on the received data in the background. .. image:: _static/architecture.svg :align: center :width: 95% :alt: Gnocchi architecture .. image source: https://docs.google.com/drawings/d/1aHV86TPNFt7FlCLEjsTvV9FWoFYxXCaQOzfg7NdXVwM/edit?usp=sharing All those services are stateless and therefore horizontally scalable. Contrary to many time series databases, there is no limit on the number of `gnocchi-metricd` daemons or `gnocchi-api` endpoints that you can run. If your load starts to increase, you just need to spawn more daemons to handle the flow of new requests. The same applies if you want to handle high-availability scenarios: just start more Gnocchi daemons on independent servers. As you can see on the architecture diagram above, there are three external components that Gnocchi needs to work correctly: - An incoming measure storage - An aggregated metric storage - An index Those three parts are provided by drivers. Gnocchi is entirely pluggable and offer different options for those services. Incoming and storage drivers ++++++++++++++++++++++++++++ Gnocchi can leverage different storage systems for its incoming |measures| and aggregated |metrics|, such as: * File (default) * `Ceph`_ (preferred) * `OpenStack Swift`_ * `Amazon S3`_ * `Redis`_ Depending on the size of your architecture, using the file driver and storing your data on a disk might be enough. If you need to scale the number of server with the file driver, you can export and share the data via NFS among all Gnocchi processes. Ultimately, the S3, Ceph, and Swift drivers are more scalable storage options. Ceph also offers better consistency, and hence is the recommended driver. A typical recommendation for medium to large scale deployment is to use `Redis`_ as an incoming measure storage and `Ceph`_ as an aggregate storage. .. _`OpenStack Swift`: http://docs.openstack.org/developer/swift/ .. _`Ceph`: https://ceph.com .. _`Amazon S3`: https://aws.amazon.com/s3/ .. _`Redis`: https://redis.io Indexer driver ++++++++++++++ You also need a database to index the resources and metrics that Gnocchi will handle. The supported drivers are: * `PostgreSQL`_ (preferred) * `MySQL`_ (at least version 5.6.4) The *indexer* is responsible for storing the index of all |resources|, |archive policies| and |metrics|, along with their definitions, types and properties. The indexer is also responsible for linking |resources| with |metrics| and the relationships of |resources|.. .. _PostgreSQL: http://postgresql.org .. _MySQL: http://mysql.org Understanding aggregation ~~~~~~~~~~~~~~~~~~~~~~~~~ The way data points are aggregated is configurable on a per-metric basis, using an archive policy. An archive policy defines which aggregations to compute and how many aggregates to keep. Gnocchi supports a variety of aggregation methods, such as minimum, maximum, average, Nth percentile, standard deviation, etc. Those aggregations are computed over a period of time (called granularity) and are kept for a defined timespan. Gnocchi uses three different back-ends for storing data: one for storing new incoming |measures| (the *incoming* driver), one for storing the |time series| |aggregates| (the *storage* driver) and one for indexing the data (the *index* driver). By default, the *incoming* driver is configured to use the same value as the *storage* driver. .. include:: include/term-substitution.rst