CVRG OpenTSDB Installation Guide

From CVRG Wiki


The following page is a walkthrough of installing OpenTSDB on a CentOS virtual machine. It is a variation on the installation described in OpenTSDB's documentation, found here: OpenTSDB Installation

System Requirements

OpenTSDB requires the following software:

Software Type          Vendor/Name   Required Version
Operating System       CentOS        6+
Runtime Environment    Java          1.6 or later
NoSQL tool             HBase         0.92 or later
Graphing tool          GNUPlot       4.2 or later
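
The version minimums in the table can be checked from a shell before installing anything. The following is a minimal sketch, not part of the original guide: version_ge is a hypothetical helper name, and the parsing assumes that java -version prints its usual version "x.y.z" banner.

```shell
#!/bin/sh
# Sketch: compare dotted version strings against the minimums listed above.
# version_ge is a hypothetical helper, not an OpenTSDB or HBase tool.

# Succeeds when version $1 is greater than or equal to version $2.
# Only the leading number of each dot-separated field is compared.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        n = split(a, x, "."); m = split(b, y, ".")
        max = (n > m) ? n : m
        for (i = 1; i <= max; i++) {
            xi = (i <= n) ? x[i] + 0 : 0
            yi = (i <= m) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

# Report the installed Java version, if any, against the 1.6 minimum.
if command -v java >/dev/null 2>&1; then
    # Assumes the banner contains something like: version "1.7.0_65"
    jv=$(java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
    if version_ge "$jv" 1.6; then
        echo "Java $jv meets the 1.6 minimum"
    else
        echo "Java $jv is older than 1.6"
    fi
else
    echo "java not found on PATH"
fi
```

The same helper could be reused for the HBase and GNUPlot minimums once their version strings have been extracted.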

HBase Installation

This is a variation on the instructions available at Quick Start - Standalone HBase.
You will need sudo access to perform all of the operations.

Download, Configure, and Start HBase

  1. Choose a download site from this list of Apache Download Mirrors.
    1. Click on the suggested top link. This will take you to a mirror of HBase Releases.
    2. Click on the folder named stable and then download the binary file that ends in .tar.gz to your local filesystem.
      Be sure to choose the version that corresponds with the version of Hadoop you are likely to use later.
      In most cases, you should choose the file for Hadoop 2, which will be called something like hbase-0.98.3-hadoop2-bin.tar.gz.
      Do not download the file ending in src.tar.gz for now.

  2. Copy the downloaded file to /opt, extract it there, and change to the newly-created directory.
    Replace <version> with the version you downloaded (for example, 0.98.3).

    $ sudo cp hbase-<version>-hadoop2-bin.tar.gz /opt
    $ cd /opt
    $ sudo tar xzvf hbase-<version>-hadoop2-bin.tar.gz
    $ cd hbase-<version>-hadoop2/

  3. For HBase 0.98.5 and later, you are required to set the JAVA_HOME environment variable before starting HBase.
    You can set the variable via your operating system's usual mechanism, but HBase provides a central mechanism, conf/hbase-env.sh.
    Edit this file, uncomment the line starting with JAVA_HOME, and set it to the appropriate location for your operating system.
    The JAVA_HOME variable should be set to a directory which contains the executable file bin/java.
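    Most modern Linux operating systems provide a mechanism, such as /usr/bin/alternatives on RHEL or CentOS, for transparently switching between versions of executables such as Java.
    In that case, you can set JAVA_HOME to the directory containing the symbolic link to bin/java, which is usually /usr.
    The example below assumes a JDK installed under /opt/java.

    export JAVA_HOME=/opt/java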

    Note

    These instructions assume that each node of your cluster uses the same configuration. If this is not the case, you may need to set JAVA_HOME separately for each node.

  4. Edit conf/hbase-site.xml, which is the main HBase configuration file.
    At this time, you only need to specify the directories on the local filesystem where HBase and ZooKeeper write data. By default, a new directory is created under /tmp.
    Many servers are configured to delete the contents of /tmp upon reboot, so you should store the data elsewhere.
    The following configuration stores HBase's data in /opt/hbase and ZooKeeper's data in /opt/zookeeper.
    Paste the <property> tags beneath the <configuration> tags, which should be empty in a new HBase install.

    Example hbase-site.xml for Standalone HBase

    <configuration>
      <property>
        <name>hbase.rootdir</name>
        <value>file:///opt/hbase</value>
      </property>
      <property>
        <name>hbase.zookeeper.property.dataDir</name>
        <value>/opt/zookeeper</value>
      </property>
    </configuration>

    You do not need to create the HBase data directory. HBase will do this for you. If you create the directory, HBase will attempt to do a migration, which is not what you want.

  5. The bin/start-hbase.sh script is provided as a convenient way to start HBase.
    Issue the command (you will likely need sudo, because the files under /opt were extracted as root), and if all goes well, a message is logged to standard output showing that HBase started successfully.
    You can use the jps command to verify that you have one running process called HMaster.
    In standalone mode HBase runs all daemons within this single JVM, i.e. the HMaster, a single HRegionServer, and the ZooKeeper daemon.
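
Two of the steps above lend themselves to a quick check from the shell: finding a JAVA_HOME candidate (step 3) and confirming that HMaster shows up in the jps output (step 5). The following is a minimal sketch under stated assumptions. Both function names are hypothetical helpers, readlink -f is assumed to be the GNU version shipped with CentOS, and jps is assumed to come from an installed JDK.

```shell
#!/bin/sh
# Sketch: derive a JAVA_HOME candidate and look for HMaster in jps output.
# java_home_from_path and has_process are hypothetical helper names.

# Strip the trailing /bin/java from a resolved java path.
java_home_from_path() {
    printf '%s\n' "${1%/bin/java}"
}

# Read jps-style output ("<pid> <name>" per line) on stdin and succeed
# if a process with the given name is listed.
has_process() {
    grep -q " $1\$"
}

if command -v java >/dev/null 2>&1; then
    # readlink -f follows the alternatives symlinks to the real binary.
    java_home_from_path "$(readlink -f "$(command -v java)")"
fi

if command -v jps >/dev/null 2>&1; then
    if jps | has_process HMaster; then
        echo "HMaster is running"
    else
        echo "HMaster is not running"
    fi
fi
```

Note that jps typically lists only the JVMs of the user who runs it, so if HBase was started with sudo, the check may need to run with sudo as well.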

Use HBase For the First Time

  1. Connect to HBase.

    Connect to your running instance of HBase using the hbase shell command, located in the bin/ directory of your HBase install.
    In this example, some usage and version information that is printed when you start HBase Shell has been omitted. The HBase Shell prompt ends with a > character.

    $ ./bin/hbase shell
    hbase(main):001:0>

  2. Display HBase Shell Help Text.

    Type help and press Enter, to display some basic usage information for HBase Shell, as well as several example commands.
    Notice that table names, rows, and columns must all be enclosed in quote characters.

  3. Create a table.

    Use the create command to create a new table. You must specify the table name and the ColumnFamily name.

    hbase> create 'test', 'cf'
    0 row(s) in 1.2200 seconds

  4. List Information About your Table

    Use the list command to list information about your table.

    hbase> list 'test'
    TABLE
    test
    1 row(s) in 0.0350 seconds

    => ["test"]

  5. Put data into your table.

    To put data into your table, use the put command.

    hbase> put 'test', 'row1', 'cf:a', 'value1'
    0 row(s) in 0.1770 seconds

    hbase> put 'test', 'row2', 'cf:b', 'value2'
    0 row(s) in 0.0160 seconds

    hbase> put 'test', 'row3', 'cf:c', 'value3'
    0 row(s) in 0.0260 seconds

    Here, we insert three values, one at a time.
    The first insert is at row1, column cf:a, with a value of value1.
    Columns in HBase consist of a column family prefix (cf in this example), followed by a colon and then a column qualifier suffix (a in this case).

  6. Scan the table for all data at once.

    One of the ways to get data from HBase is to scan.
    Use the scan command to scan the table for data.
    You can limit your scan, but for now, all data is fetched.

    hbase> scan 'test'
    ROW       COLUMN+CELL
    row1       column=cf:a, timestamp=1403759475114, value=value1
    row2       column=cf:b, timestamp=1403759492807, value=value2
    row3       column=cf:c, timestamp=1403759503155, value=value3
    3 row(s) in 0.0440 seconds

  7. Get a single row of data.

    To get a single row of data at a time, use the get command.

    hbase> get 'test', 'row1'
    COLUMN       CELL
    cf:a              timestamp=1403759475114, value=value1
    1 row(s) in 0.0230 seconds

  8. Disable a table.

    If you want to delete a table or change its settings, as well as in some other situations, you need to disable the table first, using the disable command.
    You can re-enable it using the enable command.

    hbase> disable 'test'
    0 row(s) in 1.6270 seconds

    hbase> enable 'test'
    0 row(s) in 0.4500 seconds

    Disable the table again if you tested the enable command above:

    hbase> disable 'test'
    0 row(s) in 1.6270 seconds

  9. Drop the table.

    To drop (delete) a table, use the drop command.

    hbase> drop 'test'
    0 row(s) in 0.2900 seconds

  10. Exit the HBase Shell.

    To exit the HBase Shell and disconnect from your cluster, use the quit command.
    HBase is still running in the background.
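
The interactive walkthrough above can also be run as a scripted smoke test, because HBase Shell reads commands from standard input. The following is a minimal sketch, not a definitive procedure: smoke_test_commands is a hypothetical helper name, and HBASE_HOME is assumed to point at the extracted HBase directory. When HBase is not found, the script only prints the commands it would send.

```shell
#!/bin/sh
# Sketch: emit the walkthrough's shell commands so they can be piped into
# "hbase shell". smoke_test_commands is a hypothetical helper name.

# Print HBase Shell commands that create, fill, scan, and drop a table.
# $1 = table name, $2 = column family name.
smoke_test_commands() {
    cat <<EOF
create '$1', '$2'
put '$1', 'row1', '$2:a', 'value1'
scan '$1'
disable '$1'
drop '$1'
quit
EOF
}

# Assumption: HBASE_HOME points at the extracted HBase directory.
if [ -x "${HBASE_HOME:-}/bin/hbase" ]; then
    smoke_test_commands test cf | "$HBASE_HOME/bin/hbase" shell
else
    # Without HBase available, just show what would be sent.
    smoke_test_commands test cf
fi
```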

Stop HBase

  1. In the same way that the bin/start-hbase.sh script is provided to conveniently start all HBase daemons, the bin/stop-hbase.sh script stops them.

    $ sudo ./bin/stop-hbase.sh
    stopping hbase....................
    $

  2. After issuing the command, it can take several minutes for the processes to shut down. Use the jps command to be sure that the HMaster and HRegionServer processes are shut down.
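
Because shutdown can take several minutes, polling jps in a loop is more convenient than re-running it by hand. The following is a minimal sketch: wait_for_exit is a hypothetical helper, and JPS_CMD is a hypothetical override hook added only so the loop can be tried without a JDK. It assumes jps is on the PATH; if jps cannot be run at all, the loop sees no output and reports the process as gone.

```shell
#!/bin/sh
# Sketch: poll jps until a named Java process disappears or a timeout passes.
# wait_for_exit and JPS_CMD are hypothetical names, not HBase features.

# $1 = process name as shown by jps, $2 = timeout in seconds (default 300).
# Returns 0 once the process is no longer listed, 1 on timeout.
wait_for_exit() {
    name=$1
    timeout=${2:-300}
    waited=0
    while ${JPS_CMD:-jps} | grep -q " $name\$"; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

if command -v jps >/dev/null 2>&1; then
    if wait_for_exit HMaster 300; then
        echo "HMaster has shut down"
    else
        echo "HMaster is still running after 300 seconds"
    fi
fi
```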

GNUPlot Installation

You will need administrative access on the Linux machine to do the following:

  1. To begin, you should check to see if GNUPlot is installed. You do so by executing the following command: rpm -q gnuplot
  2. If GNUPlot is not installed, then you would execute the following: yum install gnuplot
  3. When prompted, enter y to continue the installation.
  4. To verify that the installation worked, execute the following command: rpm -q gnuplot
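
The check-then-install steps above can be combined so that yum runs only when the package is missing. The following is a minimal sketch: ensure_package is a hypothetical helper name, and the -y flag is used to answer yum's confirmation prompt automatically instead of typing y.

```shell
#!/bin/sh
# Sketch: install an RPM package only when rpm reports it missing.
# ensure_package is a hypothetical helper name.

# $1 = package name.
ensure_package() {
    if rpm -q "$1" >/dev/null 2>&1; then
        echo "$1 already installed"
        return 0
    fi
    # -y answers the confirmation prompt automatically.
    sudo yum install -y "$1"
}

# Usage (requires a RHEL/CentOS host and sudo rights):
#   ensure_package gnuplot
```

Running rpm -q gnuplot afterwards, as in step 4, still serves as the final verification.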

Compiling OpenTSDB from the source/Starting it up

Compiling From Source

  1. Download the latest version using the git clone command, or download a release from the site or GitHub. Then run the build.sh script. This script runs all the steps needed to compile OpenTSDB: it runs ./bootstrap (only once, when you first check out the code), followed by ./configure and make. The output of the build process is put into a build folder, and the JARs required by OpenTSDB are downloaded.

    $ git clone git://github.com/OpenTSDB/opentsdb.git
    $ cd opentsdb
    $ ./build.sh

  2. If compilation was successful, you should have a tsdb jar file in ./build along with a tsdb script. You can now run the command-line tool by invoking ./build/tsdb, or you can run make install to install OpenTSDB on your system. Should you ever change your mind, make uninstall removes it again. If you need to distribute OpenTSDB to machines without an Internet connection, call ./build.sh dist to wrap the build directory into a tarball that you can then copy to additional machines. If you are installing OpenTSDB for the first time, you'll need to create the HBase tables using the script located at /usr/share/opentsdb/tools/create_table.sh (src/create_table.sh in the source tree). Follow the steps below.
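
A quick way to confirm that the build produced the artifacts described above is to look for them in the build directory. The following is a minimal sketch: build_ok is a hypothetical helper, and the tsdb-*.jar file name pattern is an assumption about how the jar is named.

```shell
#!/bin/sh
# Sketch: check a build directory for the artifacts the text mentions.
# build_ok is a hypothetical helper; the tsdb-*.jar pattern is an assumption
# about how the jar is named.

# $1 = build directory. Succeeds when the tsdb script and a jar are present.
build_ok() {
    dir=$1
    # The tsdb launcher script must exist and be executable.
    [ -x "$dir/tsdb" ] || return 1
    # At least one jar matching the assumed pattern must exist.
    for jar in "$dir"/tsdb-*.jar; do
        [ -f "$jar" ] && return 0
    done
    return 1
}

if build_ok ./build; then
    echo "build looks complete"
else
    echo "build artifacts missing in ./build"
fi
```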

Create Tables

  1. If this is the first time that you are running OpenTSDB with your HBase instance, you first need to create the necessary HBase tables. A simple script is provided to create the proper tables with the ability to enable or disable compression. Execute:

    $ env COMPRESSION=NONE HBASE_HOME=path/to/hbase-0.94.X ./src/create_table.sh

    where the COMPRESSION value is either NONE, LZO, GZ or SNAPPY. This will create four tables: tsdb, tsdb-uid, tsdb-tree and tsdb-meta. If you're just evaluating OpenTSDB, don't worry about compression for now. In production and at scale, make sure you use a valid compression library as it will save on storage tremendously.
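
Since only four COMPRESSION values are accepted, it can help to validate the value before handing it to the script. The following is a minimal sketch: valid_compression is a hypothetical helper name, and the script is assumed to be run from the opentsdb source directory with HBASE_HOME already set.

```shell
#!/bin/sh
# Sketch: validate COMPRESSION before running create_table.sh.
# valid_compression is a hypothetical helper name.

# Succeeds only for the four values the text lists.
valid_compression() {
    case "$1" in
        NONE|LZO|GZ|SNAPPY) return 0 ;;
        *) return 1 ;;
    esac
}

compression=${COMPRESSION:-NONE}
if ! valid_compression "$compression"; then
    echo "unsupported COMPRESSION value: $compression" >&2
elif [ -n "${HBASE_HOME:-}" ] && [ -x ./src/create_table.sh ]; then
    env COMPRESSION="$compression" HBASE_HOME="$HBASE_HOME" ./src/create_table.sh
else
    echo "set HBASE_HOME and run from the opentsdb source directory"
fi
```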

Start a TSD

  1. OpenTSDB 2.0 uses a configuration file that is shared between the daemon and the command-line tools. If you compiled from source, copy the ./src/opentsdb.conf file to a proper directory as documented in Configuration and edit the following required settings:

    tsd.http.cachedir - Path to write temporary files to
    tsd.http.staticroot - Path to the static GUI files found in ./build/staticroot
    tsd.http.request.enable_chunked = true
    tsd.http.request.max_chunk = 8388608
    tsd.storage.hbase.zk_quorum - If HBase and Zookeeper are not running on the same machine, specify the host and port here.

  2. With the config file written, you can start a tsd with the command:

    ./build/tsdb tsd

    Alternatively, you can also use the following commands to create a temporary directory and pass in only command line flags:

    tsdtmp=${TMPDIR-'/tmp'}/tsd    # For best performance, make sure
    mkdir -p "$tsdtmp"             # your temporary directory uses tmpfs
    ./build/tsdb tsd --port=4242 --staticroot=build/staticroot --cachedir="$tsdtmp" --zkquorum=myhost:2181

  3. At this point you can access the TSD's web interface through http://127.0.0.1:4242 (if it's running on your local machine).

    Note

    The cache directory stores temporary files generated when a graph is requested via the built-in GUI. These files should be purged periodically to free up space. OpenTSDB does not clean up after itself at this time, but a script is provided at tools/clean_cache.sh that should be run as a cron job at least once a day.
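
The settings listed in step 1 can be written out as a small configuration file. The following is a minimal sketch, not the shipped opentsdb.conf: write_tsd_conf is a hypothetical helper, the paths and the localhost:2181 quorum are placeholder assumptions, and tsd.network.port is added here as an assumption to mirror the --port=4242 flag used in step 2.

```shell
#!/bin/sh
# Sketch: write a minimal opentsdb.conf with the settings listed above.
# write_tsd_conf is a hypothetical helper; the paths and the localhost:2181
# quorum are placeholder assumptions.

# $1 = output file, $2 = cache directory, $3 = static root directory.
write_tsd_conf() {
    cat > "$1" <<EOF
tsd.network.port = 4242
tsd.http.cachedir = $2
tsd.http.staticroot = $3
tsd.http.request.enable_chunked = true
tsd.http.request.max_chunk = 8388608
tsd.storage.hbase.zk_quorum = localhost:2181
EOF
}

# Demonstration: write to a temporary file and print the result.
conf=$(mktemp)
write_tsd_conf "$conf" /tmp/tsd ./build/staticroot
cat "$conf"
rm -f "$conf"
```

In a real setup the file would be written to one of the locations described in Configuration rather than to a temporary file.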