Jethro Management Package for Ambari

This document describes the Jethro management package for Apache Ambari.

An overview of Ambari and of Jethro's management package for Ambari is followed by installation, monitoring, management, and uninstallation instructions.

Overview

In a nutshell, Ambari is an open-source management platform for provisioning, managing, monitoring and securing Apache Hadoop clusters. The goal of Jethro's Ambari service is to provide a simple integration with Ambari, so that HDP users can deploy and manage Jethro across their Hadoop clusters. Once integrated, users have a central location for monitoring their Jethro clusters in terms of security, cluster health and resource utilization.

Jethro Components

The Jethro components exposed in Ambari through the Jethro Ambari service are:

  1. Jethro Server - The query engine of Jethro. It can be deployed on any host (minimum 1 host). The more hosts this component runs on, the better the availability for BI users.
  2. Jethro Maint - The maintenance Linux service of Jethro. It can be deployed on any host (minimum 1 host). One component per Jethro instance should be active at any time.
  3. Jethro Scheduler - The scheduler Linux service of Jethro. It can be deployed on any host (minimum 1 host). At least one component per Jethro instance should be active at any time.
  4. Jethro Manager - A web interface, served by a Linux service, that gives Jethro's administrator wider management abilities. The web interface is also accessible outside Ambari's web UI. Only a single manager component can be deployed. Any host can be used, although installing it on the NameNode is recommended.

During the installation of the Jethro service for Ambari, the Ambari service advisor automatically recommends a basic deployment of all four components on the available hosts.

Jethro Metrics

  1. Number of running instances - Presented as a gadget by default.
  2. Jethro instances storage size - Presented as a gadget by default.
  3. Running maint service (per instance) - Used as an alert, raised when a Jethro instance does not have one active maint component on any of the hosts the instance is active on.
  4. Running load scheduler service (per instance) - Used as an alert, raised when a Jethro instance does not have at least one active scheduler component on any of the hosts the instance is active on.

Jethro Parameters

The parameters exposed through Ambari are the global parameters of Jethro's instances.

  1. Enable cubes - The default value is '1' (enabled). It represents the Jethro configuration parameter: dynamic.aggregation.auto.generate.enable.
  2. Cube queries server - The default value is 'auto' (localhost). With this value, each host functions as a query engine for cube creation and maintenance. To assign a specific host, replace this value with the desired host's details in the format <IP>:<PORT> (without the '<>'). The IP is the IP of the host that will be used as the cube queries server (it must have an active Jethro Server component), and the PORT is the port of the instance on that host.
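
The value format described for 'Cube queries server' can be checked before it is entered in Ambari. The following is a minimal sketch, not part of the Jethro package: the function name is hypothetical, and it only validates the shape of the value ('auto' or <IP>:<PORT>), not whether a Jethro Server is actually listening there.

```shell
# Hypothetical helper (not shipped with Jethro): succeeds when the value is
# either 'auto' or looks like <IP>:<PORT> with an IPv4 address.
is_valid_cube_server() (
  value="$1"
  [ "$value" = "auto" ] && exit 0
  # Four dot-separated numeric groups, a colon, then a numeric port.
  printf '%s\n' "$value" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}:[0-9]{1,5}$' || exit 1
  ip="${value%:*}"
  port="${value##*:}"
  # Each octet must be at most 255.
  IFS=.
  for octet in $ip; do
    [ "$octet" -le 255 ] || exit 1
  done
  # The port must be in the range 1-65535.
  [ "$port" -ge 1 ] && [ "$port" -le 65535 ]
)
```

For example, `is_valid_cube_server 10.0.0.5:9111` succeeds (the address and port here are placeholders), while `is_valid_cube_server '<10.0.0.5>:<9111>'` fails because the angle brackets were left in.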

Note

Please note:

  1. Editing Jethro's cube parameters through Ambari for more than one instance is not recommended. The query engine defined in 'Cube queries server' belongs to a single instance, and therefore cannot serve a different instance. The only value that fits all instances at the same time is 'auto'.
  2. Only instances that were deployed by Ambari can have their parameters updated through the Jethro Ambari service.

Working with multiple Jethro instances under Ambari

Jethro's Ambari service fully supports a single Jethro instance per host. However, running two or more Jethro instances on the same host is subject to a limitation:

You cannot deploy more than one Jethro instance to the same host.

The reason is that Ambari has fewer object levels than Jethro. While Jethro uses a 4-level object model (Application, Instance, Host, Service), Ambari uses a 3-level object model (Service, Host, Component). As a result, one level of Jethro's object model has to be dropped when it is represented in Ambari, which prevents deploying more than one Jethro instance per host.

Installation

Prerequisites

The integration is supported for:

  • Ambari 2.4 and above
  • CentOS/Red Hat 6 and above

Please note - If you are using a CentOS/Red Hat version earlier than 7.0, make sure to update the default links to the Jethro and Jethro Manager RPM files, while adding the Jethro service, to RPM links that support these OS versions. The default RPM links are for versions of Jethro that support CentOS/Red Hat 7.0 and up.
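
Whether the default RPM links can be kept depends on the OS major version of the hosts. The following is a minimal sketch of that check; the function names are hypothetical, and it assumes the release string has the usual form found in /etc/redhat-release, where the first number is the major version.

```shell
# Hypothetical helper: extract the major version (the first number) from a
# release string such as the content of /etc/redhat-release.
os_major_version() {
  printf '%s\n' "$1" | sed -n 's/^[^0-9]*\([0-9][0-9]*\).*/\1/p'
}

# Succeeds when the default RPM links (built for 7.0 and up) can be kept;
# fails when the links should be replaced, or when no version was found.
default_rpm_links_ok() {
  major="$(os_major_version "$1")"
  [ -n "$major" ] && [ "$major" -ge 7 ]
}
```

On a host, it could be used as `default_rpm_links_ok "$(cat /etc/redhat-release)" || echo "Update the RPM links"`.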

Management Package Installation

  1. Log in to your ambari-server as user root, or as a sudoer (in which case every command must be prefixed with sudo).

  2. Copy or download the Jethro management pack (mpack) to your ambari-server (jethro-mpack.tar.gz - do not decompress it).

  3. Install the Jethro management pack on your ambari-server:

    ambari-server install-mpack --mpack=<path to the mpack file>
  4. Restart the ambari-server:

    ambari-server restart
  5. Create an Extension Link, to link the installed extension to Ambari according to your HDP stack version:

    curl -u admin:admin -H 'X-Requested-By: ambari' -X POST -d '{"ExtensionLink": {"stack_name": "HDP", "stack_version":"<your stack version>", "extension_name": "JethroExt", "extension_version": "1.0.0"}}' http://localhost:8080/api/v1/links/

    For example, if you are running HDP 2.6.0:

    curl -u admin:admin -H 'X-Requested-By: ambari' -X POST -d '{"ExtensionLink": {"stack_name": "HDP", "stack_version":"2.6", "extension_name": "JethroExt", "extension_version": "1.0.0"}}' http://localhost:8080/api/v1/links/

     * To find the HDP stack version, open Ambari, click the Admin tab, and then select the Versions tab.
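
The request body for the Extension Link can be generated from the HDP version instead of being edited by hand. This is a minimal sketch with hypothetical function names. It assumes, based on the HDP 2.6.0 example above, that the stack version is the major.minor part of the HDP version.

```shell
# Reduce a full HDP version such as 2.6.0 to the major.minor form (2.6).
# Assumption: the stack version is always the first two version fields.
stack_version_short() {
  printf '%s\n' "$1" | cut -d. -f1,2
}

# Print the ExtensionLink JSON body for the given HDP version.
extension_link_payload() {
  printf '{"ExtensionLink": {"stack_name": "HDP", "stack_version":"%s", "extension_name": "JethroExt", "extension_version": "1.0.0"}}\n' \
    "$(stack_version_short "$1")"
}
```

The output can then be passed to the same curl command, for example: `curl -u admin:admin -H 'X-Requested-By: ambari' -X POST -d "$(extension_link_payload 2.6.0)" http://localhost:8080/api/v1/links/`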

Adding the Jethro Service to Ambari

After the installation of the package completes, you can add Jethro's service from Ambari's web interface:

  1. Navigate to Ambari's home page. Click 'Actions' (bottom-left on the services bar), and click 'Add Service'.

  2. Select the Jethro checkbox from the available services, and click 'Next'.

  3. Ambari will display all the components contained in the selected service. You can select a target host for each component.
    It is also possible to add more components (for multi-host deployments).

Jethro's Ambari Service Configs

Before every deployment, Jethro's Ambari configuration properties must be set.

When adding Jethro's Ambari service for the first time, you are required to define at least two mandatory parameters:

  1. A name for the Jethro Instance.
  2. A storage path on HDFS, to be used as the Instance's storage.
    It is assumed that the given HDFS path was already created before the instance is deployed, and that it is owned by user 'jethro'. The recommended path is /user/jethro/instances.

If the instance name provided already exists on the provided storage path, the Instance will be 'attached' to the host. Otherwise, the Instance will be created.
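
Since the HDFS storage path must exist and be owned by 'jethro' before deployment, it can be prepared ahead of time. The following is a minimal sketch, not an official Jethro procedure. The function name and the DRY_RUN switch are hypothetical, and it assumes the commands run as a user with sufficient HDFS permissions (typically the HDFS superuser). By default it only prints the commands.

```shell
# Hypothetical helper: print (DRY_RUN=1, the default) or run (DRY_RUN=0) the
# HDFS commands that create the instance storage path and set its owner.
prepare_instance_storage() (
  path="${1:-/user/jethro/instances}"   # recommended path from this document
  owner="${2:-jethro}"                  # default Jethro OS user
  run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
      printf '%s\n' "$*"
    else
      "$@"
    fi
  }
  run hdfs dfs -mkdir -p "$path"
  run hdfs dfs -chown "$owner" "$path"
)
```

Running `prepare_instance_storage` prints the two commands for review. Running `DRY_RUN=0 prepare_instance_storage` executes them.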

The configs screen remains accessible afterwards via the 'Configs' tab of the Jethro Ambari service.

The rest of the config values are set to defaults, and most of them can be edited if desired. Please note that Jethro's management package does not bundle the Jethro software RPM, so that the latest version of Jethro can always be supplied at installation time.

Full list of the config parameters:

Property | Config Group | Default Value | Notes
--- | --- | --- | ---
jethro RPM | jethro-config | latest Jethro RPM | RPM path to download Jethro software
jethro manager RPM | jethro-config | latest Jethro Manager RPM | RPM path to download Jethro Manager software
jethro user name | jethro-env | jethro | The OS user to be used for the installation of Jethro software on each host
Jethro Instance Cache Path | jethro-config | /home/jethro/instances_cache | The path on the host, which the Jethro instance will use for caching
Jethro Instance Storage | jethro-config | 10 | The maximum size of storage space on the host, which the Jethro instance will use for caching
jethro instance name | jethro-config | - | If the given instance name already exists, Jethro will attach it. Otherwise, it will be created.
jethro instance storage | jethro-config | - | The path to set up the instance storage on. It is assumed the given HDFS path was already defined prior to installing the Jethro service. The recommended path is /user/jethro/instances.
jethro.kerberos.keytab | jethro-env | none | Read only
jethro.kerberos.principal | jethro-env | none | Read only
jethro server PID directory | jethro-env | /var/run/jethro | Auto-configured upon installation, after that it becomes read-only.
jethro manager PID file | jethro-env | /opt/jethro/jethromng/pm2/pids/JethroManager-0.pid | Auto-configured upon installation, after that it becomes read-only.
jethro manager port | jethro-config | 9100 | Required for the quick link functionality. If the port is changed at jethro manager, the same value needs to be entered here as well.
Enable cubes | jethro-global | 1 | The actual Instance param name affected: dynamic.aggregation.auto.generate.enable
Cube queries server | jethro-global | "auto" | The actual Instance param name affected: dynamic.aggregation.auto.generate.execution.hosts

Completing the Installation

After the 'Review' step, clicking 'Deploy' starts the installation process for the selected components and hosts. The process is fully automated.

Once the installation process completes, the Jethro service appears on the left side of the screen, on the services bar.


Monitoring

Most of Jethro's monitoring in Ambari is done through the 'Summary' pane and the services bar.

The services bar indicates when there is an alert and prompts you to view it.

The summary pane shows the state of all the Jethro components deployed to the hosts, and allows viewing alerts and metrics.

The default metrics presented in the summary pane are:

  1. The number of active instances
  2. The utilization of Jethro's storage
  3. The total storage being used by Jethro's instances

Together, these indicators help in monitoring the health of the system, its current availability, and its future storage needs.

Alerts, as described in the Overview section of this document, can be viewed in detail from the summary pane.

Management

The Jethro Ambari service can be used to control the number of Jethro components deployed and in use at any time.

It also allows the user to perform maintenance actions on the Jethro Linux services, such as start, stop and restart, straight from the GUI.

By clicking 'Hosts' in the header menu and choosing a specific host, you can deploy Jethro components to that host.

Once a component is deployed to a host, you can click the component name and control its state on that host (Started/Stopped/Restarted).
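
Start and stop actions for a whole service can also be requested through Ambari's REST API, which represents a stopped service by the state INSTALLED and a running one by STARTED. The following is a minimal sketch that only builds the request body. The function name is hypothetical, and the service name JETHRO is an assumption taken from the database records shown later in this document.

```shell
# Hypothetical helper: print the JSON body for an Ambari service state change.
# 'start' maps to the Ambari state STARTED, 'stop' maps to INSTALLED.
service_state_payload() (
  case "$1" in
    start) state="STARTED" ;;
    stop)  state="INSTALLED" ;;
    *)     echo "usage: service_state_payload start|stop" >&2; exit 1 ;;
  esac
  printf '{"RequestInfo": {"context": "%s JETHRO via REST"}, "Body": {"ServiceInfo": {"state": "%s"}}}\n' \
    "$1" "$state"
)
```

The body would be sent with a PUT request to the service resource of your cluster, for example: `curl -u admin:admin -H 'X-Requested-By: ambari' -X PUT -d "$(service_state_payload stop)" http://localhost:8080/api/v1/clusters/<cluster name>/services/JETHRO`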

Uninstallation

  1. Select 'Jethro' from the side bar to reach the summary screen.
  2. For each 'Jethro Maint' component shown on the summary pane:
    1. Click on it to reach the screen that shows all the components on that host.
    2. Find the 'Jethro Maint' component, click the dropdown next to it, and choose 'Stop Jethro Metrics'.
    3. Go back to Jethro's summary pane and repeat for the next 'Jethro Maint' component, until all of them have been handled.
  3. Stop all Jethro components on all hosts. You can do that by selecting 'STOP' from the Jethro service summary page.
  4. Choose 'Delete Service' from the same menu.
  5. To remove the extension link, log in to your Ambari server machine, and run the following command with the proper parameters:

    curl -u admin:admin -H 'X-Requested-By: ambari' -X DELETE http://<server>:<port>/api/v1/links/<link_id>

    To find the <link_id>, send a GET request to the links URL (the same URL without the <link_id>).
    For example:

    curl http://admin:admin@localhost:8080/api/v1/links/
    
    
    {
      "href" : "http://localhost:8080/api/v1/links/",
      "items" : [
        {
          "href" : "http://localhost:8080/api/v1/links/1",
          "ExtensionLink" : {
            "extension_name" : "JethroExt",
            "extension_version" : "1.0.0",
            "link_id" : 1,
            "stack_name" : "HDP",
            "stack_version" : "2.6"
          }
        }
      ]
    }
    
    
    curl -u admin:admin -H 'X-Requested-By: ambari' -X DELETE http://localhost:8080/api/v1/links/1
  6. To uninstall the package, run:

    ambari-server uninstall-mpack --mpack-name=jethro-mpack
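
The <link_id> lookup in step 5 can be scripted. The following is a minimal sketch with a hypothetical function name. It assumes the response is pretty-printed one field per line, with "extension_name" appearing before "link_id" inside each item, as in the sample output above. It is a line-based filter, not a general JSON parser.

```shell
# Hypothetical helper: read the links response on stdin and print the link_id
# of the given extension (default: JethroExt).
jethro_link_id() {
  awk -v ext="${1:-JethroExt}" '
    # Remember whether the current item belongs to the wanted extension.
    /"extension_name"/ { match_ext = (index($0, "\"" ext "\"") > 0) }
    # Print only the digits of the link_id line for a matching item.
    /"link_id"/ && match_ext { gsub(/[^0-9]/, "", $0); print; match_ext = 0 }
  '
}
```

For example: `curl -s http://admin:admin@localhost:8080/api/v1/links/ | jethro_link_id`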

Known Issues & Limitations

Service Configuration Is Unavailable

Symptoms

After uninstalling and then reinstalling the Jethro Ambari service, no configurations (besides custom configurations) are available under the service → configuration section.

Root Cause

The Ambari server fails to clean up the custom service data after the service is uninstalled, so the custom service data remains stored in the Ambari DB.

Workaround

Delete the Jethro Ambari service data manually from the DB, as follows:

  1. First, uninstall the Jethro Ambari service from the Ambari server (services → Jethro → stop service → delete service).

  2. Log in to the Ambari DB:

    psql ambari -U ambari -W -p 5432

    (The default password is 'bigdata').

  3. Delete the Jethro Ambari service records:

    /***** Delete The Service Configurations ******/
    delete from serviceconfigmapping where service_config_id in (select service_config_id from serviceconfig where service_name='JETHRO');
    
    delete from serviceconfig where service_name='JETHRO';
    
    delete from clusterconfig where type_name like '%jethro%';
    
    /***** Delete The Service Components ******/
    delete from hostcomponentdesiredstate where service_name = 'JETHRO';
    
    delete from hostcomponentstate where service_name = 'JETHRO';
    
    delete from servicecomponentdesiredstate where service_name = 'JETHRO';
    
    
    /***** Delete The Service ******/
    delete from servicedesiredstate where service_name='JETHRO';
    
    delete from clusterservices where service_name='JETHRO';
  4. Exit the Ambari DB CLI (with \q or CTRL+D).

  5. Log in as root, then restart the Ambari server:

    ambari-server restart
  6. Make sure you see the following output at the end of the restart procedure:

Jethro services are still running on hosts after Jethro Ambari service is removed

Symptoms

After successfully deleting the Jethro Ambari service from the Ambari UI, Jethro and Jethro Manager are still up and running on the deployed hosts.

Root Cause

Ambari doesn't support uninstallation hooks for custom services.

See: https://issues.apache.org/jira/browse/AMBARI-12573

Workaround

Uninstall Jethro and Jethro Manager manually:

    # Log in to each Jethro host, and run:
    service jethro stop
    rpm -e jethro

    # Log in to the Jethro Manager host, and run:
    rpm -e jethromng
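
When Jethro was deployed to several hosts, the manual commands above have to be repeated on each of them. The following is a minimal sketch that only prints the commands to run. The function name is hypothetical, and it assumes root SSH access to the hosts. Unlike the manual steps, it chains the stop and the removal with &&, so the RPM is removed only if the service stopped successfully.

```shell
# Hypothetical helper: print the manual uninstall commands, one per host.
# First argument: the Jethro Manager host. Remaining arguments: Jethro hosts.
print_uninstall_commands() (
  manager_host="$1"
  shift
  for host in "$@"; do
    printf 'ssh root@%s "service jethro stop && rpm -e jethro"\n' "$host"
  done
  printf 'ssh root@%s "rpm -e jethromng"\n' "$manager_host"
)
```

For example, `print_uninstall_commands node1 node1 node2` (placeholder host names) prints two Jethro removal commands followed by one Jethro Manager removal command, which can be reviewed before being run.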

See Also

Jethro Manager