Tuesday, January 16, 2018

Yarn Capacity Scheduler

1. /etc/hadoop/conf/yarn-site.xml
               yarn.resourcemanager.scheduler.class=org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler

2. Capacity Scheduler settings (/etc/hadoop/conf/capacity-scheduler.xml on the ResourceManager host)
  
   a) Queues are the fundamental unit of scheduling in YARN. Each queues property is keyed by the full path of its parent queue, starting at root.
               yarn.scheduler.capacity.root.queues=support,engineering,marketing     (parent queues)
                       yarn.scheduler.capacity.root.support.queues=training,services                (leaf queues)
                       yarn.scheduler.capacity.root.engineering.queues=development,qa               (leaf queues)
                       yarn.scheduler.capacity.root.marketing.queues=sales,advertising              (leaf queues)
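
As a rough illustration of how queue names combine into full queue paths, the sketch below prints the path of every child queue for a given parent. The helper name queue_paths is hypothetical and not part of Hadoop; the queue names are the ones used in the example above.

```shell
#!/bin/sh
# queue_paths PARENT CHILDREN
# Hypothetical helper (not part of Hadoop): prints one full queue path per
# child, given the parent path and the comma-separated list of children.
queue_paths() {
    parent=$1
    # Split the list on commas and prefix each child with the parent path.
    printf '%s\n' "$2" | tr ',' '\n' | while read -r child; do
        if [ -n "$child" ]; then
            printf '%s.%s\n' "$parent" "$child"
        fi
    done
}

queue_paths root "support,engineering,marketing"
queue_paths root.support "training,services"
```

The printed paths (for example root.support.training) are the names that per-queue properties in capacity-scheduler.xml are keyed on.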
 
             b) Capacity Scheduler ACLs

                         Queue users:
                         Each value is a comma-separated list of users, then a space, then a comma-separated list of groups.
                         A child queue inherits the ACLs of its parent, so root has to be restricted for the per-queue lists to take effect.
                         yarn.scheduler.capacity.root.acl_submit_applications=      (asterisk == all users and groups; a single space == deny all users and groups)
                                  yarn.scheduler.capacity.root.support.acl_submit_applications=sherlock,pacioli,njella hg_group,cd_group
                                  yarn.scheduler.capacity.root.engineering.acl_submit_applications=user1,user2,user3 group1,group2,group3
                                  yarn.scheduler.capacity.root.marketing.acl_submit_applications=user4,user2,user5 group4,group5,group6
               Queue administrators:
                         yarn.scheduler.capacity.root.acl_administer_queue=
                                  yarn.scheduler.capacity.root.support.acl_administer_queue= support_admin_group
                                  yarn.scheduler.capacity.root.engineering.acl_administer_queue= engineering_admin_group
                                  yarn.scheduler.capacity.root.marketing.acl_administer_queue= engineering_Admin_grp
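
To make the ACL value format concrete, here is a minimal sketch that checks a user and group against a single ACL string. The function acl_allows is a hypothetical illustration, not a Hadoop tool, and it looks at one queue only: it does not model the inheritance of ACLs from parent queues.

```shell
#!/bin/sh
# acl_allows ACL USER GROUP
# Hypothetical helper (not part of Hadoop): returns 0 when USER or GROUP is
# permitted by an ACL string of the form "user1,user2 group1,group2".
acl_allows() {
    acl=$1
    user=$2
    group=$3
    # An asterisk allows every user and group.
    if [ "$acl" = "*" ]; then
        return 0
    fi
    # Users are everything before the first space, groups everything after it.
    users=${acl%% *}
    case "$acl" in
        *" "*) grps=${acl#* } ;;
        *)     grps="" ;;
    esac
    if [ -n "$user" ]; then
        case ",$users," in *",$user,"*) return 0 ;; esac
    fi
    if [ -n "$group" ]; then
        case ",$grps," in *",$group,"*) return 0 ;; esac
    fi
    return 1
}

if acl_allows "sherlock,pacioli,njella hg_group,cd_group" njella other_group; then
    echo "njella may submit"
fi
if ! acl_allows " " njella hg_group; then
    echo "a single space denies everyone"
fi
```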

   c) Queue Mappings:

               Setting up queue mappings
                         yarn.scheduler.capacity.queue-mappings=u:njella:engineering,u:chitra:marketing,g:engineering_group:engineering
                         If the user's primary group name is the same as the queue name:
                         yarn.scheduler.capacity.queue-mappings=u:%user:%primary_group
                         If the user name is the same as the queue name:
                         yarn.scheduler.capacity.queue-mappings=u:%user:%user
               Enable queue mapping override:
                         yarn.scheduler.capacity.queue-mappings-override.enable=true
                         Example:
                                  <property>
                                          <name>yarn.scheduler.capacity.queue-mappings</name>
                                          <value>u:njella:engineering,g:webadmins:weblog</value>
                                  </property>

                                  If user "njella" explicitly submits an application to the marketing queue, the default queue assignment of engineering is overridden and the application is submitted to the marketing queue.

                     

Thursday, January 4, 2018

Issues



1. Ambari-agent



Sol: Change verify=platform_default to verify=disable in the /etc/python/cert-verification.cfg file.

  1. sed -i 's/verify=platform_default/verify=disable/' /etc/python/cert-verification.cfg
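
A slightly more cautious variant of the same edit keeps a backup copy first. This is only a sketch, assuming GNU sed (for the -i flag); the function name disable_cert_verify is made up for illustration.

```shell
#!/bin/sh
# disable_cert_verify FILE
# Backs up FILE to FILE.bak, switches verify=platform_default to
# verify=disable, then prints the resulting verify line.
disable_cert_verify() {
    cfg=$1
    cp -p "$cfg" "$cfg.bak" || return 1
    sed -i 's/verify=platform_default/verify=disable/' "$cfg" || return 1
    grep '^verify=' "$cfg"
}

# Usage on an affected host (run as root):
#   disable_cert_verify /etc/python/cert-verification.cfg
```

If the change needs to be rolled back, the .bak copy can be moved back into place.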

-------------------------------------------------------------------------------------











Tuesday, October 17, 2017

How to Synchronize the system clock to Network Time Protocol (NTP) in Linux.




1.  If the ntpd service is not installed and enabled, use the commands below to install and enable it.

   a) yum install ntp
   b) systemctl enable ntpd

 

2. Modify the server entries in the /etc/ntp.conf file. If you are using CentOS 7, also modify /etc/ntp/step-tickers.

        You can find the preferred servers for your location at http://www.pool.ntp.org/




3. Start the ntpd service

    systemctl restart ntpd
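
For step 2, one possible way to script the server change is sketched below. The function set_ntp_servers is hypothetical, and the pool hostnames in the usage lines are placeholders; pick the ones that pool.ntp.org recommends for your region.

```shell
#!/bin/sh
# set_ntp_servers CONF SERVER...
# Hypothetical helper: backs up CONF to CONF.bak, drops the existing
# "server" lines, and appends one "server NAME iburst" line per argument.
set_ntp_servers() {
    conf=$1
    shift
    cp -p "$conf" "$conf.bak" || return 1
    # Keep every line except the old server entries.
    grep -v '^server ' "$conf.bak" > "$conf"
    for s in "$@"; do
        printf 'server %s iburst\n' "$s" >> "$conf"
    done
}

# Usage (run as root):
#   set_ntp_servers /etc/ntp.conf 0.pool.ntp.org 1.pool.ntp.org
#   systemctl restart ntpd
#   ntpq -p        # lists the peers ntpd is talking to
```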

     





Sunday, October 2, 2016

Configuring Network for CentOS Virtual Machine

1.




2.





4.  echo "192.168.40.10 namenode1.mysandbox.com" >> /etc/hosts



5.  echo "nameserver 192.168.40.2" >> /etc/resolv.conf
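
Running the echo commands in steps 4 and 5 twice leaves duplicate lines behind. A small sketch of an append that is safe to repeat follows; add_line_once is a hypothetical helper name.

```shell
#!/bin/sh
# add_line_once LINE FILE
# Appends LINE to FILE only when FILE does not already contain that exact line.
add_line_once() {
    line=$1
    file=$2
    # -x matches whole lines only, -F treats LINE as a literal string.
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Usage (run as root), with the addresses from the steps above:
#   add_line_once "192.168.40.10 namenode1.mysandbox.com" /etc/hosts
#   add_line_once "nameserver 192.168.40.2" /etc/resolv.conf
```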


6. [root@localhost ~]# vi /etc/selinux/config




Tuesday, December 1, 2015

Cassandra cqlsh




1. cqlsh is a command-line utility for issuing query statements to Cassandra or altering schemas in Cassandra.
                    install/bin/cqlsh

2. Some of the options that you can pass to the cqlsh command are                  


  
3. cqlsh has some commands that are not part of the Cassandra Query Language.
       Below are some of these commands.

4. Copying data between a specified table and a CSV file
       

5. Default keyspaces in Cassandra:
             a) system_traces
             b) system
          


6.  cqlsh commands and CQL commands
     






Monday, November 30, 2015

Cassandra Nodetool


Introduction to Cassandra Nodetool:

1. Nodetool is the command-line utility for managing a Cassandra cluster.
             /install/bin/nodetool

2. To connect to a node other than the one you are currently on, use the command below.
           $ bin/nodetool -h 'hostname' -p 'jmx_port' [command] [options]
               > jmx_port is configured in cassandra-env.sh
               > the default JMX port is 7199.
          Example:  
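
One way to avoid retyping the host and port options is a small wrapper, sketched here. The function remote_nodetool and the JMX_PORT variable are hypothetical names, and the host in the usage lines is a placeholder.

```shell
#!/bin/sh
# remote_nodetool HOST COMMAND [OPTIONS...]
# Hypothetical wrapper: runs nodetool against HOST, using the JMX port from
# the JMX_PORT environment variable, or 7199 when it is unset.
remote_nodetool() {
    host=$1
    shift
    nodetool -h "$host" -p "${JMX_PORT:-7199}" "$@"
}

# Usage (assumes nodetool is on PATH and the host is reachable):
#   remote_nodetool node2.example.com status
#   remote_nodetool node2.example.com info
```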


3. Nodetool supports over 60 commands including:
         > status
         > info
         > ring 
            Example:
4. Sample output of the nodetool info command.



5. Additional nodetool commands
    















Cassandra



Installing, Configuring and Running  Cassandra locally

1.  Prepare the Operating System
            a)  Install the latest Java 7
            b)  Configure JAVA_HOME
            c)  Install the JNA (Java Native Access) libraries
            d)  Synchronize clocks on each node by using NTP.
            e)  Disable swap      (sudo swapoff --all)
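
Two of these preparation steps can be checked from a script. The sketch below assumes a Linux host, where /proc/swaps lists active swap devices; the function names swap_active and java_home_ok are made up for illustration.

```shell
#!/bin/sh
# swap_active [FILE]
# Returns 0 when FILE (default /proc/swaps) lists at least one swap device.
# The first line of /proc/swaps is a header, so any further line is a device.
swap_active() {
    swaps=${1:-/proc/swaps}
    [ "$(wc -l < "$swaps")" -gt 1 ]
}

# java_home_ok
# Returns 0 when JAVA_HOME points at a directory with an executable bin/java.
java_home_ok() {
    [ -n "${JAVA_HOME:-}" ] && [ -x "$JAVA_HOME/bin/java" ]
}

# Usage:
#   swap_active && echo "swap is still on; run: sudo swapoff --all"
#   java_home_ok || echo "JAVA_HOME is not configured"
```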
         

 2. Select and install a Cassandra distribution.
         There are three distributions:
            a)  Cassandra open source
            b)  DSE (DataStax Enterprise)
            c)  DSC (DataStax Community)

                Directory Structure after you install Cassandra

   3. Configure Cassandra for a single node
          Configuration files include
             a) cassandra.yaml
                
             b) cassandra-env.sh
                        
             c)  logback.xml
                      
             d) cassandra-rackdc.properties
             e) cassandra-topology.properties
             f) bin/cassandra-in.sh

4. Start and stop the Cassandra instance.

                a) Starting the instance

                b)  Stopping the instance
                       

            c) System logs



         Summary