Monday, November 30, 2015

Cassandra Nodetool


Introduction to Cassandra Nodetool:

1. Nodetool is the command-line utility for managing a Cassandra cluster.
             /install/bin/nodetool

2. To connect to a node other than the one you are currently on, use the command below.
           $ bin/nodetool -h 'hostname' -p 'jmx_port' [command] [options]
               > jmx_port is configured in cassandra-env.sh
               > The default JMX port is 7199.
          Example:  
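As a minimal sketch of the remote invocation above, the helper below only assembles and prints the command line, so it is safe to run on a machine without Cassandra. The function name nodetool_cmd, the NODETOOL variable and the host 10.0.0.12 are assumptions made for illustration; only the -h and -p options and the default port 7199 come from the text above.

```shell
# Path to nodetool relative to the install directory (assumed layout).
NODETOOL="${NODETOOL:-bin/nodetool}"

# Print the nodetool command line for a remote node.
# $1 = hostname, $2 = JMX port, $3 = nodetool command (status, info, ring, ...)
nodetool_cmd() {
    printf '%s -h %s -p %s %s\n' "$NODETOOL" "$1" "$2" "$3"
}

# Hypothetical node address, default JMX port.
nodetool_cmd 10.0.0.12 7199 status
```

To actually run the command, paste the printed line into a terminal opened in the Cassandra install directory.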


3. Nodetool supports over 60 commands including:
         > status
         > info
         > ring 
            Example:
4. Sample output of the nodetool info command.



5. Additional nodetool commands
    















Cassandra



Installing, Configuring and Running  Cassandra locally

1.  Prepare the operating system
            a)  Install the latest Java 7
            b)  Configure JAVA_HOME
            c)  Install the JNA (Java Native Access) libraries
            d)  Synchronize the clocks on each node by using NTP
            e)  Disable swap      (sudo swapoff --all)
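
A couple of the preparation steps above can be checked from a script before installing anything. This is a rough sketch under assumptions: the function names check_java_home and swap_active are made up for this example, and the swap check reads /proc/swaps, so it only works on Linux.

```shell
# Report whether JAVA_HOME points at a directory containing bin/java.
check_java_home() {
    if [ -n "${JAVA_HOME:-}" ] && [ -x "$JAVA_HOME/bin/java" ]; then
        echo "JAVA_HOME ok: $JAVA_HOME"
    else
        echo "JAVA_HOME is not set or has no bin/java"
    fi
}

# Succeed if at least one swap area is active (Linux only).
# /proc/swaps starts with a header line; every further line is a swap area.
swap_active() {
    [ -r /proc/swaps ] && [ "$(wc -l < /proc/swaps)" -gt 1 ]
}

check_java_home
if swap_active; then
    echo "swap is enabled; disable it with: sudo swapoff --all"
else
    echo "no active swap detected"
fi
```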
         

 2. Select and install a Cassandra distribution.
         There are three distributions:
            a)  Apache Cassandra (open source)
            b)  DSE (DataStax Enterprise)
            c)  DSC (DataStax Community)

                Directory Structure after you install Cassandra

   3. Configure Cassandra for a single node
          Configuration files include:
             a) cassandra.yaml
                
             b) cassandra-env.sh
                        
             c)  logback.xml
                      
             d) cassandra-rackdc.properties
             e) cassandra-topology.properties
             f) bin/cassandra-in.sh

4. Start and stop the Cassandra instance.

                a) Starting the instance

                b) Stopping the instance

                c) System logs
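
The three steps above could look roughly like this for a tarball install. This is a sketch, not the exact commands from the original screenshots: CASSANDRA_HOME, PID_FILE and the function names are assumptions, and the log path logs/system.log applies to tarball installs (package installs usually log under /var/log/cassandra).

```shell
# Assumed install location and pid file; adjust to your setup.
CASSANDRA_HOME="${CASSANDRA_HOME:-$HOME/cassandra}"
PID_FILE="${PID_FILE:-$CASSANDRA_HOME/cassandra.pid}"

# a) Start Cassandra in the background; -p records the process id in a file.
start_cassandra() {
    "$CASSANDRA_HOME/bin/cassandra" -p "$PID_FILE"
}

# b) Stop the instance by sending SIGTERM to the recorded process id.
stop_cassandra() {
    if [ -f "$PID_FILE" ]; then
        kill "$(cat "$PID_FILE")" && rm -f "$PID_FILE"
    else
        echo "no pid file at $PID_FILE" >&2
        return 1
    fi
}

# c) Show the last lines of the system log.
show_log() {
    tail -n 20 "$CASSANDRA_HOME/logs/system.log"
}

echo "defined: start_cassandra, stop_cassandra, show_log"
```

To keep Cassandra in the foreground instead, run bin/cassandra -f and stop it with Ctrl-C.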



         Summary






Saturday, November 28, 2015

Key Features And Benefits Of Cassandra


 Cassandra provides the following features and benefits.

  1. Massively scalable architecture
  2. Active everywhere design
  3. Linearly scalable performance
  4. Continuous availability
  5. Transparent fault detection and recovery
  6. Flexible and dynamic data model
  7. Strong data protection
  8. Tunable data consistency
  9. Multi-data center replication
  10. Data compression
  11. CQL

Tuesday, November 3, 2015

Hive Basics for Beginners #2



11. Loading multi-delimiter data into a Hive table
       There are four steps that I follow to load this kind of data:
       a)  Creating a single-column table in Hive
           hive> create table multi_temp(content String);         


        b) Loading data from the local file system into the single-column Hive table.




       c) Creating the desired table in Hive
     

        d)  Loading the data from the single-column table into the desired table
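
The four steps above can be sketched as one HiveQL file written from the shell. Everything specific here is an assumption for illustration: the "||" delimiter, the input path /tmp/multi_delim.txt, and the table multi_final with its three columns. Only the staging table multi_temp comes from step a).

```shell
HQL_FILE="${HQL_FILE:-multi_delim.hql}"

# Write the four steps into a HiveQL file (quoted heredoc keeps backslashes as-is).
cat > "$HQL_FILE" <<'EOF'
-- a) single-column staging table
CREATE TABLE IF NOT EXISTS multi_temp (content STRING);

-- b) load the raw file from the local file system (hypothetical path)
LOAD DATA LOCAL INPATH '/tmp/multi_delim.txt' INTO TABLE multi_temp;

-- c) desired table (hypothetical columns)
CREATE TABLE IF NOT EXISTS multi_final (id INT, name STRING, city STRING);

-- d) split each line on the assumed "||" delimiter and fill the desired table
INSERT OVERWRITE TABLE multi_final
SELECT CAST(split(content, '\\|\\|')[0] AS INT),
       split(content, '\\|\\|')[1],
       split(content, '\\|\\|')[2]
FROM multi_temp;
EOF

# Run the file only where the hive client is installed.
if command -v hive >/dev/null 2>&1; then
    hive -f "$HQL_FILE"
else
    echo "hive not found; statements written to $HQL_FILE"
fi
```

split() takes a regular expression, which is why each pipe character is escaped.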
         

12. Loading XML data into a Hive table.
  We can use the same four-step approach that we used for the multi-delimiter data; the first three steps are the same.
4. Loading data from the single-column table to the desired table.
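
One way to do this fourth step is with Hive's built-in XPath functions. The sketch below assumes a staging table xml_temp(content STRING) that holds one complete <employee> record per row, and a target table employee(id, name). Both tables and the XML layout are hypothetical, not taken from the original post.

```shell
XML_HQL="${XML_HQL:-xml_load.hql}"

# Write the final step into a HiveQL file.
cat > "$XML_HQL" <<'EOF'
-- Desired table (hypothetical columns).
CREATE TABLE IF NOT EXISTS employee (id INT, name STRING);

-- Pull individual fields out of each XML record with XPath expressions.
INSERT OVERWRITE TABLE employee
SELECT xpath_int(content, 'employee/id'),
       xpath_string(content, 'employee/name')
FROM xml_temp;
EOF

echo "wrote $XML_HQL"
```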

   Until now we have run all the queries and operations in the Hive terminal. We can also do this by writing a script and executing it from the local terminal.

13. Loading nested XML data into a Hive table.





14. Creating and executing Hive scripts.
    a) Creating the Hive script
   


     b) Executing the Hive script.
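
As a small sketch of both steps, the snippet below writes a one-line script and runs it with hive -f from the local terminal. The file name sample.hql, its contents and the function name run_hive_script are assumptions for illustration.

```shell
SCRIPT="${SCRIPT:-sample.hql}"

# a) Create the script: any HiveQL statements, one per line, each ending in ";".
printf '%s\n' 'SHOW DATABASES;' > "$SCRIPT"

# b) Execute the script with the hive client if it is installed.
run_hive_script() {
    if command -v hive >/dev/null 2>&1; then
        hive -f "$1"
    else
        echo "hive not found; would run: hive -f $1"
    fi
}

run_hive_script "$SCRIPT"
```

For a single statement, hive -e 'SHOW DATABASES;' does the same without a script file.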