Thursday, June 11, 2020

HDFS Admin Commands:



[hdfs@sandbox-hdp ~]$ hdfs cacheadmin
Usage: bin/hdfs cacheadmin [COMMAND]
          [-addDirective -path <path> -pool <pool-name> [-force] [-replication <replication>] [-ttl <time-to-live>]]
          [-modifyDirective -id <id> [-path <path>] [-force] [-replication <replication>] [-pool <pool-name>] [-ttl <time-to-live>]]
          [-listDirectives [-stats] [-path <path>] [-pool <pool>] [-id <id>]]
          [-removeDirective <id>]
          [-removeDirectives -path <path>]
          [-addPool <name> [-owner <owner>] [-group <group>] [-mode <mode>] [-limit <limit>] [-defaultReplication <defaultReplication>] [-maxTtl <maxTtl>]]
          [-modifyPool <name> [-owner <owner>] [-group <group>] [-mode <mode>] [-limit <limit>] [-defaultReplication <defaultReplication>] [-maxTtl <maxTtl>]]
          [-removePool <name>]
          [-listPools [-stats] [<name>]]
          [-help <command-name>]

Generic options supported are:
-conf <configuration file>        specify an application configuration file
-D <property=value>               define a value for a given property
-fs <file:///|hdfs://namenode:port> specify default filesystem URL to use, overrides 'fs.defaultFS' property from configurations.
-jt <local|resourcemanager:port>  specify a ResourceManager
-files <file1,...>                specify a comma-separated list of files to be copied to the map reduce cluster
-libjars <jar1,...>               specify a comma-separated list of jar files to be included in the classpath
-archives <archive1,...>          specify a comma-separated list of archives to be unarchived on the compute machines

The general command line syntax is:
command [genericOptions] [commandOptions]
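
For example, a minimal sequence to cache a directory (the pool and path names here are illustrative, not taken from the output above):

hdfs cacheadmin -addPool testpool
hdfs cacheadmin -addDirective -path /data/hot -pool testpool -replication 2
hdfs cacheadmin -listDirectives -stats -pool testpool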



[hdfs@sandbox-hdp ~]$ hdfs crypto
Usage: bin/hdfs crypto [COMMAND]
          [-createZone -keyName <keyName> -path <path>]
          [-listZones]
          [-provisionTrash -path <path>]
          [-getFileEncryptionInfo -path <path>]
          [-reencryptZone <action> -path <zone>]
          [-listReencryptionStatus]
          [-help <command-name>]

Generic options supported are:
-conf <configuration file>        specify an application configuration file
-D <property=value>               define a value for a given property
-fs <file:///|hdfs://namenode:port> specify default filesystem URL to use, overrides 'fs.defaultFS' property from configurations.
-jt <local|resourcemanager:port>  specify a ResourceManager
-files <file1,...>                specify a comma-separated list of files to be copied to the map reduce cluster
-libjars <jar1,...>               specify a comma-separated list of jar files to be included in the classpath
-archives <archive1,...>          specify a comma-separated list of archives to be unarchived on the compute machines

The general command line syntax is:
command [genericOptions] [commandOptions]
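
For example, to create an encryption zone (assuming a KMS is configured and an encryption key named "mykey" is created first; the key and path names are illustrative, and the target directory must exist and be empty):

hadoop key create mykey
hdfs dfs -mkdir /secure
hdfs crypto -createZone -keyName mykey -path /secure
hdfs crypto -listZones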



[hdfs@sandbox-hdp ~]$ hdfs debug
Usage: hdfs debug <command> [arguments]

These commands are for advanced users only.

Incorrect usages may result in data loss. Use at your own risk.

verifyMeta -meta <metadata-file> [-block <block-file>]
computeMeta -block <block-file> -out <output-metadata-file>
recoverLease -path <path> [-retries <num-retries>]
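
For example, to force lease recovery on a file that is stuck open for write (the path is illustrative):

hdfs debug recoverLease -path /data/app/stuck.log -retries 3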



[hdfs@sandbox-hdp ~]$ hdfs dfsadmin
Usage: hdfs dfsadmin
Note: Administrative commands can only be run as the HDFS superuser.
        [-report [-live] [-dead] [-decommissioning] [-enteringmaintenance] [-inmaintenance]]
        [-safemode <enter | leave | get | wait>]
        [-saveNamespace [-beforeShutdown]]
        [-rollEdits]
        [-restoreFailedStorage true|false|check]
        [-refreshNodes]
        [-setQuota <quota> <dirname>...<dirname>]
        [-clrQuota <dirname>...<dirname>]
        [-setSpaceQuota <quota> [-storageType <storagetype>] <dirname>...<dirname>]
        [-clrSpaceQuota [-storageType <storagetype>] <dirname>...<dirname>]
        [-finalizeUpgrade]
        [-rollingUpgrade [<query|prepare|finalize>]]
        [-upgrade <query | finalize>]
        [-refreshServiceAcl]
        [-refreshUserToGroupsMappings]
        [-refreshSuperUserGroupsConfiguration]
        [-refreshCallQueue]
        [-refresh <host:ipc_port> <key> [arg1..argn]
        [-reconfig <namenode|datanode> <host:ipc_port> <start|status|properties>]
        [-printTopology]
        [-refreshNamenodes datanode_host:ipc_port]
        [-getVolumeReport datanode_host:ipc_port]
        [-deleteBlockPool datanode_host:ipc_port blockpoolId [force]]
        [-setBalancerBandwidth <bandwidth in bytes per second>]
        [-getBalancerBandwidth <datanode_host:ipc_port>]
        [-fetchImage <local directory>]
        [-allowSnapshot <snapshotDir>]
        [-disallowSnapshot <snapshotDir>]
        [-shutdownDatanode <datanode_host:ipc_port> [upgrade]]
        [-evictWriters <datanode_host:ipc_port>]
        [-getDatanodeInfo <datanode_host:ipc_port>]
        [-metasave filename]
        [-triggerBlockReport [-incremental] <datanode_host:ipc_port>]
        [-listOpenFiles [-blockingDecommission] [-path <path>]]
        [-help [cmd]]

Generic options supported are:
-conf <configuration file>        specify an application configuration file
-D <property=value>               define a value for a given property
-fs <file:///|hdfs://namenode:port> specify default filesystem URL to use, overrides 'fs.defaultFS' property from configurations.
-jt <local|resourcemanager:port>  specify a ResourceManager
-files <file1,...>                specify a comma-separated list of files to be copied to the map reduce cluster
-libjars <jar1,...>               specify a comma-separated list of jar files to be included in the classpath
-archives <archive1,...>          specify a comma-separated list of archives to be unarchived on the compute machines

The general command line syntax is:
command [genericOptions] [commandOptions]
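
A few common examples (the directory and quota value are illustrative):

hdfs dfsadmin -report -live
hdfs dfsadmin -safemode get
hdfs dfsadmin -setSpaceQuota 10g /user/project1
hdfs dfsadmin -printTopology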



[hdfs@sandbox-hdp ~]$ hdfs dfsrouteradmin
Not enough parameters specified
Federation Admin Tools:
        [-add <source> <nameservice1, nameservice2, ...> <destination> [-readonly] [-order HASH|LOCAL|RANDOM|HASH_ALL] -owner <owner> -group <group> -mode <mode>]
        [-update <source> <nameservice1, nameservice2, ...> <destination> [-readonly] [-order HASH|LOCAL|RANDOM|HASH_ALL] -owner <owner> -group <group> -mode <mode>]
        [-rm <source>]
        [-ls <path>]
        [-setQuota <path> -nsQuota <nsQuota> -ssQuota <quota in bytes or quota size string>]
        [-clrQuota <path>]
        [-safemode enter | leave | get]
        [-nameservice enable | disable <nameservice>]
        [-getDisabledNameservices]
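
For example, in a Router-based federation setup, to add and list a mount table entry (the nameservice, paths, owner and group are illustrative):

hdfs dfsrouteradmin -add /data ns1 /data -order HASH -owner hdfs -group hadoop -mode 755
hdfs dfsrouteradmin -ls /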



[hdfs@sandbox-hdp ~]$ hdfs ec
Usage: bin/hdfs ec [COMMAND]
          [-listPolicies]
          [-addPolicies -policyFile <file>]
          [-getPolicy -path <path>]
          [-removePolicy -policy <policy>]
          [-setPolicy -path <path> [-policy <policy>] [-replicate]]
          [-unsetPolicy -path <path>]
          [-listCodecs]
          [-enablePolicy -policy <policy>]
          [-disablePolicy -policy <policy>]
          [-help <command-name>]

Generic options supported are:
-conf <configuration file>        specify an application configuration file
-D <property=value>               define a value for a given property
-fs <file:///|hdfs://namenode:port> specify default filesystem URL to use, overrides 'fs.defaultFS' property from configurations.
-jt <local|resourcemanager:port>  specify a ResourceManager
-files <file1,...>                specify a comma-separated list of files to be copied to the map reduce cluster
-libjars <jar1,...>               specify a comma-separated list of jar files to be included in the classpath
-archives <archive1,...>          specify a comma-separated list of archives to be unarchived on the compute machines

The general command line syntax is:
command [genericOptions] [commandOptions]
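
For example, to enable the built-in RS-6-3-1024k policy and apply it to a directory (the directory is illustrative):

hdfs ec -listPolicies
hdfs ec -enablePolicy -policy RS-6-3-1024k
hdfs ec -setPolicy -path /data/archive -policy RS-6-3-1024k
hdfs ec -getPolicy -path /data/archive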



[hdfs@sandbox-hdp ~]$ hdfs fsck
Usage: hdfs fsck <path> [-list-corruptfileblocks | [-move | -delete | -openforwrite] [-files [-blocks [-locations | -racks | -replicaDetails | -upgradedomains]]]] [-includeSnapshots] [-showprogress] [-storagepolicies] [-maintenance] [-blockId <blk_Id>]
        <path>  start checking from this path
        -move   move corrupted files to /lost+found
        -delete delete corrupted files
        -files  print out files being checked
        -openforwrite   print out files opened for write
        -includeSnapshots       include snapshot data if the given path indicates a snapshottable directory or there are snapshottable directories under it
        -list-corruptfileblocks print out list of missing blocks and files they belong to
        -files -blocks  print out block report
        -files -blocks -locations       print out locations for every block
        -files -blocks -racks   print out network topology for data-node locations
        -files -blocks -replicaDetails  print out each replica details
        -files -blocks -upgradedomains  print out upgrade domains for every block
        -storagepolicies        print out storage policy summary for the blocks
        -maintenance    print out maintenance state node details
        -showprogress   show progress in output. Default is OFF (no progress)
        -blockId        print out which file this blockId belongs to, locations (nodes, racks) of this block, and other diagnostics info (under replicated, corrupted or not, etc)

Please Note:
        1. By default fsck ignores files opened for write, use -openforwrite to report such files. They are usually  tagged CORRUPT or HEALTHY depending on their block allocation status
        2. Option -includeSnapshots should not be used for comparing stats, should be used only for HEALTH check, as this may contain duplicates if the same file present in both original fs tree and inside snapshots.

Generic options supported are:
-conf <configuration file>        specify an application configuration file
-D <property=value>               define a value for a given property
-fs <file:///|hdfs://namenode:port> specify default filesystem URL to use, overrides 'fs.defaultFS' property from configurations.
-jt <local|resourcemanager:port>  specify a ResourceManager
-files <file1,...>                specify a comma-separated list of files to be copied to the map reduce cluster
-libjars <jar1,...>               specify a comma-separated list of jar files to be included in the classpath
-archives <archive1,...>          specify a comma-separated list of archives to be unarchived on the compute machines

The general command line syntax is:
command [genericOptions] [commandOptions]
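
Typical checks (the paths are illustrative):

hdfs fsck / -files -blocks -locations
hdfs fsck /user -list-corruptfileblocks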



[hdfs@sandbox-hdp ~]$ hdfs haadmin
Usage: haadmin [-ns <nameserviceId>]
    [-transitionToActive [--forceactive] <serviceId>]
    [-transitionToStandby <serviceId>]
    [-failover [--forcefence] [--forceactive] <serviceId> <serviceId>]
    [-getServiceState <serviceId>]
    [-getAllServiceState]
    [-checkHealth <serviceId>]
    [-help <command>]

Generic options supported are:
-conf <configuration file>        specify an application configuration file
-D <property=value>               define a value for a given property
-fs <file:///|hdfs://namenode:port> specify default filesystem URL to use, overrides 'fs.defaultFS' property from configurations.
-jt <local|resourcemanager:port>  specify a ResourceManager
-files <file1,...>                specify a comma-separated list of files to be copied to the map reduce cluster
-libjars <jar1,...>               specify a comma-separated list of jar files to be included in the classpath
-archives <archive1,...>          specify a comma-separated list of archives to be unarchived on the compute machines

The general command line syntax is:
command [genericOptions] [commandOptions]
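
For example, to check and switch the active NameNode (nn1 and nn2 are illustrative serviceIds, as defined by dfs.ha.namenodes.<nameservice> in hdfs-site.xml):

hdfs haadmin -getAllServiceState
hdfs haadmin -getServiceState nn1
hdfs haadmin -failover nn1 nn2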


[hdfs@sandbox-hdp ~]$ hdfs jmxget
init: server=localhost;port=;service=NameNode;localVMUrl=null

Domains:
        Domain = JMImplementation
        Domain = com.sun.management
        Domain = java.lang
        Domain = java.nio
        Domain = java.util.logging

MBeanServer default domain = DefaultDomain

MBean count = 22

Query MBeanServer MBeans:
List of all the available keys:
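
To query a specific daemon instead of the local VM, the server, port and service can be passed explicitly; the option names follow the init line above, and the host/port placeholders below depend on the JMX settings of the cluster:

hdfs jmxget -service NameNode -server <namenode_host> -port <jmx_port>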
[hdfs@sandbox-hdp ~]$ hdfs oev
Usage: bin/hdfs oev [OPTIONS] -i INPUT_FILE -o OUTPUT_FILE
Offline edits viewer
Parse a Hadoop edits log file INPUT_FILE and save results
in OUTPUT_FILE.
Required command line arguments:
-i,--inputFile <arg>   edits file to process, xml (case
                       insensitive) extension means XML format,
                       any other filename means binary format.
                       XML/Binary format input file is not allowed
                       to be processed by the same type processor.
-o,--outputFile <arg>  Name of output file. If the specified
                       file exists, it will be overwritten,
                       format of the file is determined
                       by -p option

Optional command line arguments:
-p,--processor <arg>   Select which type of processor to apply
                       against image file, currently supported
                       processors are: binary (native binary format
                       that Hadoop uses), xml (default, XML
                       format), stats (prints statistics about
                       edits file)
-h,--help              Display usage information and exit
-f,--fix-txids         Renumber the transaction IDs in the input,
                       so that there are no gaps or invalid
                       transaction IDs.
-r,--recover           When reading binary edit logs, use recovery
                       mode.  This will give you the chance to skip
                       corrupt parts of the edit log.
-v,--verbose           More verbose output, prints the input and
                       output filenames, for processors that write
                       to a file, also output to screen. On large
                       image files this will dramatically increase
                       processing time (default is false).


Generic options supported are:
-conf <configuration file>        specify an application configuration file
-D <property=value>               define a value for a given property
-fs <file:///|hdfs://namenode:port> specify default filesystem URL to use, overrides 'fs.defaultFS' property from configurations.
-jt <local|resourcemanager:port>  specify a ResourceManager
-files <file1,...>                specify a comma-separated list of files to be copied to the map reduce cluster
-libjars <jar1,...>               specify a comma-separated list of jar files to be included in the classpath
-archives <archive1,...>          specify a comma-separated list of archives to be unarchived on the compute machines

The general command line syntax is:
command [genericOptions] [commandOptions]
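
For example, to convert an edits segment to XML and back to binary (the file names are illustrative):

hdfs oev -i edits_0000000000000000001-0000000000000000042 -o edits.xml -p xml
hdfs oev -i edits.xml -o edits.out -p binary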



[hdfs@sandbox-hdp ~]$ hdfs oiv
Usage: bin/hdfs oiv [OPTIONS] -i INPUTFILE -o OUTPUTFILE
Offline Image Viewer
View a Hadoop fsimage INPUTFILE using the specified PROCESSOR,
saving the results in OUTPUTFILE.

The oiv utility will attempt to parse correctly formed image files
and will abort fail with mal-formed image files.

The tool works offline and does not require a running cluster in
order to process an image file.

The following image processors are available:
  * XML: This processor creates an XML document with all elements of
    the fsimage enumerated, suitable for further analysis by XML
    tools.
  * ReverseXML: This processor takes an XML file and creates a
    binary fsimage containing the same elements.
  * FileDistribution: This processor analyzes the file size
    distribution in the image.
    -maxSize specifies the range [0, maxSize] of file sizes to be
     analyzed (128GB by default).
    -step defines the granularity of the distribution. (2MB by default)
    -format formats the output result in a human-readable fashion
     rather than a number of bytes. (false by default)
  * Web: Run a viewer to expose read-only WebHDFS API.
    -addr specifies the address to listen. (localhost:5978 by default)
    It does not support secure mode nor HTTPS.
  * Delimited (experimental): Generate a text file with all of the elements common
    to both inodes and inodes-under-construction, separated by a
    delimiter. The default delimiter is \t, though this may be
    changed via the -delimiter argument.

Required command line arguments:
-i,--inputFile <arg>   FSImage or XML file to process.

Optional command line arguments:
-o,--outputFile <arg>  Name of output file. If the specified
                       file exists, it will be overwritten.
                       (output to stdout by default)
                       If the input file was an XML file, we
                       will also create an <outputFile>.md5 file.
-p,--processor <arg>   Select which type of processor to apply
                       against image file. (XML|FileDistribution|
                       ReverseXML|Web|Delimited)
                       The default is Web.
-delimiter <arg>       Delimiting string to use with Delimited processor.
-t,--temp <arg>        Use temporary dir to cache intermediate result to generate
                       Delimited outputs. If not set, Delimited processor constructs
                       the namespace in memory before outputting text.
-h,--help              Display usage information and exit
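
For example, to dump an fsimage to XML or to a delimited text file (the file names are illustrative; fsimage files live under the NameNode's dfs.namenode.name.dir):

hdfs oiv -p XML -i fsimage_0000000000000000042 -o fsimage.xml
hdfs oiv -p Delimited -delimiter "," -i fsimage_0000000000000000042 -o fsimage.csv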


[hdfs@sandbox-hdp ~]$ hdfs oiv_legacy
Usage: bin/hdfs oiv_legacy [OPTIONS] -i INPUTFILE -o OUTPUTFILE
Offline Image Viewer
View a Hadoop fsimage INPUTFILE using the specified PROCESSOR,
saving the results in OUTPUTFILE.

The oiv utility will attempt to parse correctly formed image files
and will abort fail with mal-formed image files.

The tool works offline and does not require a running cluster in
order to process an image file.

The following image processors are available:
  * Ls: The default image processor generates an lsr-style listing
    of the files in the namespace, with the same fields in the same
    order.  Note that in order to correctly determine file sizes,
    this formatter cannot skip blocks and will override the
    -skipBlocks option.
  * Indented: This processor enumerates over all of the elements in
    the fsimage file, using levels of indentation to delineate
    sections within the file.
  * Delimited: Generate a text file with all of the elements common
    to both inodes and inodes-under-construction, separated by a
    delimiter. The default delimiter is , though this may be
    changed via the -delimiter argument. This processor also overrides
    the -skipBlocks option for the same reason as the Ls processor
  * XML: This processor creates an XML document with all elements of
    the fsimage enumerated, suitable for further analysis by XML
    tools.
  * FileDistribution: This processor analyzes the file size
    distribution in the image.
    -maxSize specifies the range [0, maxSize] of file sizes to be
     analyzed (128GB by default).
    -step defines the granularity of the distribution. (2MB by default)
    -format formats the output result in a human-readable fashion
     rather than a number of bytes. (false by default)
  * NameDistribution: This processor analyzes the file names
    in the image and prints total number of file names and how frequently
    file names are reused.

Required command line arguments:
-i,--inputFile <arg>   FSImage file to process.
-o,--outputFile <arg>  Name of output file. If the specified
                       file exists, it will be overwritten.

Optional command line arguments:
-p,--processor <arg>   Select which type of processor to apply
                       against image file. (Ls|XML|Delimited|Indented|FileDistribution|NameDistribution).
-h,--help              Display usage information and exit
-printToScreen         For processors that write to a file, also
                       output to screen. On large image files this
                       will dramatically increase processing time.
-skipBlocks            Skip inodes' blocks information. May
                       significantly decrease output.
                       (default = false).
-delimiter <arg>       Delimiting string to use with Delimited processor


[hdfs@sandbox-hdp ~]$ hdfs storagepolicies
Usage: bin/hdfs storagepolicies [COMMAND]
          [-listPolicies]
          [-setStoragePolicy -path <path> -policy <policy>]
          [-getStoragePolicy -path <path>]
          [-unsetStoragePolicy -path <path>]
          [-help <command-name>]

Generic options supported are:
-conf <configuration file>        specify an application configuration file
-D <property=value>               define a value for a given property
-fs <file:///|hdfs://namenode:port> specify default filesystem URL to use, overrides 'fs.defaultFS' property from configurations.
-jt <local|resourcemanager:port>  specify a ResourceManager
-files <file1,...>                specify a comma-separated list of files to be copied to the map reduce cluster
-libjars <jar1,...>               specify a comma-separated list of jar files to be included in the classpath
-archives <archive1,...>          specify a comma-separated list of archives to be unarchived on the compute machines

The general command line syntax is:
command [genericOptions] [commandOptions]
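
For example, to assign the built-in COLD policy to a directory (the path is illustrative) and then move existing replicas to the matching storage with the mover utility:

hdfs storagepolicies -listPolicies
hdfs storagepolicies -setStoragePolicy -path /data/archive -policy COLD
hdfs storagepolicies -getStoragePolicy -path /data/archive
hdfs mover -p /data/archive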

[hdfs@sandbox-hdp ~]$

HDFS Commands



[hdfs@sandbox-hdp ~]$ hdfs
Usage: hdfs [OPTIONS] SUBCOMMAND [SUBCOMMAND OPTIONS]

  OPTIONS is none or any of:

--buildpaths                       attempt to add class files from build tree
--config dir                       Hadoop config directory
--daemon (start|status|stop)       operate on a daemon
--debug                            turn on shell script debug mode
--help                             usage information
--hostnames list[,of,host,names]   hosts to use in worker mode
--hosts filename                   list of hosts to use in worker mode
--loglevel level                   set the log4j level for this command
--workers                          turn on worker mode

  SUBCOMMAND is one of:


    Admin Commands:

cacheadmin           configure the HDFS cache
crypto               configure HDFS encryption zones
debug                run a Debug Admin to execute HDFS debug commands
dfsadmin             run a DFS admin client
dfsrouteradmin       manage Router-based federation
ec                   run a HDFS ErasureCoding CLI
fsck                 run a DFS filesystem checking utility
haadmin              run a DFS HA admin client
jmxget               get JMX exported values from NameNode or DataNode.
oev                  apply the offline edits viewer to an edits file
oiv                  apply the offline fsimage viewer to an fsimage
oiv_legacy           apply the offline fsimage viewer to a legacy fsimage
storagepolicies      list/get/set block storage policies

    Client Commands:

classpath            prints the class path needed to get the hadoop jar and the required libraries
dfs                  run a filesystem command on the file system
envvars              display computed Hadoop environment variables
fetchdt              fetch a delegation token from the NameNode
getconf              get config values from configuration
groups               get the groups which users belong to
lsSnapshottableDir   list all snapshottable dirs owned by the current user
snapshotDiff         diff two snapshots of a directory or diff the current directory contents with a snapshot
version              print the version

    Daemon Commands:

balancer             run a cluster balancing utility
datanode             run a DFS datanode
dfsrouter            run the DFS router
diskbalancer         Distributes data evenly among disks on a given node
httpfs               run HttpFS server, the HDFS HTTP Gateway
journalnode          run the DFS journalnode
mover                run a utility to move block replicas across storage types
namenode             run the DFS namenode
nfs3                 run an NFS version 3 gateway
portmap              run a portmap service
secondarynamenode    run the DFS secondary namenode
zkfc                 run the ZK Failover Controller daemon

SUBCOMMAND may print help when invoked w/o parameters or with -h.
[hdfs@sandbox-hdp ~]$
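
The OPTIONS above can be combined with any subcommand, for example:

hdfs --loglevel DEBUG dfsadmin -report -live
hdfs --daemon status datanode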




Ambari Commands








Ambari status
/sbin/service ambari-server status

Start Ambari
/sbin/service ambari-server start

Stop Ambari
/sbin/service ambari-server stop

Reset the Ambari admin password
ambari-admin-password-reset

Ambari LDAP sync
ambari-server sync-ldap --users users.txt --groups groups.txt

Ambari version
ambari-server --version

Ambari server hash value
ambari-server --hash

Take a backup of Ambari settings
ambari-server backup

Decommission a data node in HDP






From Ambari: select the host under Hosts, then use the Decommission action on its DataNode component.



Manually :
[root@namenode1 conf]# pwd
/etc/hadoop/conf
[root@namenode1 conf]# ll | grep dfs.exclude
-rw-r--r--. 1 hdfs hadoop     1 Jan 12 15:32 dfs.exclude
[root@namenode1 conf]# cat dfs.exclude

[root@namenode1 conf]# vi dfs.exclude
[root@namenode1 conf]# cat dfs.exclude
edgenode.hdp.cn
[root@namenode1 conf]# su - hdfs
Last login: Sun Jan 12 15:33:08 EST 2020
[hdfs@namenode1 ~]$ hdfs dfsadmin -refreshNodes
Refresh nodes successful

[hdfs@namenode1 ~]$
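
Decommission progress can then be watched from the NameNode, for example:

hdfs dfsadmin -report -decommissioning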





Friday, February 2, 2018

Disable Resource Manager HA


  1. Stop the YARN and ZooKeeper services from Ambari.
  2. From the Ambari server, export the current yarn-site configuration:
         /var/lib/ambari-server/resources/scripts/configs.py --user=admin --password=admin --host=edgenode.hdp.cn --cluster=hdpdev --action=get --config-type=yarn-site -f yarn-site.json
  3. In yarn-site.json, change the first property below (a) to "false" and remove the other properties (b-j):

                                       a) "yarn.resourcemanager.ha.enabled": "false",
                                  b) "yarn.resourcemanager.ha.rm-ids": "rm1,rm2",
                                  c) "yarn.resourcemanager.hostname.rm1": "datanode1.hdp.cn",
                                  d) "yarn.resourcemanager.hostname.rm2": "edgenode.hdp.cn",
                                  e)  "yarn.resourcemanager.webapp.address.rm1": "datanode1.hdp.cn:8088",
                                  f)  "yarn.resourcemanager.webapp.address.rm2": "edgenode.hdp.cn:8088",
                                 g) "yarn.resourcemanager.webapp.https.address.rm1": "datanode1.hdp.cn:8090",
                                 h) "yarn.resourcemanager.webapp.https.address.rm2": "edgenode.hdp.cn:8090",
                                 i) "yarn.resourcemanager.cluster-id": "yarn-cluster",
                                 j) "yarn.resourcemanager.ha.automatic-failover.zk-base-path": "/yarn-leader-election",

  4. Set the properties below to point to the remaining ResourceManager host (a sample yarn-site fragment is shown after this procedure):

         a) "yarn.resourcemanager.hostname":
         b) "yarn.resourcemanager.admin.address":
         c) "yarn.resourcemanager.webapp.address":
         d) "yarn.resourcemanager.resource-tracker.address":
         e) "yarn.resourcemanager.scheduler.address":
         f) "yarn.resourcemanager.webapp.https.address":
         g) "yarn.timeline-service.webapp.address":
         h) "yarn.timeline-service.webapp.https.address":
         i) "yarn.timeline-service.address":
         j) "yarn.log.server.url":

  5. Copy the edited yarn-site.json back to the Ambari server and run the command below to apply the changes (note --action=set):
         /var/lib/ambari-server/resources/scripts/configs.py --user=admin --password=admin --host=edgenode.hdp.cn --cluster=hdpdev --action=set --config-type=yarn-site -f yarn-site.json

  6. Delete the ResourceManager host component:
         curl --user admin:admin -i -H "X-Requested-By: ambari" -X DELETE http://edgenode.hdp.cn:8080/api/v1/clusters/hdpdev/hosts/edgenode.hdp.cn/host_components/RESOURCEMANAGER
                                                                                                                                                                                                        
  7. Start the ZooKeeper service from Ambari.

  8. On the ZooKeeper clients, run the commands below to check and change the znode permissions:
         /usr/hdp/current/zookeeper-client/bin/zkCli.sh getAcl /rmstore/ZKRMStateRoot
         /usr/hdp/current/zookeeper-client/bin/zkCli.sh setAcl /rmstore/ZKRMStateRoot world:anyone:rwcda

  9. From the Ambari UI, restart the ZooKeeper and YARN services.
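
A sample of the values for step 4, assuming datanode1.hdp.cn is the remaining ResourceManager and the default HDP ports are in use (adjust the hostnames and ports for your cluster):

         "yarn.resourcemanager.hostname": "datanode1.hdp.cn",
         "yarn.resourcemanager.admin.address": "datanode1.hdp.cn:8141",
         "yarn.resourcemanager.webapp.address": "datanode1.hdp.cn:8088",
         "yarn.resourcemanager.resource-tracker.address": "datanode1.hdp.cn:8025",
         "yarn.resourcemanager.scheduler.address": "datanode1.hdp.cn:8030",
         "yarn.resourcemanager.webapp.https.address": "datanode1.hdp.cn:8090",
         "yarn.timeline-service.webapp.address": "datanode1.hdp.cn:8188",
         "yarn.timeline-service.webapp.https.address": "datanode1.hdp.cn:8190",
         "yarn.timeline-service.address": "datanode1.hdp.cn:10200",
         "yarn.log.server.url": "http://datanode1.hdp.cn:19888/jobhistory/logs"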



Tuesday, January 16, 2018

Yarn Capacity Scheduler

1. Enable the Capacity Scheduler in /etc/hadoop/conf/yarn-site.xml:
       yarn.resourcemanager.scheduler.class=org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
2. Capacity Scheduler settings (/etc/hadoop/conf/capacity-scheduler.xml on the ResourceManager host):
  
   a) Queues are the fundamental unit of scheduling in YARN
               yarn.scheduler.capacity.root.queues=support,engineering,marketing          (parent queues)
               yarn.scheduler.capacity.root.support.queues=training,services              (leaf queues)
               yarn.scheduler.capacity.root.engineering.queues=development,qa             (leaf queues)
               yarn.scheduler.capacity.root.marketing.queues=sales,advertising            (leaf queues)
 
   b) Capacity Scheduler ACLs

               Queue users (* = all users and groups; a single space = deny all users and groups):
               yarn.scheduler.capacity.root.acl_submit_applications=
               yarn.scheduler.capacity.root.support.acl_submit_applications=sherlock,pacioli,njella hg_group,cd_group
               yarn.scheduler.capacity.root.engineering.acl_submit_applications=user1,user2,user3 group1,group2,group3
               yarn.scheduler.capacity.root.marketing.acl_submit_applications=user4,user2,user5 group4,group5,group6

               Queue administrators:
               yarn.scheduler.capacity.root.acl_administer_queue=
               yarn.scheduler.capacity.root.support.acl_administer_queue=support_admin_group
               yarn.scheduler.capacity.root.engineering.acl_administer_queue=engineering_admin_group
               yarn.scheduler.capacity.root.marketing.acl_administer_queue=engineering_admin_group

   c) Queue mappings:

               Setting up queue mappings:
                        yarn.scheduler.capacity.queue-mappings=u:njella:engineering,u:chitra:marketing,g:engineering_group:engineering
               If the group name is the same as the queue name:
                        yarn.scheduler.capacity.queue-mappings=u:%user:%primary_group
               If the user name is the same as the queue name:
                        yarn.scheduler.capacity.queue-mappings=u:%user:%user
               Enable queue mapping override:
                        yarn.scheduler.capacity.queue-mappings-override.enable=true
               Example:
                        <property>
                                <name>yarn.scheduler.capacity.queue-mappings</name>
                                <value>u:njella:engineering, g:webadmins:weblog</value>
                        </property>

               With queue-mappings-override.enable set to true, if user "njella" explicitly submits an application to the marketing queue, the specified queue is overridden by the mapping and the application is placed in the engineering queue instead.
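
A minimal capacity-scheduler.xml sketch for the parent queues above, assuming an illustrative 40/40/20 capacity split (the leaf queues, ACLs and mappings from sections a-c would be added as further properties in the same way):

               <property>
                       <name>yarn.scheduler.capacity.root.queues</name>
                       <value>support,engineering,marketing</value>
               </property>
               <property>
                       <name>yarn.scheduler.capacity.root.support.capacity</name>
                       <value>40</value>
               </property>
               <property>
                       <name>yarn.scheduler.capacity.root.engineering.capacity</name>
                       <value>40</value>
               </property>
               <property>
                       <name>yarn.scheduler.capacity.root.marketing.capacity</name>
                       <value>20</value>
               </property>

After editing capacity-scheduler.xml, the queues can be refreshed without restarting the ResourceManager:

               yarn rmadmin -refreshQueues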

                     

Thursday, January 4, 2018

Issues



1. Ambari-agent



Solution: Change verify=platform_default to verify=disable in the /etc/python/cert-verification.cfg file:

  sed -i 's/verify=platform_default/verify=disable/' /etc/python/cert-verification.cfg

-------------------------------------------------------------------------------------