My Big Data: HDFS Admin Commands:

[hdfs@sandbox-hdp ~]$ hdfs cacheadmin
Usage: bin/hdfs cacheadmin [COMMAND]
[-addDirective -path <path> -pool <pool-name> [-force] [-replication <replication>] [-ttl <time-to-live>]]
[-modifyDirective -id <id> [-path <path>] [-force] [-replication <replication>] [-pool <pool-name>] [-ttl <time-to-live>]]
[-listDirectives [-stats] [-path <path>] [-pool <pool>] [-id <id>]]
[-removeDirective <id>]
[-removeDirectives -path <path>]
[-addPool <name> [-owner <owner>] [-group <group>] [-mode <mode>] [-limit <limit>] [-defaultReplication <defaultReplication>] [-maxTtl <maxTtl>]]
[-modifyPool <name> [-owner <owner>] [-group <group>] [-mode <mode>] [-limit <limit>] [-defaultReplication <defaultReplication>] [-maxTtl <maxTtl>]]
[-removePool <name>]
[-listPools [-stats] [<name>]]
[-help <command-name>]

Generic options supported are:
-conf <configuration file> specify an application configuration file
-D <property=value> define a value for a given property
-fs <file:///|hdfs://namenode:port> specify default filesystem URL to use, overrides 'fs.defaultFS' property from configurations.
-jt <local|resourcemanager:port> specify a ResourceManager
-files <file1,...> specify a comma-separated list of files to be copied to the map reduce cluster
-libjars <jar1,...> specify a comma-separated list of jar files to be included in the classpath
-archives <archive1,...> specify a comma-separated list of archives to be unarchived on the compute machines

The general command line syntax is:
command [genericOptions] [commandOptions]
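As a rough illustration of how the cacheadmin subcommands above fit together, the sketch below creates a cache pool, pins a directory into it, and lists the resulting directives. The pool name `reports_pool` and the path `/data/reports` are made-up values, and the `run` helper is only a dry-run convenience added for this example: it prints each command unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Dry-run helper (not part of HDFS): print the command unless DRY_RUN=0.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi; }

POOL="reports_pool"         # hypothetical cache pool name
CACHE_PATH="/data/reports"  # hypothetical HDFS directory to cache

cache_hot_path() {
  # Create the pool, then request 2 cached replicas of the path for 7 days.
  run hdfs cacheadmin -addPool "$POOL" -mode 0755
  run hdfs cacheadmin -addDirective -path "$CACHE_PATH" -pool "$POOL" -replication 2 -ttl 7d
  # Show the directives in the pool together with their statistics.
  run hdfs cacheadmin -listDirectives -stats -pool "$POOL"
}

cache_hot_path
```

Running it as-is only echoes the three commands, which makes it easy to review them before executing with `DRY_RUN=0` as the HDFS superuser.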

[hdfs@sandbox-hdp ~]$ hdfs crypto
Usage: bin/hdfs crypto [COMMAND]
[-createZone -keyName <keyName> -path <path>]
[-listZones]
[-provisionTrash -path <path>]
[-getFileEncryptionInfo -path <path>]
[-reencryptZone <action> -path <zone>]
[-listReencryptionStatus]
[-help <command-name>]

Generic options supported are:
-conf <configuration file> specify an application configuration file
-D <property=value> define a value for a given property
-fs <file:///|hdfs://namenode:port> specify default filesystem URL to use, overrides 'fs.defaultFS' property from configurations.
-jt <local|resourcemanager:port> specify a ResourceManager
-files <file1,...> specify a comma-separated list of files to be copied to the map reduce cluster
-libjars <jar1,...> specify a comma-separated list of jar files to be included in the classpath
-archives <archive1,...> specify a comma-separated list of archives to be unarchived on the compute machines

The general command line syntax is:
command [genericOptions] [commandOptions]
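The crypto usage above could be exercised roughly as follows. This is a sketch under assumptions: the key `demo_key` is presumed to already exist in the cluster's key provider, `/secure/demo` is a hypothetical path that must be an empty directory when the zone is created, and `run` is a dry-run helper invented for this example.

```shell
#!/bin/sh
# Dry-run helper (not part of HDFS): print the command unless DRY_RUN=0.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi; }

KEY_NAME="demo_key"   # hypothetical key, assumed to exist in the key provider
ZONE="/secure/demo"   # hypothetical directory that will become the zone

make_zone() {
  # The zone root has to exist and be empty before -createZone is called.
  run hdfs dfs -mkdir -p "$ZONE"
  run hdfs crypto -createZone -keyName "$KEY_NAME" -path "$ZONE"
  # Confirm the new zone shows up in the list.
  run hdfs crypto -listZones
}

make_zone
```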

[hdfs@sandbox-hdp ~]$ hdfs debug
Usage: hdfs debug <command> [arguments]

These commands are for advanced users only.

Incorrect usages may result in data loss. Use at your own risk.

verifyMeta -meta <metadata-file> [-block <block-file>]
computeMeta -block <block-file> -out <output-metadata-file>
recoverLease -path <path> [-retries <num-retries>]
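Of the debug commands, `recoverLease` is probably the one most often reached for, typically when a writer died and left a file open. A minimal sketch, assuming a hypothetical stuck file `/data/logs/app.log` and using a dry-run `run` helper that is not part of HDFS:

```shell
#!/bin/sh
# Dry-run helper (not part of HDFS): print the command unless DRY_RUN=0.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi; }

STUCK_FILE="/data/logs/app.log"   # hypothetical file left open by a dead writer

recover_stuck_file() {
  # Report files still open for write under the parent directory.
  run hdfs fsck "$(dirname "$STUCK_FILE")" -openforwrite -files
  # Ask the NameNode to recover the lease, retrying up to 3 times.
  run hdfs debug recoverLease -path "$STUCK_FILE" -retries 3
}

recover_stuck_file
```

Given the data-loss warning in the usage text, keeping the dry-run default and reading the printed commands first seems prudent.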

[hdfs@sandbox-hdp ~]$ hdfs dfsadmin
Usage: hdfs dfsadmin
Note: Administrative commands can only be run as the HDFS superuser.
[-report [-live] [-dead] [-decommissioning] [-enteringmaintenance] [-inmaintenance]]
[-safemode <enter | leave | get | wait>]
[-saveNamespace [-beforeShutdown]]
[-rollEdits]
[-restoreFailedStorage true|false|check]
[-refreshNodes]
[-setQuota <quota> <dirname>...<dirname>]
[-clrQuota <dirname>...<dirname>]
[-setSpaceQuota <quota> [-storageType <storagetype>] <dirname>...<dirname>]
[-clrSpaceQuota [-storageType <storagetype>] <dirname>...<dirname>]
[-finalizeUpgrade]
[-rollingUpgrade [<query|prepare|finalize>]]
[-upgrade <query | finalize>]
[-refreshServiceAcl]
[-refreshUserToGroupsMappings]
[-refreshSuperUserGroupsConfiguration]
[-refreshCallQueue]
[-refresh <host:ipc_port> <key> [arg1..argn]
[-reconfig <namenode|datanode> <host:ipc_port> <start|status|properties>]
[-printTopology]
[-refreshNamenodes datanode_host:ipc_port]
[-getVolumeReport datanode_host:ipc_port]
[-deleteBlockPool datanode_host:ipc_port blockpoolId [force]]
[-setBalancerBandwidth <bandwidth in bytes per second>]
[-getBalancerBandwidth <datanode_host:ipc_port>]
[-fetchImage <local directory>]
[-allowSnapshot <snapshotDir>]
[-disallowSnapshot <snapshotDir>]
[-shutdownDatanode <datanode_host:ipc_port> [upgrade]]
[-evictWriters <datanode_host:ipc_port>]
[-getDatanodeInfo <datanode_host:ipc_port>]
[-metasave filename]
[-triggerBlockReport [-incremental] <datanode_host:ipc_port>]
[-listOpenFiles [-blockingDecommission] [-path <path>]]
[-help [cmd]]

Generic options supported are:
-conf <configuration file> specify an application configuration file
-D <property=value> define a value for a given property
-fs <file:///|hdfs://namenode:port> specify default filesystem URL to use, overrides 'fs.defaultFS' property from configurations.
-jt <local|resourcemanager:port> specify a ResourceManager
-files <file1,...> specify a comma-separated list of files to be copied to the map reduce cluster
-libjars <jar1,...> specify a comma-separated list of jar files to be included in the classpath
-archives <archive1,...> specify a comma-separated list of archives to be unarchived on the compute machines

The general command line syntax is:
command [genericOptions] [commandOptions]
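One common way to combine the dfsadmin options above is a manual checkpoint: enter safe mode, save the namespace, then leave safe mode. The sketch below shows that sequence plus a live-node report. The `run` function is a dry-run helper added for illustration only, and the commands are assumed to be run as the HDFS superuser.

```shell
#!/bin/sh
# Dry-run helper (not part of HDFS): print the command unless DRY_RUN=0.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi; }

manual_checkpoint() {
  # -saveNamespace is only accepted while the NameNode is in safe mode.
  run hdfs dfsadmin -safemode enter
  run hdfs dfsadmin -saveNamespace
  run hdfs dfsadmin -safemode leave
  # Confirm the cluster is writable again and list the live DataNodes.
  run hdfs dfsadmin -safemode get
  run hdfs dfsadmin -report -live
}

manual_checkpoint
```

While safe mode is on the namespace is read-only, so on a busy cluster this is best done in a maintenance window.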

[hdfs@sandbox-hdp ~]$ hdfs dfsrouteradmin
Not enough parameters specified
Federation Admin Tools:
[-add <source> <nameservice1, nameservice2, ...> <destination> [-readonly] [-order HASH|LOCAL|RANDOM|HASH_ALL] -owner <owner> -group <group> -mode <mode>]
[-update <source> <nameservice1, nameservice2, ...> <destination> [-readonly] [-order HASH|LOCAL|RANDOM|HASH_ALL] -owner <owner> -group <group> -mode <mode>]
[-rm <source>]
[-ls <path>]
[-setQuota <path> -nsQuota <nsQuota> -ssQuota <quota in bytes or quota size string>]
[-clrQuota <path>]
[-safemode enter | leave | get]
[-nameservice enable | disable <nameservice>]
[-getDisabledNameservices]
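For a Router-based federation setup, the mount-table options listed above might be used along these lines. Everything specific here is an assumption: the nameservice `ns1`, the mount point `/projects`, the owner, group and mode values, and the dry-run `run` helper are all invented for the example, and a single-node sandbox will normally not have a Router running at all.

```shell
#!/bin/sh
# Dry-run helper (not part of HDFS): print the command unless DRY_RUN=0.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi; }

MOUNT="/projects"     # hypothetical mount point seen by clients
NAMESERVICE="ns1"     # hypothetical nameservice behind the Router
DEST="/projects"      # hypothetical path inside that nameservice

add_mount() {
  # Map the client-visible path to a path in one nameservice.
  run hdfs dfsrouteradmin -add "$MOUNT" "$NAMESERVICE" "$DEST" -owner hdfs -group hadoop -mode 755
  # List the mount table entries under the root to verify the new entry.
  run hdfs dfsrouteradmin -ls /
}

add_mount
```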

[hdfs@sandbox-hdp ~]$ hdfs ec
Usage: bin/hdfs ec [COMMAND]
[-listPolicies]
[-addPolicies -policyFile <file>]
[-getPolicy -path <path>]
[-removePolicy -policy <policy>]
[-setPolicy -path <path> [-policy <policy>] [-replicate]]
[-unsetPolicy -path <path>]
[-listCodecs]
[-enablePolicy -policy <policy>]
[-disablePolicy -policy <policy>]
[-help <command-name>]

Generic options supported are:
-conf <configuration file> specify an application configuration file
-D <property=value> define a value for a given property
-fs <file:///|hdfs://namenode:port> specify default filesystem URL to use, overrides 'fs.defaultFS' property from configurations.
-jt <local|resourcemanager:port> specify a ResourceManager
-files <file1,...> specify a comma-separated list of files to be copied to the map reduce cluster
-libjars <jar1,...> specify a comma-separated list of jar files to be included in the classpath
-archives <archive1,...> specify a comma-separated list of archives to be unarchived on the compute machines

The general command line syntax is:
command [genericOptions] [commandOptions]
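A typical erasure coding workflow with the options above is to enable a policy, apply it to a directory, and read it back. The sketch uses the built-in `RS-6-3-1024k` policy; the directory `/archive/cold` is hypothetical, and `run` is a dry-run helper added only for this example.

```shell
#!/bin/sh
# Dry-run helper (not part of HDFS): print the command unless DRY_RUN=0.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi; }

EC_POLICY="RS-6-3-1024k"   # built-in Reed-Solomon policy (6 data, 3 parity)
EC_DIR="/archive/cold"     # hypothetical directory for rarely read data

apply_ec_policy() {
  # A policy has to be enabled before it can be set on a path.
  run hdfs ec -enablePolicy -policy "$EC_POLICY"
  run hdfs ec -setPolicy -path "$EC_DIR" -policy "$EC_POLICY"
  # Read the policy back to confirm it took effect.
  run hdfs ec -getPolicy -path "$EC_DIR"
}

apply_ec_policy
```

Setting a policy affects files written afterwards; files already in the directory keep their existing layout until they are rewritten.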

[hdfs@sandbox-hdp ~]$ hdfs fsck
Usage: hdfs fsck <path> [-list-corruptfileblocks | [-move | -delete | -openforwrite] [-files [-blocks [-locations | -racks | -replicaDetails | -upgradedomains]]]] [-includeSnapshots] [-showprogress] [-storagepolicies] [-maintenance] [-blockId <blk_Id>]
<path> start checking from this path
-move move corrupted files to /lost+found
-delete delete corrupted files
-files print out files being checked
-openforwrite print out files opened for write
-includeSnapshots include snapshot data if the given path indicates a snapshottable directory or there are snapshottable directories under it
-list-corruptfileblocks print out list of missing blocks and files they belong to
-files -blocks print out block report
-files -blocks -locations print out locations for every block
-files -blocks -racks print out network topology for data-node locations
-files -blocks -replicaDetails print out each replica details
-files -blocks -upgradedomains print out upgrade domains for every block
-storagepolicies print out storage policy summary for the blocks
-maintenance print out maintenance state node details
-showprogress show progress in output. Default is OFF (no progress)
-blockId print out which file this blockId belongs to, locations (nodes, racks) of this block, and other diagnostics info (under replicated, corrupted or not, etc)

Please Note:
1. By default fsck ignores files opened for write, use -openforwrite to report such files. They are usually tagged CORRUPT or HEALTHY depending on their block allocation status
2. Option -includeSnapshots should not be used for comparing stats, should be used only for HEALTH check, as this may contain duplicates if the same file present in both original fs tree and inside snapshots.

Generic options supported are:
-conf <configuration file> specify an application configuration file
-D <property=value> define a value for a given property
-fs <file:///|hdfs://namenode:port> specify default filesystem URL to use, overrides 'fs.defaultFS' property from configurations.
-jt <local|resourcemanager:port> specify a ResourceManager
-files <file1,...> specify a comma-separated list of files to be copied to the map reduce cluster
-libjars <jar1,...> specify a comma-separated list of jar files to be included in the classpath
-archives <archive1,...> specify a comma-separated list of archives to be unarchived on the compute machines

The general command line syntax is:
command [genericOptions] [commandOptions]
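Two fsck invocations cover most day-to-day checks: a cluster-wide list of corrupt blocks, and a detailed block report for one directory. The sketch below shows both; the target path `/user/hive/warehouse` is only an example value, and `run` is a dry-run helper that is not part of HDFS.

```shell
#!/bin/sh
# Dry-run helper (not part of HDFS): print the command unless DRY_RUN=0.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi; }

TARGET="/user/hive/warehouse"   # hypothetical directory to inspect in detail

check_health() {
  # Missing or corrupt blocks across the whole namespace.
  run hdfs fsck / -list-corruptfileblocks
  # Per-file block report with DataNode locations for one directory.
  run hdfs fsck "$TARGET" -files -blocks -locations
}

check_health
```

Neither call uses `-move` or `-delete`, so this check is read-only.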

[hdfs@sandbox-hdp ~]$ hdfs haadmin
Usage: haadmin [-ns <nameserviceId>]
[-transitionToActive [--forceactive] <serviceId>]
[-transitionToStandby <serviceId>]
[-failover [--forcefence] [--forceactive] <serviceId> <serviceId>]
[-getServiceState <serviceId>]
[-getAllServiceState]
[-checkHealth <serviceId>]
[-help <command>]

Generic options supported are:
-conf <configuration file> specify an application configuration file
-D <property=value> define a value for a given property
-fs <file:///|hdfs://namenode:port> specify default filesystem URL to use, overrides 'fs.defaultFS' property from configurations.
-jt <local|resourcemanager:port> specify a ResourceManager
-files <file1,...> specify a comma-separated list of files to be copied to the map reduce cluster
-libjars <jar1,...> specify a comma-separated list of jar files to be included in the classpath
-archives <archive1,...> specify a comma-separated list of archives to be unarchived on the compute machines

The general command line syntax is:
command [genericOptions] [commandOptions]
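On a cluster with NameNode HA enabled, the haadmin options above could be combined into a simple planned failover. This is a sketch under assumptions: the service IDs `nn1` and `nn2` are hypothetical and must match the cluster's configured NameNode IDs, and `run` is a dry-run helper introduced for the example. A non-HA sandbox will reject these commands.

```shell
#!/bin/sh
# Dry-run helper (not part of HDFS): print the command unless DRY_RUN=0.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi; }

ACTIVE_NN="nn1"    # hypothetical service ID of the current active NameNode
STANDBY_NN="nn2"   # hypothetical service ID of the current standby NameNode

planned_failover() {
  # Check the target is healthy before handing over.
  run hdfs haadmin -checkHealth "$STANDBY_NN"
  # Fail over from the first service ID to the second.
  run hdfs haadmin -failover "$ACTIVE_NN" "$STANDBY_NN"
  # Verify the resulting roles.
  run hdfs haadmin -getAllServiceState
}

planned_failover
```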

[hdfs@sandbox-hdp ~]$ hdfs jmxget
init: server=localhost;port=;service=NameNode;localVMUrl=null

Domains:
Domain = JMImplementation
Domain = com.sun.management
Domain = java.lang
Domain = java.nio
Domain = java.util.logging

MBeanServer default domain = DefaultDomain

MBean count = 22

Query MBeanServer MBeans:
List of all the available keys:
[hdfs@sandbox-hdp ~]$
[hdfs@sandbox-hdp ~]$ hdfs oev
Usage: bin/hdfs oev [OPTIONS] -i INPUT_FILE -o OUTPUT_FILE
Offline edits viewer
Parse a Hadoop edits log file INPUT_FILE and save results
in OUTPUT_FILE.
Required command line arguments:
-i,--inputFile <arg> edits file to process, xml (case
insensitive) extension means XML format,
any other filename means binary format.
XML/Binary format input file is not allowed
to be processed by the same type processor.
-o,--outputFile <arg> Name of output file. If the specified
file exists, it will be overwritten,
format of the file is determined
by -p option

Optional command line arguments:
-p,--processor <arg> Select which type of processor to apply
against image file, currently supported
processors are: binary (native binary format
that Hadoop uses), xml (default, XML
format), stats (prints statistics about
edits file)
-h,--help Display usage information and exit
-f,--fix-txids Renumber the transaction IDs in the input,
so that there are no gaps or invalid
transaction IDs.
-r,--recover When reading binary edit logs, use recovery
mode. This will give you the chance to skip
corrupt parts of the edit log.
-v,--verbose More verbose output, prints the input and
output filenames, for processors that write
to a file, also output to screen. On large
image files this will dramatically increase
processing time (default is false).

Generic options supported are:
-conf <configuration file> specify an application configuration file
-D <property=value> define a value for a given property
-fs <file:///|hdfs://namenode:port> specify default filesystem URL to use, overrides 'fs.defaultFS' property from configurations.
-jt <local|resourcemanager:port> specify a ResourceManager
-files <file1,...> specify a comma-separated list of files to be copied to the map reduce cluster
-libjars <jar1,...> specify a comma-separated list of jar files to be included in the classpath
-archives <archive1,...> specify a comma-separated list of archives to be unarchived on the compute machines

The general command line syntax is:
command [genericOptions] [commandOptions]
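To make the oev options concrete, the sketch below converts a binary edits file to XML and then prints statistics for it. The edits file name and the output paths are hypothetical placeholders (a real file would be copied from the NameNode's edits directory), and `run` is a dry-run helper added only for this example.

```shell
#!/bin/sh
# Dry-run helper (not part of HDFS): print the command unless DRY_RUN=0.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi; }

EDITS_FILE="/tmp/edits_sample"   # hypothetical local copy of a binary edits file
OUT_DIR="/tmp/oev_out"           # hypothetical local output directory

dump_edits() {
  run mkdir -p "$OUT_DIR"
  # xml is the default processor, so -p could be omitted here.
  run hdfs oev -i "$EDITS_FILE" -o "$OUT_DIR/edits.xml" -p xml
  # Summarise the operations found in the same edits file.
  run hdfs oev -i "$EDITS_FILE" -o "$OUT_DIR/edits.stats" -p stats
}

dump_edits
```

Because the tool works on local files, this can be run on any host with the Hadoop client installed.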

[hdfs@sandbox-hdp ~]$ hdfs oiv
Usage: bin/hdfs oiv [OPTIONS] -i INPUTFILE -o OUTPUTFILE
Offline Image Viewer
View a Hadoop fsimage INPUTFILE using the specified PROCESSOR,
saving the results in OUTPUTFILE.

The oiv utility will attempt to parse correctly formed image files
and will abort fail with mal-formed image files.

The tool works offline and does not require a running cluster in
order to process an image file.

The following image processors are available:
* XML: This processor creates an XML document with all elements of
the fsimage enumerated, suitable for further analysis by XML
tools.
* ReverseXML: This processor takes an XML file and creates a
binary fsimage containing the same elements.
* FileDistribution: This processor analyzes the file size
distribution in the image.
-maxSize specifies the range [0, maxSize] of file sizes to be
analyzed (128GB by default).
-step defines the granularity of the distribution. (2MB by default)
-format formats the output result in a human-readable fashion
rather than a number of bytes. (false by default)
* Web: Run a viewer to expose read-only WebHDFS API.
-addr specifies the address to listen. (localhost:5978 by default)
It does not support secure mode nor HTTPS.
* Delimited (experimental): Generate a text file with all of the elements common
to both inodes and inodes-under-construction, separated by a
delimiter. The default delimiter is \t, though this may be
changed via the -delimiter argument.

Required command line arguments:
-i,--inputFile <arg> FSImage or XML file to process.

Optional command line arguments:
-o,--outputFile <arg> Name of output file. If the specified
file exists, it will be overwritten.
(output to stdout by default)
If the input file was an XML file, we
will also create an <outputFile>.md5 file.
-p,--processor <arg> Select which type of processor to apply
against image file. (XML|FileDistribution|
ReverseXML|Web|Delimited)
The default is Web.
-delimiter <arg> Delimiting string to use with Delimited processor.
-t,--temp <arg> Use temporary dir to cache intermediate result to generate
Delimited outputs. If not set, Delimited processor constructs
the namespace in memory before outputting text.
-h,--help Display usage information and exit
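Since the default processor is Web, offline reports need an explicit `-p`. The sketch below fetches the latest fsimage with `hdfs dfsadmin -fetchImage` and then runs two processors over a local image file. The directory and the fsimage file name are hypothetical (the fetched file's real name depends on the cluster), and `run` is a dry-run helper that is not part of HDFS.

```shell
#!/bin/sh
# Dry-run helper (not part of HDFS): print the command unless DRY_RUN=0.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi; }

IMAGE_DIR="/tmp/fsimage"               # hypothetical local working directory
FSIMAGE="$IMAGE_DIR/fsimage_sample"    # hypothetical name of the fetched image

analyze_image() {
  run mkdir -p "$IMAGE_DIR"
  # Download the most recent fsimage from the NameNode to a local directory.
  run hdfs dfsadmin -fetchImage "$IMAGE_DIR"
  # One line per inode, comma-separated, for loading into other tools.
  run hdfs oiv -p Delimited -delimiter , -i "$FSIMAGE" -o "$IMAGE_DIR/fsimage.csv"
  # Histogram of file sizes in the namespace.
  run hdfs oiv -p FileDistribution -i "$FSIMAGE" -o "$IMAGE_DIR/sizes.txt"
}

analyze_image
```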

[hdfs@sandbox-hdp ~]$ hdfs oiv_legacy
Usage: bin/hdfs oiv_legacy [OPTIONS] -i INPUTFILE -o OUTPUTFILE
Offline Image Viewer
View a Hadoop fsimage INPUTFILE using the specified PROCESSOR,
saving the results in OUTPUTFILE.

The oiv utility will attempt to parse correctly formed image files
and will abort fail with mal-formed image files.

The tool works offline and does not require a running cluster in
order to process an image file.

The following image processors are available:
* Ls: The default image processor generates an lsr-style listing
of the files in the namespace, with the same fields in the same
order. Note that in order to correctly determine file sizes,
this formatter cannot skip blocks and will override the
-skipBlocks option.
* Indented: This processor enumerates over all of the elements in
the fsimage file, using levels of indentation to delineate
sections within the file.
* Delimited: Generate a text file with all of the elements common
to both inodes and inodes-under-construction, separated by a
delimiter. The default delimiter is , though this may be
changed via the -delimiter argument. This processor also overrides
the -skipBlocks option for the same reason as the Ls processor
* XML: This processor creates an XML document with all elements of
the fsimage enumerated, suitable for further analysis by XML
tools.
* FileDistribution: This processor analyzes the file size
distribution in the image.
-maxSize specifies the range [0, maxSize] of file sizes to be
analyzed (128GB by default).
-step defines the granularity of the distribution. (2MB by default)
-format formats the output result in a human-readable fashion
rather than a number of bytes. (false by default)
* NameDistribution: This processor analyzes the file names
in the image and prints total number of file names and how frequently
file names are reused.

Required command line arguments:
-i,--inputFile <arg> FSImage file to process.
-o,--outputFile <arg> Name of output file. If the specified
file exists, it will be overwritten.

Optional command line arguments:
-p,--processor <arg> Select which type of processor to apply
against image file. (Ls|XML|Delimited|Indented|FileDistribution|NameDistribution).
-h,--help Display usage information and exit
-printToScreen For processors that write to a file, also
output to screen. On large image files this
will dramatically increase processing time.
-skipBlocks Skip inodes' blocks information. May
significantly decrease output.
(default = false).
-delimiter <arg> Delimiting string to use with Delimited processor

[hdfs@sandbox-hdp ~]$ hdfs storagepolicies
Usage: bin/hdfs storagepolicies [COMMAND]
[-listPolicies]
[-setStoragePolicy -path <path> -policy <policy>]
[-getStoragePolicy -path <path>]
[-unsetStoragePolicy -path <path>]
[-help <command-name>]

Generic options supported are:
-conf <configuration file> specify an application configuration file
-D <property=value> define a value for a given property
-fs <file:///|hdfs://namenode:port> specify default filesystem URL to use, overrides 'fs.defaultFS' property from configurations.
-jt <local|resourcemanager:port> specify a ResourceManager
-files <file1,...> specify a comma-separated list of files to be copied to the map reduce cluster
-libjars <jar1,...> specify a comma-separated list of jar files to be included in the classpath
-archives <archive1,...> specify a comma-separated list of archives to be unarchived on the compute machines

The general command line syntax is:
command [genericOptions] [commandOptions]
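The storagepolicies subcommands above are usually used together: list what is available, assign a policy to a directory, and read it back. A minimal sketch, assuming a hypothetical directory `/archive/2019` and the built-in `COLD` policy, with a dry-run `run` helper added for illustration:

```shell
#!/bin/sh
# Dry-run helper (not part of HDFS): print the command unless DRY_RUN=0.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi; }

ARCHIVE_DIR="/archive/2019"   # hypothetical directory holding old data
POLICY="COLD"                 # built-in policy that targets ARCHIVE storage

tier_directory() {
  # Show the policies this cluster knows about.
  run hdfs storagepolicies -listPolicies
  run hdfs storagepolicies -setStoragePolicy -path "$ARCHIVE_DIR" -policy "$POLICY"
  # Confirm the policy now attached to the directory.
  run hdfs storagepolicies -getStoragePolicy -path "$ARCHIVE_DIR"
}

tier_directory
```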

[hdfs@sandbox-hdp ~]$

Thursday, June 11, 2020