dfs.namenode.logging.level
info
The logging level for dfs namenode. Other values are "dir" (trace
namespace mutations), "block" (trace block under/over replications and block
creations/deletions), or "all".
dfs.namenode.rpc-address
RPC address that handles all client requests. If empty, the value is taken
from fs.default.name.
The value of this property takes the form hdfs://nn-host1:rpc-port.
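As a rough illustration of the hdfs://nn-host1:rpc-port form, the following Python sketch splits such a value into host and port. The helper name parse_rpc_address and the port 8020 are assumptions made for this example; neither comes from this file.

```python
from urllib.parse import urlsplit


def parse_rpc_address(value):
    """Split an hdfs://host:port value into (host, port).

    Hypothetical helper for illustration only; it is not part of Hadoop.
    """
    parts = urlsplit(value)
    if parts.scheme != "hdfs" or parts.hostname is None or parts.port is None:
        raise ValueError("expected hdfs://host:port, got %r" % value)
    return parts.hostname, parts.port


# 8020 is an assumed example port, not a default stated in this document.
host, port = parse_rpc_address("hdfs://nn-host1:8020")
```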
dfs.secondary.http.address
0.0.0.0:50090
The secondary namenode http server address and port.
If the port is 0 then the server will start on a free port.
dfs.datanode.address
0.0.0.0:50010
The datanode server address and port for data transfer.
If the port is 0 then the server will start on a free port.
dfs.datanode.http.address
0.0.0.0:50075
The datanode http server address and port.
If the port is 0 then the server will start on a free port.
dfs.datanode.ipc.address
0.0.0.0:50020
The datanode ipc server address and port.
If the port is 0 then the server will start on a free port.
dfs.datanode.handler.count
3
The number of server threads for the datanode.
dfs.http.address
0.0.0.0:50070
The address and base port on which the dfs namenode web UI listens.
If the port is 0 then the server will start on a free port.
dfs.https.enable
false
Whether HTTPS (SSL) is supported on HDFS.
dfs.https.need.client.auth
false
Whether SSL client certificate authentication is required
dfs.https.server.keystore.resource
ssl-server.xml
Resource file from which ssl server keystore
information will be extracted
dfs.https.client.keystore.resource
ssl-client.xml
Resource file from which ssl client keystore
information will be extracted
dfs.datanode.https.address
0.0.0.0:50475
dfs.https.address
0.0.0.0:50470
dfs.datanode.dns.interface
default
The name of the Network Interface from which a data node should
report its IP address.
dfs.datanode.dns.nameserver
default
The host name or IP address of the name server (DNS)
which a DataNode should use to determine the host name used by the
NameNode for communication and display purposes.
dfs.replication.considerLoad
true
Whether chooseTarget considers the target's load.
dfs.default.chunk.view.size
32768
The number of bytes to view for a file on the browser.
dfs.datanode.du.reserved
0
Reserved space in bytes per volume. Always leave this much space free for non dfs use.
dfs.namenode.name.dir
${hadoop.tmp.dir}/dfs/name
Determines where on the local filesystem the DFS name node
should store the name table(fsimage). If this is a comma-delimited list
of directories then the name table is replicated in all of the
directories, for redundancy.
dfs.name.edits.dir
${dfs.name.dir}
Determines where on the local filesystem the DFS name node
should store the transaction (edits) file. If this is a comma-delimited list
of directories then the transaction file is replicated in all of the
directories, for redundancy. The default value is the same as dfs.name.dir.
dfs.namenode.edits.toleration.length
0
The length in bytes that namenode is willing to tolerate when the edit log
is corrupted. The edit log toleration feature checks the entire edit log.
It computes read length (the length of valid data), corruption length and
padding length. In case that corruption length is non-zero, the corruption
will be tolerated only if the corruption length is less than or equal to
the toleration length.
To disable the edit log toleration feature, set this property to -1. When
the feature is disabled, the end of the edit log is not checked. In this
case, the namenode starts up normally even if the end of the edit log is
corrupted.
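The toleration rule described above can be sketched in Python as follows. This is only a reading of the description, not the namenode's actual code; the function name is hypothetical, and the two lengths are assumed to have been computed already.

```python
def is_corruption_tolerated(corruption_length, toleration_length):
    """Apply the rule from the dfs.namenode.edits.toleration.length text.

    Hypothetical helper; both arguments are lengths in bytes.
    """
    if toleration_length == -1:
        # Feature disabled: the end of the edit log is not checked at all.
        return True
    if corruption_length == 0:
        # Nothing is corrupted, so there is nothing to tolerate.
        return True
    # Non-zero corruption is accepted only up to the toleration length.
    return corruption_length <= toleration_length
```

With the default of 0, any non-zero corruption length is rejected under this reading.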
dfs.web.ugi
webuser,webgroup
The user account used by the web interface.
Syntax: USERNAME,GROUP1,GROUP2, ...
dfs.permissions
true
If "true", enable permission checking in HDFS.
If "false", permission checking is turned off,
but all other behavior is unchanged.
Switching from one parameter value to the other does not change the mode,
owner or group of files or directories.
dfs.permissions.supergroup
supergroup
The name of the group of super-users.
dfs.block.access.token.enable
false
If "true", access tokens are used as capabilities for accessing datanodes.
If "false", no access tokens are checked on accessing datanodes.
dfs.block.access.key.update.interval
600
Interval in minutes at which namenode updates its access keys.
dfs.block.access.token.lifetime
600
The lifetime of access tokens in minutes.
dfs.datanode.data.dir
${hadoop.tmp.dir}/dfs/data
Determines where on the local filesystem a DFS data node
should store its blocks. If this is a comma-delimited
list of directories, then data will be stored in all named
directories, typically on different devices.
Directories that do not exist are ignored.
dfs.datanode.data.dir.perm
755
Permissions for the directories on the local filesystem where
the DFS data node stores its blocks. The permissions can be either octal or
symbolic.
dfs.replication
3
Default block replication.
The actual number of replicas can be specified when the file is created.
The default is used if replication is not specified at create time.
dfs.replication.max
512
Maximal block replication.
dfs.replication.min
1
Minimal block replication.
dfs.block.size
67108864
The default block size for new files.
dfs.df.interval
60000
Disk usage statistics refresh interval in msec.
dfs.client.block.write.retries
3
The number of retries for writing blocks to the data nodes,
before we signal failure to the application.
dfs.blockreport.intervalMsec
3600000
Determines block reporting interval in milliseconds.
dfs.blockreport.initialDelay
0
Delay for first block report in seconds.
dfs.heartbeat.interval
3
Determines datanode heartbeat interval in seconds.
dfs.namenode.handler.count
10
The number of server threads for the namenode.
dfs.safemode.threshold.pct
0.999f
Specifies the percentage of blocks that should satisfy
the minimal replication requirement defined by dfs.replication.min.
Values less than or equal to 0 mean not to wait for any particular
percentage of blocks before exiting safemode.
Values greater than 1 will make safe mode permanent.
dfs.namenode.safemode.min.datanodes
0
Specifies the number of datanodes that must be considered alive
before the name node exits safemode.
Values less than or equal to 0 mean not to take the number of live
datanodes into account when deciding whether to remain in safe mode
during startup.
Values greater than the number of datanodes in the cluster
will make safe mode permanent.
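The two safe mode conditions above (dfs.safemode.threshold.pct and dfs.namenode.safemode.min.datanodes) can be combined in a small sketch. The function and parameter names are hypothetical, the block and datanode counts are assumed inputs, and timing is ignored; the real namenode logic may differ in detail.

```python
def may_leave_safemode(safe_blocks, total_blocks, live_datanodes,
                       threshold_pct=0.999, min_datanodes=0):
    """Return True if both documented safe mode conditions are satisfied."""
    if threshold_pct > 1:
        # Documented behaviour: values greater than 1 make safe mode permanent.
        return False
    # Values <= 0 mean "do not wait for any particular percentage of blocks".
    blocks_ok = threshold_pct <= 0 or safe_blocks >= threshold_pct * total_blocks
    # Values <= 0 mean "ignore the number of live datanodes".
    datanodes_ok = min_datanodes <= 0 or live_datanodes >= min_datanodes
    return blocks_ok and datanodes_ok
```

For example, with the defaults, a cluster reporting 990 of 1000 blocks as minimally replicated would stay in safe mode under this sketch, because 990 is below 99.9% of 1000.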
dfs.safemode.extension
30000
Determines extension of safe mode in milliseconds
after the threshold level is reached.
dfs.balance.bandwidthPerSec
1048576
Specifies the maximum amount of bandwidth that each datanode
can utilize for balancing, in terms of
the number of bytes per second.
dfs.hosts
Names a file that contains a list of hosts that are
permitted to connect to the namenode. The full pathname of the file
must be specified. If the value is empty, all hosts are
permitted.
dfs.hosts.exclude
Names a file that contains a list of hosts that are
not permitted to connect to the namenode. The full pathname of the
file must be specified. If the value is empty, no hosts are
excluded.
dfs.max.objects
0
The maximum number of files, directories and blocks
dfs supports. A value of zero indicates no limit to the number
of objects that dfs supports.
dfs.namenode.decommission.interval
30
Namenode periodicity in seconds to check if decommission is
complete.
dfs.namenode.decommission.nodes.per.interval
5
The number of nodes the namenode checks for decommission completion
in each dfs.namenode.decommission.interval.
dfs.replication.interval
3
The periodicity in seconds with which the namenode computes
replication work for datanodes.
dfs.access.time.precision
3600000
The access time for an HDFS file is precise up to this value.
The default value is 1 hour. Setting a value of 0 disables
access times for HDFS.
dfs.support.append
This option is no longer supported. HBase no longer requires that
this option be enabled as sync is now enabled by default. See
HADOOP-8230 for additional information.
dfs.namenode.delegation.key.update-interval
86400000
The update interval for master key for delegation tokens
in the namenode in milliseconds.
dfs.namenode.delegation.token.max-lifetime
604800000
The maximum lifetime in milliseconds for which a delegation
token is valid.
dfs.namenode.delegation.token.renew-interval
86400000
The renewal interval for delegation token in milliseconds.
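The three delegation token defaults are given in milliseconds, which is hard to read at a glance. The following sketch converts them with Python's datetime.timedelta; the constant names are made up for this example, and the values are the defaults listed above.

```python
from datetime import timedelta

# Defaults copied from the three delegation properties above (milliseconds).
KEY_UPDATE_INTERVAL_MS = 86400000
TOKEN_MAX_LIFETIME_MS = 604800000
TOKEN_RENEW_INTERVAL_MS = 86400000

key_update = timedelta(milliseconds=KEY_UPDATE_INTERVAL_MS)
max_lifetime = timedelta(milliseconds=TOKEN_MAX_LIFETIME_MS)
renew_interval = timedelta(milliseconds=TOKEN_RENEW_INTERVAL_MS)

# How many renewal intervals fit into the maximum token lifetime.
renewal_periods = max_lifetime // renew_interval

print(key_update.days, max_lifetime.days, renew_interval.days, renewal_periods)
```

So the defaults correspond to a master key update every day, a renewal interval of one day, and a maximum token lifetime of seven days.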
dfs.datanode.failed.volumes.tolerated
0
The number of volumes that are allowed to
fail before a datanode stops offering service. By default
any volume failure will cause a datanode to shut down.
dfs.datanode.max.xcievers
4096
Specifies the maximum number of threads to use for transferring data
in and out of the DN.
dfs.datanode.readahead.bytes
4193404
While reading block files, if the Hadoop native libraries are available,
the datanode can use the posix_fadvise system call to explicitly
page data into the operating system buffer cache ahead of the current
reader's position. This can improve performance especially when
disks are highly contended.
This configuration specifies the number of bytes ahead of the current
read position which the datanode will attempt to read ahead. This
feature may be disabled by configuring this property to 0.
If the native libraries are not available, this configuration has no
effect.
dfs.datanode.drop.cache.behind.reads
false
In some workloads, the data read from HDFS is known to be
large enough that it is unlikely to be useful to cache it in the
operating system buffer cache. In this case, the DataNode may be
configured to automatically purge all data from the buffer cache
after it is delivered to the client. This behavior is automatically
disabled for workloads which read only short sections of a block
(e.g. HBase random-IO workloads).
This may improve performance for some workloads by freeing buffer
cache space for more cacheable data.
If the Hadoop native libraries are not available, this configuration
has no effect.
dfs.datanode.drop.cache.behind.writes
false
In some workloads, the data written to HDFS is known to be
large enough that it is unlikely to be useful to cache it in the
operating system buffer cache. In this case, the DataNode may be
configured to automatically purge all data from the buffer cache
after it is written to disk.
This may improve performance for some workloads by freeing buffer
cache space for more cacheable data.
If the Hadoop native libraries are not available, this configuration
has no effect.
dfs.datanode.sync.behind.writes
false
If this configuration is enabled, the datanode will instruct the
operating system to enqueue all written data to the disk immediately
after it is written. This differs from the usual OS policy which
may wait for up to 30 seconds before triggering writeback.
This may improve performance for some workloads by smoothing the
IO profile for data written to disk.
If the Hadoop native libraries are not available, this configuration
has no effect.
dfs.client.use.datanode.hostname
false
Whether clients should use datanode hostnames when
connecting to datanodes.
dfs.datanode.use.datanode.hostname
false
Whether datanodes should use datanode hostnames when
connecting to other datanodes for data transfer.
dfs.client.local.interfaces
A comma separated list of network interface names to use
for data transfer between the client and datanodes. When creating
a connection to read from or write to a datanode, the client
chooses one of the specified interfaces at random and binds its
socket to the IP of that interface. Individual names may be
specified as either an interface name (e.g. "eth0"), a subinterface
name (e.g. "eth0:0"), or an IP address (which may be specified using
CIDR notation to match a range of IPs).
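To make the three accepted forms concrete, here is a sketch that splits a dfs.client.local.interfaces value and labels each entry. It is not how the HDFS client parses the property; the function names are hypothetical, and treating any non-IP entry containing a colon as a subinterface is an assumption of this example.

```python
import ipaddress


def classify_interface_spec(spec):
    """Label one entry as 'ip', 'subinterface', or 'interface'."""
    spec = spec.strip()
    try:
        # Accepts single addresses and CIDR ranges, IPv4 or IPv6.
        ipaddress.ip_network(spec, strict=False)
        return "ip"
    except ValueError:
        pass
    # Assumption: a remaining name with a colon is a subinterface like eth0:0.
    return "subinterface" if ":" in spec else "interface"


def parse_local_interfaces(value):
    """Split the comma separated property value into (entry, kind) pairs."""
    entries = [s.strip() for s in value.split(",") if s.strip()]
    return [(entry, classify_interface_spec(entry)) for entry in entries]


specs = parse_local_interfaces("eth0, eth0:0, 10.0.0.0/8, 192.168.1.5")
```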
dfs.image.transfer.bandwidthPerSec
0
Specifies the maximum amount of bandwidth that can be utilized
for image transfer in terms of the number of bytes per second.
A default value of 0 indicates that throttling is disabled.
dfs.webhdfs.enabled
false
Enable WebHDFS (REST API) in Namenodes and Datanodes.
dfs.namenode.kerberos.internal.spnego.principal
${dfs.web.authentication.kerberos.principal}
dfs.secondary.namenode.kerberos.internal.spnego.principal
${dfs.web.authentication.kerberos.principal}
dfs.namenode.invalidate.work.pct.per.iteration
0.32f
*Note*: Advanced property. Change with caution.
This determines the percentage amount of block
invalidations (deletes) to do over a single DN heartbeat
deletion command. The final deletion count is determined by applying this
percentage to the number of live nodes in the system.
The resultant number is the number of blocks from the deletion list
chosen for proper invalidation over a single heartbeat of a single DN.
Value should be a positive, non-zero percentage in float notation (X.Yf),
with 1.0f meaning 100%.
dfs.namenode.replication.work.multiplier.per.iteration
2
*Note*: Advanced property. Change with caution.
This determines the total amount of block transfers to begin in
parallel at a DN, for replication, when such a command list is being
sent over a DN heartbeat by the NN. The actual number is obtained by
multiplying this multiplier with the total number of live nodes in the
cluster. The result number is the number of blocks to begin transfers
immediately for, per DN heartbeat. This number can be any positive,
non-zero integer.
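Both dfs.namenode.invalidate.work.pct.per.iteration and dfs.namenode.replication.work.multiplier.per.iteration scale with the number of live nodes. The arithmetic can be sketched as below. The function names are hypothetical, and rounding the invalidation count up is an assumption; the descriptions above do not say how fractions are handled.

```python
import math


def invalidation_work(live_nodes, pct=0.32):
    """Blocks chosen for invalidation per DN heartbeat (rounding up is assumed)."""
    return math.ceil(live_nodes * pct)


def replication_work(live_nodes, multiplier=2):
    """Block transfers started per DN heartbeat: multiplier times live nodes."""
    return live_nodes * multiplier


# Example with an assumed cluster of 10 live datanodes and the default values.
deletes = invalidation_work(10)
transfers = replication_work(10)
```

With 10 live nodes and the defaults, this gives 4 invalidations (3.2 rounded up) and 20 transfers.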
dfs.namenode.avoid.read.stale.datanode
false
Indicate whether or not to avoid reading from "stale" datanodes whose
heartbeat messages have not been received by the namenode
for more than a specified time interval. Stale datanodes will be
moved to the end of the node list returned for reading. See
dfs.namenode.avoid.write.stale.datanode for a similar setting for writes.
dfs.namenode.avoid.write.stale.datanode
false
Indicate whether or not to avoid writing to "stale" datanodes whose
heartbeat messages have not been received by the namenode
for more than a specified time interval. Writes will avoid using
stale datanodes unless more than a configured ratio
(dfs.namenode.write.stale.datanode.ratio) of datanodes are marked as
stale. See dfs.namenode.avoid.read.stale.datanode for a similar setting
for reads.
dfs.namenode.stale.datanode.interval
30000
Default time interval for marking a datanode as "stale", i.e., if
the namenode has not received a heartbeat message from a datanode for
more than this time interval, the datanode will be marked and treated
as "stale" by default. The stale interval cannot be too small, since
otherwise the stale state may change too frequently.
A minimum stale interval is therefore enforced (by default, 3 times the
heartbeat interval), and the stale interval is guaranteed not to be less
than this minimum.
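The minimum described above can be worked through with the defaults from this file: dfs.heartbeat.interval is in seconds, while the stale interval is in milliseconds. The sketch below assumes the configured value is simply raised to the minimum when it is too small; the function and parameter names are hypothetical.

```python
def effective_stale_interval_ms(configured_ms=30000,
                                heartbeat_interval_s=3,
                                min_heartbeats=3):
    """Return the stale interval after applying the documented minimum."""
    # Minimum is a multiple of the heartbeat interval, converted to milliseconds.
    minimum_ms = min_heartbeats * heartbeat_interval_s * 1000
    # Assumption: a too-small configured value is clamped up to the minimum.
    return max(configured_ms, minimum_ms)
```

With the defaults the minimum is 3 x 3 s = 9000 ms, so the configured 30000 ms is used unchanged, while a configured 5000 ms would become 9000 ms under this sketch.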
dfs.namenode.write.stale.datanode.ratio
0.5f
When the ratio of the number of datanodes marked stale to the total number
of datanodes is greater than this ratio, stop avoiding writing to stale
nodes so as to prevent causing hotspots.
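The interaction between dfs.namenode.avoid.write.stale.datanode and this ratio can be sketched as a single predicate. The function name and arguments are hypothetical, and the stale and total counts are assumed to be known to the caller.

```python
def avoid_stale_for_writes(stale_nodes, total_nodes, avoid_enabled, ratio=0.5):
    """Return True if writes should currently steer away from stale datanodes."""
    if not avoid_enabled or total_nodes == 0:
        return False
    # Once the stale fraction exceeds the ratio, avoidance stops so that the
    # remaining healthy nodes do not become hotspots.
    return stale_nodes / total_nodes <= ratio
```

For example, with 10 datanodes and the default ratio, 2 stale nodes are still avoided, but 6 stale nodes are not.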
dfs.datanode.plugins
Comma-separated list of datanode plug-ins to be activated.
dfs.namenode.plugins
Comma-separated list of namenode plug-ins to be activated.