This tutorial shows you how to manage files on HDFS in Hadoop. You will learn how to create directories and how to upload, download, and list files in HDFS. The commands below show how to create a directory structure in HDFS, copy files from the local file system to HDFS, download files from HDFS to the local file system, and copy files between HDFS directories.
Create Directory in HDFS
The -mkdir command takes one or more path URIs as arguments and creates a directory for each of them.
hdfs dfs -mkdir
Remember that you must first create a home directory in HDFS named after your system username. For example, if you are logged in as hduser on your system, first create /user/hduser; otherwise you will get an error. Then create the directory structure inside it.
hdfs dfs -mkdir /user/hduser
hdfs dfs -mkdir /user/hduser/input
hdfs dfs -mkdir /user/hduser/output
hdfs dfs -mkdir /user/hduser/input/text /user/hduser/input/xml
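If you run the commands above a second time, -mkdir complains that the directories already exist. The following is a minimal sketch of a wrapper that checks first, assuming a configured Hadoop client on your PATH. The function name ensure_hdfs_dir is made up for this example and is not part of HDFS; -test -d and -mkdir -p are standard hdfs dfs options.

```shell
# ensure_hdfs_dir is a hypothetical helper, not an HDFS command.
# Usage: ensure_hdfs_dir HDFS_PATH
ensure_hdfs_dir() {
    # -test -d exits with 0 when the path exists and is a directory
    if hdfs dfs -test -d "$1"; then
        echo "exists: $1"
    else
        # -p also creates any missing parent directories
        hdfs dfs -mkdir -p "$1" && echo "created: $1"
    fi
}

# Example calls:
# ensure_hdfs_dir /user/hduser/input/text
# ensure_hdfs_dir /user/hduser/input/xml
```

Because -p creates parent directories as needed, a single call for /user/hduser/input/text would also create /user/hduser and /user/hduser/input if they were missing.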
Copy Files to HDFS
After creating the directory structure, put some files into HDFS from your local file system.
hdfs dfs -put LOCAL_FILE HDFS_PATH
For example, suppose you have test1.txt in the current directory and /tmp/test2.xml on your local file system.
hdfs dfs -put test1.txt /user/hduser/input/text/
hdfs dfs -put /tmp/test2.xml /user/hduser/input/xml/
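A mistyped local file name is an easy mistake to make when uploading. The sketch below checks that the local file exists before calling -put, and uses the -f option so that an existing file of the same name in HDFS is overwritten. The function name upload_to_hdfs is hypothetical and only serves as an illustration.

```shell
# upload_to_hdfs is a hypothetical helper, not an HDFS command.
# Usage: upload_to_hdfs LOCAL_FILE HDFS_DIR
upload_to_hdfs() {
    # Fail early if the local file is missing
    if [ ! -f "$1" ]; then
        echo "no such local file: $1" >&2
        return 1
    fi
    # -f overwrites a file of the same name in the target directory
    hdfs dfs -put -f "$1" "$2"
}

# Example calls:
# upload_to_hdfs test1.txt /user/hduser/input/text/
# upload_to_hdfs /tmp/test2.xml /user/hduser/input/xml/
```

Leave out -f if you would rather have the upload fail than replace a file that is already in HDFS.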
List Files from HDFS
Use the following example commands to list the content of the directory in HDFS.
hdfs dfs -ls /user/hduser
hdfs dfs -ls /user/hduser/input/
hdfs dfs -ls /user/hduser/input/text/
Use -R to list files recursively inside directories. For example:
hdfs dfs -ls -R /user/hduser/input/
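The recursive listing mixes directories and files. If you only want the file paths, you can filter the output. This is a rough sketch: it relies on the usual -ls output layout, where the first column is the permission string (starting with d for directories and - for files) and the last column is the path. The function name list_hdfs_files is made up for this example.

```shell
# list_hdfs_files is a hypothetical helper, not an HDFS command.
# Usage: list_hdfs_files HDFS_DIR
list_hdfs_files() {
    # Keep only lines whose permission string starts with "-" (regular files)
    # and print the last field, which is the path.
    # Assumption: paths contain no spaces, otherwise $NF is only part of the path.
    hdfs dfs -ls -R "$1" | awk '/^-/ { print $NF }'
}

# Example call:
# list_hdfs_files /user/hduser/input/
```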
Download Files from HDFS
At this point, you have learned how to copy files to HDFS and list them. Now use the following example commands to download (copy) files from HDFS to the local file system.
hdfs dfs -get /user/hduser/input/text/test1.txt /tmp/
hdfs dfs -get /user/hduser/input/xml/test2.xml /tmp/
Here, /tmp is a directory on the local file system.
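A download can fail because the source path is wrong or the local target directory does not exist yet. The sketch below checks the HDFS path with -test -e and creates the local directory before calling -get. The function name download_from_hdfs is hypothetical and used only for illustration.

```shell
# download_from_hdfs is a hypothetical helper, not an HDFS command.
# Usage: download_from_hdfs HDFS_FILE LOCAL_DIR
download_from_hdfs() {
    # -test -e exits with 0 when the HDFS path exists
    if ! hdfs dfs -test -e "$1"; then
        echo "not found in HDFS: $1" >&2
        return 1
    fi
    # Make sure the local target directory exists
    mkdir -p "$2"
    hdfs dfs -get "$1" "$2"
}

# Example call:
# download_from_hdfs /user/hduser/input/text/test1.txt /tmp/
```

Note that -get normally refuses to overwrite a local file that already exists, so remove or rename the old copy before downloading again.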
Copy Files between HDFS Directories
You can copy files from one HDFS location to another using the distcp tool.
hadoop distcp /user/hduser/input/xml/test2.xml /user/hduser/output
hadoop distcp /user/hduser/input/text/test1.txt /user/hduser/output
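distcp runs the copy as a MapReduce job, which pays off for large data sets or copies between clusters. For a handful of small files inside one cluster, the plain -cp option of hdfs dfs is a lighter alternative. The sketch below is not the method used above but a substitute for it; the function name copy_within_hdfs is made up for this example.

```shell
# copy_within_hdfs is a hypothetical helper, not an HDFS command.
# Usage: copy_within_hdfs DEST_DIR SRC_FILE [SRC_FILE...]
copy_within_hdfs() {
    dest=$1     # first argument: target directory in HDFS
    shift       # the remaining arguments are the source files
    for src in "$@"; do
        # -cp copies within HDFS through the client; stop at the first failure
        hdfs dfs -cp "$src" "$dest" || return 1
        echo "copied: $src -> $dest"
    done
}

# Example call:
# copy_within_hdfs /user/hduser/output /user/hduser/input/xml/test2.xml /user/hduser/input/text/test1.txt
```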
1 Comment
Hi
This tutorial is very useful for me. I followed it for my Hadoop installation, but while running the WordCount example
javac -classpath $(HADOOP_CLASSPATH) -d '/home/hduser/Desktop/WordCountTutorial/tutorial_classes' '/home/hduser/Desktop/WordCountTutorial/WordCount.java'
I am getting this error:
HADOOP_CLASSPATH: command not found
javac: invalid flag: /home/hduser/Desktop/WordCountTutorial/tutorial_classes
Usage: javac
use -help for a list of possible options
So how can I fix the problem?
Thank you