Hadoop/HDFS Commands

To use Hadoop commands, you first need to make sure that your Hadoop services are up and running. If you haven’t installed Hadoop yet, check out this post.

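To start the Hadoop services, use the following command (in Hadoop 2.x, start-all.sh is deprecated in favor of running start-dfs.sh and then start-yarn.sh):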

$ start-all.sh

To check that the Hadoop services are up and running, use the following command:

$ jps
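
The jps listing can also be checked by a script. The sketch below is a minimal example, assuming a single-node (pseudo-distributed) Hadoop 2.x or later setup where the five daemons named in EXPECTED_DAEMONS should be running; that list and the function name missing_daemons are illustrative assumptions, not part of Hadoop.

```shell
# Daemons expected on a single-node (pseudo-distributed) Hadoop 2.x+ setup.
# This list is an assumption; adjust it to match your cluster layout.
EXPECTED_DAEMONS="NameNode DataNode SecondaryNameNode ResourceManager NodeManager"

# Reads `jps` output ("<pid> <ClassName>" per line) on stdin and prints
# the name of every expected daemon that is not listed.
missing_daemons() {
    running=$(awk '{print $2}')
    for daemon in $EXPECTED_DAEMONS; do
        printf '%s\n' "$running" | grep -qx "$daemon" || echo "$daemon"
    done
}

# Usage (only runs when jps is on the PATH):
if command -v jps >/dev/null 2>&1; then
    jps | missing_daemons
fi
```

If the function prints nothing, every daemon in the list was found in the jps output.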

So, let’s start with the basic commands:

  1. Create a directory in HDFS.

    Syntax-

    $ hdfs dfs -mkdir [-p] <path>

    Example-

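    $ hadoop fs -mkdir data1   --- generic file system shell; works with HDFS and other supported file systems
    $ hdfs dfs -mkdir data11   --- HDFS-specific shell; equivalent to "hadoop fs" when the target is HDFS
    $ hadoop fs -mkdir -p data2/data3/data4   --- creates the whole path, including missing parents

     Note: -p creates any missing parent directories along the path: first data2, then data3 inside it, then data4 inside data3. Relative paths such as data1 are resolved against the user's HDFS home directory (/user/hduser here).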
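
When directory creation is scripted, it can help to report whether the directory was already there. The following is a minimal sketch, assuming the hdfs client is on the PATH; the function name ensure_hdfs_dir is hypothetical. It relies on "hdfs dfs -test -d", which exits with status 0 when the path is an existing directory.

```shell
# Creates an HDFS directory only when it does not exist yet.
# `hdfs dfs -test -d <path>` exits with status 0 if <path> is a directory.
ensure_hdfs_dir() {
    dir=$1
    if hdfs dfs -test -d "$dir"; then
        echo "exists: $dir"
    else
        hdfs dfs -mkdir -p "$dir" && echo "created: $dir"
    fi
}
```

Calling ensure_hdfs_dir data2/data3/data4 twice should print a "created:" line the first time and an "exists:" line the second time.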

  2. -ls command.

    The -ls command lists the files and directories under an HDFS path. If no path is given, it lists the user's HDFS home directory.

    Syntax-

    $ hdfs dfs -ls [<path>]

    Example 1-

    $ hdfs dfs -ls

    Example 2 with path-

    $ hadoop fs -ls data2/data3
    Found 1 items
    drwxr-xr-x   - hduser supergroup   0 2022-03-30 05:33 data2/data3/data4
    
  3. Display files and folders recursively.

    Example-
    $ hadoop fs -ls -R /user/hduser
    drwxr-xr-x   - hduser supergroup   0 2022-03-30 05:29 /user/hduser/data1
    drwxr-xr-x   - hduser supergroup   0 2022-03-30 05:33 /user/hduser/data2
    drwxr-xr-x   - hduser supergroup   0 2022-03-30 05:33 /user/hduser/data2/data3
    drwxr-xr-x   - hduser supergroup   0 2022-03-30 05:33 /user/hduser/data2/data3/data4

    -R: lists the contents of all subdirectories recursively.
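
A recursive listing can be summarised with standard text tools. The sketch below assumes the listing format shown above, where the first character of each entry is "d" for a directory and "-" for a file; the function name count_entries is hypothetical.

```shell
# Reads `hdfs dfs -ls -R` output on stdin and prints one summary line.
# The first character of each entry is "d" for a directory and "-" for a file;
# header lines such as "Found 1 items" match neither and are skipped.
count_entries() {
    awk '/^d/ { dirs++ } /^-/ { files++ }
         END { printf "%d directories, %d files\n", dirs, files }'
}
```

A typical call would be: hdfs dfs -ls -R /user/hduser | count_entries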

  4. Create a local file called “word1.txt” in “/home/hduser/” with some data.

    $ echo "Hello World" > word1.txt
    $ cat word1.txt
    Hello World

     

  5. Copy Single & Multiple File From Local To HDFS. 

    Two commands copy files from the local file system into HDFS: “-put” and “-copyFromLocal”. They behave the same way, except that “-copyFromLocal” accepts only local files as the source.

    Syntax- 

    $ hdfs dfs -put <local_path> <hdfs_path>
    $ hdfs dfs -copyFromLocal <local_path> <hdfs_path>

     Example-

     We will copy the file "word1.txt" created above to HDFS, storing it as "word2.txt".

    $ hadoop fs -put /home/hduser/word1.txt word2.txt
    $ hadoop fs -ls /user/hduser  -- word2.txt is now listed in HDFS

     

     Example with multiple files-

    // Create two local files with some data
    $ echo "word3.txt with echo command" > word3.txt
    $ vim word4.txt  -- put some data and save it.
    
    // Create a folder in HDFS
    $ hadoop fs -mkdir /worddata
    
    // Copy word3.txt & word4.txt files to worddata folder
    $ hadoop fs -put word3.txt word4.txt /worddata
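
The multi-file upload can be wrapped in a small helper. This is a sketch under stated assumptions: the hdfs client is on the PATH, and the function name upload_files is hypothetical. It uses the -f option of -put, which overwrites a destination file that already exists, and skips arguments that are not local files.

```shell
# Uploads every existing local file given after the target directory,
# overwriting same-named files in HDFS (-f). Missing local files are skipped.
upload_files() {
    target=$1
    shift
    hdfs dfs -mkdir -p "$target" || return 1
    for f in "$@"; do
        if [ -f "$f" ]; then
            hdfs dfs -put -f "$f" "$target" && echo "uploaded: $f"
        else
            echo "skipped (not a local file): $f" >&2
        fi
    done
}
```

For the files above, the call would be: upload_files /worddata word3.txt word4.txt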

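  6. cat command.

    The "cat" command displays the content of a file. The example below runs it on the local copy; to display a file stored in HDFS, use "hadoop fs -cat" followed by the HDFS path.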

    $ cat word1.txt
    Hello World   --- Display the content

     

  7. -get (or) -copyToLocal:

    These commands copy files or folders from HDFS to the local file system.

     Syntax-

    $ hadoop fs -get <hdfs_file_path> <local_file_path>
    (OR)
    $ hadoop fs -copyToLocal <hdfs_file_path> <local_file_path>

    Example-

    // Copy word3.txt from HDFS to the local file system.
    $ hadoop fs -get /user/hduser/word3.txt /home/hduser/word3.txt
    
    // Create a local directory.
    $ mkdir ~/myDir
    
    // Copy word3.txt from HDFS into the local myDir folder.
    $ hadoop fs -copyToLocal /user/hduser/word3.txt /home/hduser/myDir
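
After a download, it can be useful to confirm that the local copy matches the file in HDFS. The following is a minimal sketch, assuming the hdfs client is on the PATH; the function name same_as_hdfs is hypothetical. It streams the HDFS file with -cat and compares it byte for byte with the local file.

```shell
# Succeeds (exit status 0) when the local file has the same bytes as the HDFS file.
# `hdfs dfs -cat` streams the HDFS file to stdout; `cmp -s - FILE` compares
# that stream with the local file without printing anything.
same_as_hdfs() {
    hdfs_path=$1
    local_path=$2
    hdfs dfs -cat "$hdfs_path" | cmp -s - "$local_path"
}
```

For example: same_as_hdfs /user/hduser/word3.txt /home/hduser/word3.txt && echo "copies match"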

  8. Setting Replication on a particular file.

    “-setrep” changes the replication factor of a file, or of every file under a directory, in HDFS. The default replication factor is 3.

    Example-

    // Set the replication factor of word3.txt to 2.
    $ hadoop fs -setrep 2 /user/hduser/word3.txt

     Now use -ls to check; for files, the second column of the listing is the replication factor.

    $ hadoop fs -ls /user/hduser/word3.txt
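
To read the replication factor out of a listing, the second column can be extracted with awk. This is a sketch assuming the -ls output format used throughout this post and paths without spaces; the function name replication_report is hypothetical.

```shell
# Reads `hdfs dfs -ls` output on stdin and prints "<replication> <path>"
# for every file. Directories show "-" in the replication column and are skipped,
# as are header lines such as "Found 2 items".
replication_report() {
    awk '/^-/ { print $2, $NF }'
}
```

A typical call would be: hadoop fs -ls /user/hduser | replication_report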

  9. -cp command:

    The “-cp” command copies a source to a target within HDFS. It can also copy multiple files at once.

    The example below copies a file from one HDFS location to another directory.

    Syntax-

    $ hdfs dfs -cp <hdfs_source> <hdfs_dest>

    Example-

    // Create a folder called logs in /user/hduser.
    $ hadoop fs -mkdir /user/hduser/logs
    
    // Copy word3.txt to the logs folder as word3copy.txt.
    $ hadoop fs -cp /user/hduser/word3.txt /user/hduser/logs/word3copy.txt
    
    $ hadoop fs -ls logs
    Found 1 items
    -rw-r--r--   1 hduser supergroup   28 2022-04-11 22:16 logs/word3copy.txt

    To copy multiple files in one command, list them before the target. The target must be an existing directory, as in the example below.

    // Create a folder called data in hdfs
    $ hadoop fs -mkdir data
    
    // copy word1.txt and word3.txt to data folder.
    $ hadoop fs -cp word1.txt word3.txt   hdfs:///user/hduser/data
    
    $ hadoop fs -ls data
    Found 2 items
    -rw-r--r--   1 hduser supergroup   12 2022-04-11 22:31 data/word1.txt
    -rw-r--r--   1 hduser supergroup   28 2022-04-11 22:31 data/word3.txt
    
    // Copy all the files whose names start with “word”. The pattern is quoted
    // so that HDFS, not the local shell, expands it.
    $ hadoop fs -cp 'word*.txt' hdfs:///user/hduser/data
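
A wildcard such as word*.txt is seen by the local shell before Hadoop receives it. If the local working directory contains matching files, the shell replaces the pattern with those local names; quoting the pattern passes it through so that HDFS does the matching. The sketch below illustrates this with a stand-in function, show_args (a hypothetical helper, not a Hadoop command), in a scratch directory.

```shell
# Prints each argument it receives on its own line,
# standing in for `hadoop fs -cp` to show what the command would be given.
show_args() { printf '[%s]\n' "$@"; }

# Work in a scratch directory that contains local files matching the pattern.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch word1.txt word9.txt

show_args word*.txt     # unquoted: the local shell expands the pattern first
# [word1.txt]
# [word9.txt]

show_args 'word*.txt'   # quoted: the pattern is passed through unchanged
# [word*.txt]
```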

  10. -mv command:

    The “-mv” command moves files from a source path to a destination path within HDFS. It can also move multiple source files at once, in which case the target must be an existing directory.

    // Create a folder called “wordcount” in HDFS.
    $ hadoop fs -mkdir wordcount
    
    // move word2.txt & word3.txt inside wordcount folder.
    $ hadoop fs -mv /user/hduser/word2.txt  word3.txt  /user/hduser/wordcount
    
    // Create another folder called “opt” in HDFS.
    $ hadoop fs -mkdir opt
    
    // Move all the files whose names start with “word” to the “opt” folder.
    $ hadoop fs -mv 'wordcount/word*.txt' /user/hduser/opt/
    
    $ hadoop fs -ls opt
    Found 2 items
    -rw-r--r--  1 hduser supergroup   12 2022-03-30 18:35 opt/word2.txt
    -rw-r--r--  2 hduser supergroup   28 2022-04-04 12:39 opt/word3.txt 

  11. -chmod command:

    The “-chmod” command changes the permissions of files and directories. The “-R” option applies the change recursively through a directory structure.

    Syntax-
    $ hadoop fs -chmod [-R] <mode | octal_mode> <file_or_directory>
    Example-
    $ hadoop fs -chmod -R 770 /user/hduser/word1.txt
    
    $ hadoop fs -ls word1.txt
    -rwxrwx---   1 hduser supergroup         12 2022-04-12 11:00 word1.txt
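
The octal mode 770 and the string -rwxrwx--- in the listing describe the same permissions: each group of three characters maps to one digit, with r=4, w=2 and x=1. The sketch below shows that mapping; the function name perm_to_octal is hypothetical, and special bits such as the sticky bit are not handled.

```shell
# Converts a permission string from `-ls` output (for example "-rwxrwx---")
# into its three-digit octal mode. Only plain r/w/x bits are handled;
# special bits such as the sticky bit ("t") are out of scope for this sketch.
perm_to_octal() {
    # Drop the leading type character, then score each group of three.
    printf '%s\n' "$1" | cut -c2-10 | awk '{
        mode = ""
        for (g = 0; g < 3; g++) {
            bits = substr($0, g * 3 + 1, 3)
            digit = 0
            if (substr(bits, 1, 1) == "r") digit += 4
            if (substr(bits, 2, 1) == "w") digit += 2
            if (substr(bits, 3, 1) == "x") digit += 1
            mode = mode digit
        }
        print mode
    }'
}

perm_to_octal '-rwxrwx---'   # prints 770
perm_to_octal '-rw-r--r--'   # prints 644
```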

  12. -chown command:

    The “-chown” command changes the owner (and optionally the group) of files and directories. The -R option applies the change recursively through a directory structure. Changing the owner requires HDFS superuser privileges.

    Syntax-
    $ hadoop fs -chown [-R] <new_owner>[:<new_group>] <file_or_directory>
    Example-
    $ hadoop fs -ls word1.txt
    -rwxrwx---  1 hduser supergroup   12 2022-04-12 11:00 word1.txt
    
    // Changing ownership of word1.txt file.
    $ hadoop fs -chown -R sumeet:supergroup /user/hduser/word1.txt
    
    $ hadoop fs -ls word1.txt
    -rwxrwx---  1 sumeet supergroup   12 2022-04-12 11:00 word1.txt
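
Instead of reading the owner from the -ls listing by eye, a script can query it with -stat. The following is a minimal sketch, assuming the hdfs client is on the PATH; the function name has_owner is hypothetical. In the -stat format string, %u is the owner and %g is the group.

```shell
# Succeeds when the HDFS path is owned by the expected "<owner>:<group>".
# `hdfs dfs -stat` prints file metadata; %u is the owner and %g the group.
has_owner() {
    target_path=$1
    expected=$2
    actual=$(hdfs dfs -stat '%u:%g' "$target_path") || return 1
    [ "$actual" = "$expected" ]
}
```

For example: has_owner /user/hduser/word1.txt sumeet:supergroup && echo "ownership updated"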

Published By : Sumeet Vishwakarma