Dear Gowtham,

Hope you are doing great.

Please take a look at the details below for your query, and please let us know if you have any concerns so I can help you out.

For the production environment, it depends on the company whether they use a commercial tool or an open-source alternative.

Minimum Prerequisites
 
Java 1.6 from Oracle, version 1.6 update 8 or later; identify your current JAVA_HOME
 
sshd and ssh for managing Hadoop daemons across multiple systems
 
rsync for file and directory synchronization across the nodes in the cluster
 
Create a service account for user hadoop where $HOME=/home/hadoop
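 
As a quick sketch, the prerequisites above can be verified and the service account created like this (the exact commands and paths are assumptions and may differ by distribution):

java -version                        # should report 1.6.0_08 or later
echo $JAVA_HOME                      # identify your current JAVA_HOME
which ssh rsync                      # confirm the ssh client and rsync are installed
ls /usr/sbin/sshd                    # sshd typically lives in /usr/sbin
useradd -m -d /home/hadoop hadoop    # run as root; creates the account with $HOME=/home/hadoop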
 
SSH Access
 
Every system in a Hadoop deployment must provide SSH access for data exchange between nodes.
 
Log in to the node as the Hadoop user and run the commands in Listing 1 to validate or create the required SSH configuration.
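
As a rough sketch (the exact commands in Listing 1 may differ), a typical passwordless SSH setup for the hadoop user looks like this:

ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa            # generate a key pair with an empty passphrase
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys     # authorize the key for logins
chmod 600 ~/.ssh/authorized_keys                    # SSH rejects keys with loose permissions
ssh localhost                                       # should now connect without a password prompt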

Submit Hadoop Jobs Interactively

In addition to adding steps to a cluster, you can connect to the master node using an SSH client or the AWS CLI and interactively submit Hadoop jobs. For example, you can use PuTTY to establish an SSH connection with the master node and submit interactive Hive queries which are compiled into one or more Hadoop jobs.

The following examples demonstrate interactively submitting Hadoop jobs and Hive jobs to the master node. The process for submitting jobs for other programming frameworks (such as Pig) is similar to these examples.

To submit Hadoop jobs interactively using the AWS CLI

You can submit Hadoop jobs interactively using the AWS CLI by establishing an SSH connection in the CLI command (using the ssh subcommand).

To copy a JAR file from your local Windows machine to the master node's file system, type the following command. Replace j-2A6HXXXXXXL7J with your cluster ID, replace mykey.ppk with the name of your key pair file, and replace myjar.jar with the name of your JAR file.

aws emr put --cluster-id j-2A6HXXXXXXL7J --key-pair-file "C:\Users\username\Desktop\Keys\mykey.ppk" --src "C:\Users\username\myjar.jar"

To create an SSH connection and submit the Hadoop job myjar.jar, type the following command.

aws emr ssh --cluster-id j-2A6HXXXXXXL7J --key-pair-file "C:\Users\username\Desktop\Keys\mykey.ppk" --command "hadoop jar myjar.jar"
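
A Hive job can be submitted over the same SSH connection. As a sketch, assuming a script named my_hive_script.hql already exists on the master node, type the following command.

aws emr ssh --cluster-id j-2A6HXXXXXXL7J --key-pair-file "C:\Users\username\Desktop\Keys\mykey.ppk" --command "hive -f my_hive_script.hql"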


FTP Tools: may be used by some companies, but mostly we will use an SSH connection.



Production environments are deployed across a group of machines that make up the compute cluster.
 
Hadoop must be configured to run in fully distributed, clustered mode.

Note: Linux is the supported platform for production systems. Windows is adequate as a development platform but is not supported for production.
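
As a rough sketch of the fully distributed setup, each node is pointed at the master's NameNode and JobTracker. The hostname master, the ports, and the Hadoop 1.x-era property names are assumptions (chosen to match the Java 1.6 prerequisite):

cat > $HADOOP_HOME/conf/core-site.xml <<'EOF'
<configuration>
  <property>
    <name>fs.default.name</name>   <!-- URI of the HDFS NameNode -->
    <value>hdfs://master:9000</value>
  </property>
</configuration>
EOF

cat > $HADOOP_HOME/conf/mapred-site.xml <<'EOF'
<configuration>
  <property>
    <name>mapred.job.tracker</name>   <!-- host:port of the JobTracker -->
    <value>master:9001</value>
  </property>
</configuration>
EOF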


How Hadoop Works in the Development Environment

In the development environment, Hadoop typically runs in standalone (local) mode or pseudo-distributed mode on a single machine, so jobs can be written and tested without a full cluster.
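
As a sketch, in standalone mode a job can be tested straight from the command line using the bundled examples jar (the jar file name varies by Hadoop version):

mkdir input
cp $HADOOP_HOME/conf/*.xml input      # use the config files as sample input
hadoop jar $HADOOP_HOME/hadoop-examples-*.jar grep input output 'dfs[a-z.]+'
cat output/*                          # view the matched lines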
 
 
 
I hope this solves your issue; please let me know if you need more details.
