Configuration files

Meta list of files edited

I have been editing a lot of files to set up software on EC2, so here is a short list to serve as a visual guide.

Hadoop

export HADOOP_HOME=/usr/local/hadoop
$HADOOP_HOME/etc/hadoop/hadoop-env.sh
$HADOOP_HOME/etc/hadoop/yarn-site.xml
$HADOOP_HOME/etc/hadoop/mapred-site.xml
$HADOOP_HOME/etc/hadoop/hdfs-site.xml     # storage location
$HADOOP_HOME/etc/hadoop/masters
$HADOOP_HOME/etc/hadoop/slaves
/etc/hosts
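
A quick way to confirm that the files in a list like this actually exist on a node is a small shell check. This is only a sketch: the function name check_conf_files is made up for illustration, and it simply reports any path that is not a regular file under the given home directory.

```shell
# Hypothetical helper (not part of Hadoop): report listed config files
# that are missing under a home directory.
# Usage: check_conf_files HOME_DIR relative/path [relative/path ...]
check_conf_files() {
    ccf_home=$1
    shift
    ccf_missing=0
    for ccf_rel in "$@"; do
        if [ ! -f "$ccf_home/$ccf_rel" ]; then
            echo "missing: $ccf_home/$ccf_rel"
            ccf_missing=$((ccf_missing + 1))
        fi
    done
    # Succeed only when nothing was missing.
    [ "$ccf_missing" -eq 0 ]
}

# Example call, assuming HADOOP_HOME is exported as in the list above:
# check_conf_files "$HADOOP_HOME" etc/hadoop/hadoop-env.sh \
#     etc/hadoop/yarn-site.xml etc/hadoop/mapred-site.xml \
#     etc/hadoop/hdfs-site.xml etc/hadoop/masters etc/hadoop/slaves
```

The same function should work for the other sections by passing a different home directory and file list.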

Spark

export SPARK_HOME=/usr/local/spark
$SPARK_HOME/conf/spark-env.sh
$SPARK_HOME/conf/slaves

ZooKeeper

/usr/local/zookeeper/conf/zoo.cfg

Kafka

export KAFKA_HOME=/usr/local/kafka
$KAFKA_HOME/config/server.properties
$KAFKA_HOME/bin/kafka-server-start.sh

Storm

export STORM_HOME=/usr/local/storm
$STORM_HOME/conf/storm.yaml               # identifies the ZooKeeper nodes

Start ZooKeeper first with /usr/local/zookeeper/bin/zkServer.sh <start|stop>, then on

  • Master: sudo $STORM_HOME/bin/storm nimbus &
    • UI: sudo $STORM_HOME/bin/storm ui &
  • Worker: sudo $STORM_HOME/bin/storm supervisor &
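
The start-up order above can be sketched as a dry-run helper that only prints the commands for a given role, so they can be reviewed before running anything. The function name storm_start_commands and the ZK_HOME variable are assumptions for illustration; in particular, it assumes zkServer.sh lives under the ZooKeeper install at /usr/local/zookeeper.

```shell
# Hypothetical dry-run helper: print the start commands for one node role.
# ZK_HOME is an assumed variable; these notes only give the zoo.cfg path.
storm_start_commands() {
    ssc_zk_home=${ZK_HOME:-/usr/local/zookeeper}
    ssc_storm_home=${STORM_HOME:-/usr/local/storm}
    case "$1" in
        zookeeper)
            # ZooKeeper must be up before any Storm daemon.
            echo "$ssc_zk_home/bin/zkServer.sh start"
            ;;
        master)
            echo "sudo $ssc_storm_home/bin/storm nimbus &"
            echo "sudo $ssc_storm_home/bin/storm ui &"
            ;;
        worker)
            echo "sudo $ssc_storm_home/bin/storm supervisor &"
            ;;
        *)
            echo "unknown role: $1" >&2
            return 1
            ;;
    esac
}
```

Running storm_start_commands zookeeper, then storm_start_commands master on the master node and storm_start_commands worker on each worker, prints the commands in the order described.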

Storm Topology

export MAVEN_HOME=/usr/local/maven
export PATH=$PATH:$MAVEN_HOME/bin
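
If these exports go into a profile file that gets sourced more than once, PATH can pick up repeated entries. A minimal sketch of a guard against that follows; path_append is a made-up helper name, and the default Maven location is taken from the line above.

```shell
# Hypothetical helper: append a directory to PATH only if it is not there yet.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already present, nothing to do
        *) PATH="$PATH:$1" ;;
    esac
}

MAVEN_HOME=${MAVEN_HOME:-/usr/local/maven}
path_append "$MAVEN_HOME/bin"
export MAVEN_HOME PATH
```

Calling path_append again with the same directory leaves PATH unchanged.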

Spark

export SPARK_HOME=/usr/local/spark
$SPARK_HOME/config/server.properties
$SPARK_HOME/bin/kafka-server-start.sh

Samza

export SAMZA_HOME=/usr/local/samza
$SAMZA_HOME/config/server.properties
$SAMZA_HOME/bin/kafka-server-start.sh

*Updated 9/21 to include Storm, Samza, and Spark.