Spark Tuning (4): Hadoop Tuning

 4 years ago
source link: https://www.tuicool.com/articles/iMBR7r6
HDFS Tuning

Data Locality

TODO

YARN Tuning

Speeding Up Application Startup

When Spark is launched on YARN with spark-shell --master yarn or spark-submit --master yarn, the JAR files under {SPARK_HOME}/jars are compressed into a ZIP archive and uploaded to the application's directory under /user/{user}/.sparkStaging on HDFS.

To avoid redistributing the JARs every time a Spark application starts, set spark.yarn.jars to the location of the JARs on HDFS.

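Copy the Spark dependency JARs to HDFS. Create /lib/spark first so that the local jars directory is copied into it and the files end up under /lib/spark/jars:

hdfs dfs -mkdir -p /lib/spark
hdfs dfs -copyFromLocal "$SPARK_HOME/jars" /lib/spark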

Edit the $SPARK_HOME/conf/spark-defaults.conf file:

spark.yarn.jars=hdfs://host:port/lib/spark/jars/*.jar
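
If the setting is applied by a script rather than by hand, appending blindly can leave duplicate entries behind. The following is a minimal sketch of an idempotent update; the function name set_spark_yarn_jars and its two arguments are hypothetical and not part of Spark, and the URI is the same placeholder used above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Spark): set spark.yarn.jars in a
# spark-defaults.conf file, replacing any earlier entry for the same key.
set_spark_yarn_jars() {
    conf_file="$1"   # path to spark-defaults.conf
    jars_uri="$2"    # e.g. hdfs://host:port/lib/spark/jars/*.jar

    # Make sure the file exists so grep has something to read.
    touch "$conf_file"

    # Keep every line except an existing spark.yarn.jars entry.
    # grep exits non-zero when it selects no lines, hence "|| true".
    grep -v '^spark\.yarn\.jars[[:space:]=]' "$conf_file" > "$conf_file.tmp" || true

    # Append the new value and swap the file into place.
    printf 'spark.yarn.jars=%s\n' "$jars_uri" >> "$conf_file.tmp"
    mv "$conf_file.tmp" "$conf_file"
}
```

It could be called as set_spark_yarn_jars "$SPARK_HOME/conf/spark-defaults.conf" 'hdfs://host:port/lib/spark/jars/*.jar'. The single quotes keep the shell from expanding the glob locally. For a one-off job, the same property can instead be passed on the command line with spark-submit --conf 'spark.yarn.jars=hdfs://host:port/lib/spark/jars/*.jar'.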
