Configuring notebooks for Spark

Notebook programs such as Zeppelin and Jupyter overwrite spark.driver.extraJavaOptions, which prevents Unravel from capturing data from the apps they run. To ensure the data is loaded into Unravel, you must set the following properties in your notebook program.

  • SPARK_HOME: The complete path to your Spark library.

  • spark.driver.extraJavaOptions: Spark driver options.

  • spark.executor.extraJavaOptions: Spark executor options.

For example,

  • SPARK_HOME: /opt/cloudera/parcels/CDH-6.3.0-1.cdh6.3.0.p0.1279813/lib/spark

  • spark.driver.extraJavaOptions: -Dfile.encoding=UTF-8 -Dlog4j.configuration=file:///root/zeppelin-0.8.2-bin-all/conf/log4j.properties -Dzeppelin.log.file=/root/zeppelin-0.8.2-bin-all/logs/zeppelin-interpreter-spark-root-tnode113.unraveldata.com.log -javaagent:unravel-agent-pack-bin.zip/btrace-agent.jar=libs=spark-$SPARK_VERSION,config=driver

  • spark.executor.extraJavaOptions: -javaagent:unravel-agent-pack-bin.zip/btrace-agent.jar=libs=spark-$SPARK_VERSION,config=executor
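
Because the notebook's own driver options must be kept alongside the Unravel agent flag, it can help to build the property values programmatically. The following is a minimal Python sketch of that idea, not an Unravel or Spark API. The helper names (unravel_agent_option, with_unravel_agent, notebook_spark_properties) are hypothetical, and the version string "2.4" is only an assumed stand-in for $SPARK_VERSION; use the value your Unravel installation expects.

```python
# Agent path exactly as it appears in the example properties above.
AGENT_JAR = "unravel-agent-pack-bin.zip/btrace-agent.jar"


def unravel_agent_option(spark_version: str, role: str) -> str:
    """Build the -javaagent flag for the given role ("driver" or "executor")."""
    if role not in ("driver", "executor"):
        raise ValueError(f"role must be 'driver' or 'executor', got {role!r}")
    return f"-javaagent:{AGENT_JAR}=libs=spark-{spark_version},config={role}"


def with_unravel_agent(existing: str, spark_version: str, role: str) -> str:
    """Append the agent flag to existing JVM options without dropping them."""
    agent = unravel_agent_option(spark_version, role)
    if agent in existing.split():
        # Flag already present: return the options unchanged.
        return existing
    return f"{existing} {agent}".strip()


def notebook_spark_properties(spark_home: str, spark_version: str,
                              driver_opts: str = "") -> dict:
    """Assemble the three properties listed above as a plain dictionary."""
    return {
        "SPARK_HOME": spark_home,
        "spark.driver.extraJavaOptions":
            with_unravel_agent(driver_opts, spark_version, "driver"),
        "spark.executor.extraJavaOptions":
            with_unravel_agent("", spark_version, "executor"),
    }


# Example values: the SPARK_HOME path comes from the example above;
# "2.4" is an assumed placeholder for $SPARK_VERSION.
props = notebook_spark_properties(
    "/opt/cloudera/parcels/CDH-6.3.0-1.cdh6.3.0.p0.1279813/lib/spark",
    "2.4",
    driver_opts="-Dfile.encoding=UTF-8",
)
for name, value in props.items():
    print(f"{name}: {value}")
```

The printed values can then be copied into the notebook program's Spark settings. Note that in client mode Spark reads the driver's Java options when the driver JVM starts, so these values generally need to be in place before the notebook launches its Spark session rather than being set from inside a running one.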