Spark workflow fails because it cannot access a Hive table
This is an odd issue: the same Spark program fails to access a Hive table when it is scheduled by Oozie, but runs fine when launched manually with spark-submit.
Solution
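Add hive-site.xml to the Spark workflow so that the Hive context is initialized correctly. The likely cause is that the job launched by Oozie does not see the cluster's Hive configuration, whereas a manual spark-submit on a cluster node does.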
Put hive-site.xml into HDFS
On one of the cluster nodes, find the Hive configuration file ‘hive-site.xml’ in /etc/hive/conf and copy it to a location on HDFS.
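The copy step can be sketched as a small Python wrapper around the standard `hdfs dfs` commands. This is only an illustration: the target directory `/user/myuser/oozie/conf` and the helper names `build_commands` and `upload` are hypothetical, and it assumes the `hdfs` client is on the PATH of the node where it runs.

```python
import posixpath
import subprocess

# Source path as described above; the HDFS directory is a hypothetical example.
LOCAL_HIVE_SITE = "/etc/hive/conf/hive-site.xml"
HDFS_TARGET_DIR = "/user/myuser/oozie/conf"


def build_commands(local_path, hdfs_dir):
    """Return the `hdfs dfs` commands (as argument lists) for the copy."""
    return [
        # Create the target directory if it does not exist yet.
        ["hdfs", "dfs", "-mkdir", "-p", hdfs_dir],
        # Upload the file, overwriting any older copy (-f).
        ["hdfs", "dfs", "-put", "-f", local_path, hdfs_dir + "/"],
    ]


def upload(local_path=LOCAL_HIVE_SITE, hdfs_dir=HDFS_TARGET_DIR):
    """Run the commands and return the resulting HDFS path.

    Raises subprocess.CalledProcessError if a command fails.
    """
    for cmd in build_commands(local_path, hdfs_dir):
        subprocess.run(cmd, check=True)
    return posixpath.join(hdfs_dir, posixpath.basename(local_path))
```

Calling `upload()` on a cluster node returns the HDFS path to enter in the workflow editor in the next step. Running the two `hdfs dfs` commands by hand works just as well.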
Add hive-site.xml as one of the “FILES” of the Spark workflow
In the Hue workflow editor, click the plus sign next to “FILES” to add a new “FILE” element, then enter the HDFS path of the ‘hive-site.xml’ you just copied.
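To double-check the result, the generated workflow definition can be inspected. The sketch below assumes that a “FILE” entry in Hue ends up as a `<file>` element inside the Spark action of the Oozie workflow.xml. The sample workflow, the job class, the jar name, and the HDFS path are all made up for illustration, and `hive_site_files` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Hypothetical workflow definition; names and paths are examples only.
SAMPLE_WORKFLOW = """<workflow-app name="spark-hive-example" xmlns="uri:oozie:workflow:0.5">
  <start to="spark-node"/>
  <action name="spark-node">
    <spark xmlns="uri:oozie:spark-action:0.2">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <master>yarn</master>
      <mode>cluster</mode>
      <name>ExampleJob</name>
      <class>com.example.ExampleJob</class>
      <jar>example-job.jar</jar>
      <file>/user/myuser/oozie/conf/hive-site.xml#hive-site.xml</file>
    </spark>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail"><message>Spark action failed</message></kill>
  <end name="end"/>
</workflow-app>"""


def hive_site_files(workflow_xml):
    """Return the paths of all <file> elements that point at hive-site.xml."""
    root = ET.fromstring(workflow_xml)
    found = []
    for elem in root.iter():
        # Drop the XML namespace, e.g. "{uri:oozie:spark-action:0.2}file".
        local_name = elem.tag.rsplit("}", 1)[-1]
        # Drop an optional "#link-name" suffix after the path.
        path = (elem.text or "").strip().split("#", 1)[0]
        if local_name == "file" and path.endswith("hive-site.xml"):
            found.append(path)
    return found


print(hive_site_files(SAMPLE_WORKFLOW))
```

An empty list for your own workflow.xml would suggest that the “FILE” entry was not saved.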