
Posts

Currently showing posts from March, 2019

Spark No suitable driver found for jdbc

The famous driver-not-found issue

I was recently asked to export a subset of data out of our big data platform to MySQL. After some careful consideration, I decided to code the logic in Spark with the DataFrameWriter.jdbc() interface. Everything went well until the integration test, in which I had to install my Spark app for a real run:

    java.sql.SQLException: No suitable driver found for jdbc:mysql://bla:bla/bla

Background 1: I do follow the Spark JDBC guide

I actually simulated the job by following DATABRICKS: Connecting to SQL Databases using JDBC in my own Zeppelin notebook, and everything went well.

Background 2: I do follow the MySQL JDBC guide

I followed the coding sample from the official MySQL Connector/J documentation, 6.1 Connecting to MySQL Using the JDBC DriverManager Interface:

    // The newInstance() call is a work around for some
    // broken Java implementations
    Class.forName("com.mysql.jdbc.Driver").newInstance();

Background 3: I do attach the jar dependency

I did a…
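The excerpt cuts off before the resolution, but since the error comes from DriverManager failing to locate a driver for the jdbc:mysql URL at runtime, a minimal sketch of the usual workaround is to name the driver class explicitly in the connection properties passed to DataFrameWriter.jdbc(). The URL, table names, and credentials below are placeholders, and this illustrates the general technique rather than the post's actual fix:

    import java.util.Properties
    import org.apache.spark.sql.SparkSession

    object ExportToMysql {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("export-to-mysql").getOrCreate()

        // Placeholder source: any DataFrame pulled from the big data platform.
        val df = spark.table("some_hive_db.some_table")

        val props = new Properties()
        props.put("user", "bla")        // placeholder credentials
        props.put("password", "bla")
        // Naming the driver class explicitly avoids relying on DriverManager's
        // auto-discovery, a common cause of "No suitable driver found" when the
        // connector jar is only added to the classpath at submit time.
        props.put("driver", "com.mysql.jdbc.Driver")

        df.write
          .mode("append")
          .jdbc("jdbc:mysql://bla:3306/bla", "target_table", props)

        spark.stop()
      }
    }

Even with the driver option set, the MySQL connector jar still has to reach the JVMs, for example via spark-submit --jars (and, depending on the deployment, --driver-class-path); how the post itself ends up attaching the jar is cut off in this excerpt.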

Mysterious sqoop1 export options

Ever wanted to export periodically generated Hive data to an RDBMS? There may be several solutions available, depending on the data size and update interval. But if you decide to use sqoop1 to export the Hive table, don't follow those handcrafted solution suggestions; try the sqoop1 internal one.

The handcraft solution: copy, export, delete

As the title says, this kind of solution mainly consists of 3 steps:

1. Copy the data out of the warehouse to a general HDFS path
2. Run the sqoop command against the copied HDFS path
3. Delete the copied HDFS path

The sqoop1 internal solution: HCatalog arguments

Run sqoop-export --help and pay attention to the HCatalog arguments section:

    HCatalog arguments:
      --hcatalog-database <arg>                    HCatalog database name
      --hcatalog-home <hdir>                       Override $HCAT_HOME
      --hcatalog-partition-keys <partition-key>    Sets the partition keys to use when importin…
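For concreteness, a hedged sketch of what an HCatalog-based export command might look like; the connection string, credentials, and database/table names are placeholders, not taken from the post:

    sqoop export \
      --connect jdbc:mysql://bla:3306/bla \
      --username bla \
      --password-file /user/someone/mysql.password \
      --table target_table \
      --hcatalog-database some_hive_db \
      --hcatalog-table some_hive_table

With the --hcatalog-* arguments, sqoop reads the Hive table (including its partitions) through HCatalog directly, so there is no staging copy to a general HDFS path and no cleanup step afterwards.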