Ever want to export periodically generated Hive data to an RDBMS?
There are several possible solutions, depending on the data size and update interval. But if you decide to use sqoop1 to export a **Hive table**, don't follow the handcrafted solutions suggested around the web; use sqoop1's built-in support instead.
The handcrafted solution: copy, export, delete
As the title says, this kind of solution consists of three steps (sketched below):
- Copy data out from warehouse to general HDFS path
- Run sqoop command against the copied HDFS path
- Delete the copied HDFS path
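A minimal sketch of this three-step approach is shown here; the warehouse path, partition spec, JDBC URL, credentials, and table names are placeholder assumptions, not values from a real cluster.

```bash
# 1. Copy one partition's data out of the Hive warehouse to a staging path
hdfs dfs -mkdir -p /tmp/export_staging
hdfs dfs -cp /user/hive/warehouse/mydb.db/events/dt=2019-01-01/* /tmp/export_staging/

# 2. Run sqoop-export against the copied HDFS path
sqoop-export \
  --connect jdbc:mysql://db-host:3306/report \
  --username reporter \
  --password-file /user/etl/.db_password \
  --table events_daily \
  --export-dir /tmp/export_staging \
  --input-fields-terminated-by '\001'

# 3. Delete the staging path
hdfs dfs -rm -r -skipTrash /tmp/export_staging
```

It works, but every export has to shuffle data around in HDFS and clean up afterwards, which is exactly the busywork the HCatalog arguments below make unnecessary.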
The sqoop1 internal solution: HCatalog arguments
Run `sqoop-export --help` and pay attention to the **HCatalog arguments** section:
    HCatalog arguments:
       --hcatalog-database <arg>                      HCatalog database name
       --hcatalog-home <hdir>                         Override $HCAT_HOME
       --hcatalog-partition-keys <partition-key>      Sets the partition keys to
                                                      use when importing to hive
       --hcatalog-partition-values <partition-value>  Sets the partition values to
                                                      use when importing to hive
       --hcatalog-table <arg>                         HCatalog table name
       --hive-home <dir>                              Override $HIVE_HOME
       --hive-partition-key <partition-key>           Sets the partition key to
                                                      use when importing to hive
       --hive-partition-value <partition-value>       Sets the partition value to
                                                      use when importing to hive
       --map-column-hive <arg>                        Override mapping for
                                                      specific column to hive
                                                      types.
Note the difference between `--hcatalog-partition-keys`/`--hcatalog-partition-values` (plural) and `--hive-partition-key`/`--hive-partition-value` (singular).
For the plural options, specify the partition keys and the partition values as comma-separated lists.
Even though their help strings wrongly describe them as **import** options, these arguments do the partition filtering job very well when exporting.
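Here is a minimal sketch of an HCatalog-based export that filters on two partition keys; the JDBC URL, credentials, database, table, and partition values are placeholder assumptions.

```bash
# Export only the dt=2019-01-01, country=US partition of mydb.events,
# reading directly through HCatalog -- no copy to a staging path needed.
sqoop-export \
  --connect jdbc:mysql://db-host:3306/report \
  --username reporter \
  --password-file /user/etl/.db_password \
  --table events_daily \
  --hcatalog-database mydb \
  --hcatalog-table events \
  --hcatalog-partition-keys dt,country \
  --hcatalog-partition-values 2019-01-01,US
```

Because sqoop1 reads the table through HCatalog, it picks up the column types and storage format from the Hive metastore, and there is nothing to copy or clean up afterwards.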