
Posts

Showing posts with the label "python"

Working around a Jython runtime bug via interop with Java's java.time

TypeError: unsupported operand type(s) for +: 'java.sql.Timestamp' and 'timedelta'

I was working on an Apache NiFi script recently. As the following code snippet shows, I was trying to construct a series of timestamped strings with Python's datetime:

import os.path
from datetime import datetime, time, timedelta
import subprocess

now = datetime.now()
dirs = [(now + timedelta(milliseconds=(-100 * x))).strftime('%Y%m%d%H/%M/%S/%f')[0:-5]
        for x in range(0, 100)]
fullpaths = ['/workspace/nifi_output/{}'.format(d) for d in dirs]
existspaths = [fp for fp in fullpaths if os.path.exists(fp)]

flowFile = session.get()
if flowFile is None:
    flowFile = session.create()
ep_length = len(existspaths)
if ep_length > 1:
    session.putAttribute(flowFile, 'batch_path', existspaths[1])
    session.transfer(flowFile, REL_SUCCESS)
elif ep_length == 1:
    session.putAttribute(flowFile, 'batch_path', existspaths[0])
    sessi...
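The excerpt above is cut off before it reaches the workaround, but the title hints at the idea: do the date arithmetic on the Java side instead of with Python's timedelta. Below is a minimal sketch of that kind of interop, not taken from the truncated post; shift_timestamp_millis is a hypothetical helper, and it relies on the standard Java 8 APIs java.sql.Timestamp.toInstant() and java.time.Instant.plusMillis(), both callable directly from Jython.

# A minimal sketch of java.time interop from Jython (assumption: the
# original post's workaround may differ in detail).
from java.sql import Timestamp

def shift_timestamp_millis(ts, millis):
    # ts is a java.sql.Timestamp, e.g. handed to the script by NiFi/JDBC.
    # toInstant() converts it to a java.time.Instant, whose plusMillis()
    # does the arithmetic that '+ timedelta' cannot.
    shifted = ts.toInstant().plusMillis(millis)
    # Wrap back into a java.sql.Timestamp via its millisecond constructor
    # (millisecond precision is enough for this use case).
    return Timestamp(shifted.toEpochMilli())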

Working with case-insensitive JSON in Python

We want the JSON keys to be case-insensitive. Okay, I hope this is not your case, as it is a weird way to work. But if you have to, or if you must, the solution is: pick one specific case, upper or lower, and blindly convert all keys to that case. To do this in Python, when you load a JSON file/string you can pass a hook through the object_pairs_hook parameter, as below:

import json
import pathlib

my_file = pathlib.Path.cwd().joinpath('my_example.json')

def lower_case_keys(tuple_list):
    return dict([(k.lower(), v) for k, v in tuple_list])

json.loads(my_file.read_text(encoding='UTF-8'), object_pairs_hook=lower_case_keys)

From here on, all your JSON operations should use lower-cased keys.
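To make the effect concrete, here is a small usage sketch; the JSON string and its keys are made up for illustration:

import json

def lower_case_keys(pairs):
    return dict([(k.lower(), v) for k, v in pairs])

data = json.loads('{"UserName": "alice", "AGE": 30}',
                  object_pairs_hook=lower_case_keys)
print(data['username'])  # alice -- the original casing no longer matters
print(data['age'])       # 30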

Exchanging data between Zeppelin %pyspark and %spark sessions

Problem: DataFrames are not shared? As of Zeppelin 0.7.0, Spark DataFrames are not shared between the %pyspark (Python) and %spark (Scala) sessions.

Solution: exchange them through a temporary table. Do the following:

#%pyspark
somedf.registerTempTable("somedftable")

and then rebuild the DataFrame in the Scala session:

//%spark
val somedf = sqlContext.table("somedftable")
z.show(somedf.limit(20))
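The same trick works in the other direction too. A sketch, under the assumption that "somedftable" was registered from the %spark side with registerTempTable:

#%pyspark
# Rebuild in Python a DataFrame that the Scala session registered.
somedf = sqlContext.table("somedftable")
somedf.show(20)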