Access DB2 From Databricks
This took me a good few hours to figure out, so hopefully it will help you and my future self.
- Install com.ibm.db2.jcc:db2jcc:db2jcc4 on your cluster from Maven.
- Get your license file dir (this is a whole process in itself).
- From your license info, copy the jar file (mine is like db2jcc*.jar) up to Databricks using databricks-cli.
  - I copied them to a tmp dir and then moved them to /dbfs/FileStore/jars/maven/com/ibm/db2/jcc/license from a notebook, but that might not be necessary.
  - You might also have to copy the .lic files into the same dir, but, again, I haven't validated that.
- Install that jar on your cluster as a library.
- Restart your cluster.
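The "copy to a tmp dir, then move from a notebook" step could be sketched roughly as below. This is only an illustration, not the exact commands used in the original setup: the helper name move_license_files, the default file patterns, and the tmp upload path in the example call are all assumptions, and it relies on the /dbfs mount being reachable through ordinary Python file APIs on the cluster.

```python
import shutil
from pathlib import Path


def move_license_files(src_dir, dest_dir, patterns=("db2jcc*.jar", "*.lic")):
    """Move files matching the patterns from src_dir into dest_dir.

    Hypothetical helper: the patterns are guesses based on the file names
    mentioned in the steps (a db2jcc*.jar license jar and .lic files).
    Returns the list of destination paths that were written.
    """
    src = Path(src_dir)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)  # create the target dir if it is missing
    moved = []
    for pattern in patterns:
        for path in sorted(src.glob(pattern)):
            target = dest / path.name
            shutil.move(str(path), str(target))
            moved.append(target)
    return moved


# Example call from a notebook (the source path is a hypothetical upload location):
# move_license_files(
#     "/dbfs/tmp/db2_license",
#     "/dbfs/FileStore/jars/maven/com/ibm/db2/jcc/license",
# )
```

Files that match none of the patterns stay where they are, so the tmp dir can hold other things without them being dragged along.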
Then you can run this (Python) code:
# host, port, database, default_schema, username and password must be defined beforehand
connection_string = 'jdbc:db2://{host}:{port}/{database}:currentSchema={schema};database={database};user={username};password={password};'.format(
    host=host,
    port=port,
    schema=default_schema,
    database=database,
    username=username,
    password=password
)

# spark.read returns a DataFrame (not an RDD), hence the name df
df = spark.read.format("jdbc") \
    .option('url', connection_string) \
    .option('driver', 'com.ibm.db2.jcc.DB2Driver') \
    .option('dbtable', 'my_table') \
    .load()

display(df)
Hurrah!
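If you would rather not embed the username and password in the URL, one possible variant is to pass them through Spark's generic JDBC user and password options instead. The sketch below only builds the option dictionary; the helper names build_db2_url and db2_reader_options are made up for illustration, and this variant has not been validated against the setup described above.

```python
def build_db2_url(host, port, database, schema):
    """Build the DB2 JDBC URL from the post, minus the credentials."""
    return "jdbc:db2://{host}:{port}/{database}:currentSchema={schema};database={database};".format(
        host=host,
        port=port,
        database=database,
        schema=schema,
    )


def db2_reader_options(host, port, database, schema, username, password, table):
    """Collect the options for spark.read.format("jdbc") in one dict.

    Assumption: credentials go in Spark's "user" and "password" JDBC
    options rather than in the URL itself.
    """
    return {
        "url": build_db2_url(host, port, database, schema),
        "driver": "com.ibm.db2.jcc.DB2Driver",
        "dbtable": table,
        "user": username,
        "password": password,
    }


# On the cluster, the dict could be used like this:
# opts = db2_reader_options(host, port, database, default_schema,
#                           username, password, "my_table")
# df = spark.read.format("jdbc").options(**opts).load()
# display(df)
```

Keeping the credentials out of the URL means the connection string can be printed or logged while debugging without leaking the password.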