SQLITE_ERROR: Connection is closed when connecting from Spark via JDBC to SQLite database
I am using Apache Spark 1.5.1 and trying to connect to a local SQLite database named clinton.db. Creating a data frame from a table of the database works fine, but as soon as I run some operation on the created object, I get the error below, which says "SQL error or missing database (Connection is closed)". The funny thing is that I get the result of the operation nevertheless. Any idea how I can solve the problem, i.e., avoid the error?
Start command for the spark-shell:
../spark/bin/spark-shell --master local[8] --jars ../libraries/sqlite-jdbc-3.8.11.1.jar --classpath ../libraries/sqlite-jdbc-3.8.11.1.jar
Reading the database:
val emails = sqlContext.read.format("jdbc").options(Map("url" -> "jdbc:sqlite:../data/clinton.sqlite", "dbtable" -> "emails")).load()
A simple count (this is where it fails):
emails.count
The error:
15/09/30 09:06:39 WARN JDBCRDD: Exception closing statement
java.sql.SQLException: [SQLITE_ERROR] SQL error or missing database (Connection is closed)
    at org.sqlite.core.DB.newSQLException(DB.java:890)
    at org.sqlite.core.CoreStatement.internalClose(CoreStatement.java:109)
    at org.sqlite.jdbc3.JDBC3Statement.close(JDBC3Statement.java:35)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$$anon$1.org$apache$spark$sql$execution$datasources$jdbc$JDBCRDD$$anon$$close(JDBCRDD.scala:454)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$$anon$1$$anonfun$8.apply(JDBCRDD.scala:358)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$$anon$1$$anonfun$8.apply(JDBCRDD.scala:358)
    at org.apache.spark.TaskContextImpl$$anon$1.onTaskCompletion(TaskContextImpl.scala:60)
    at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:79)
    at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:77)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
    at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:77)
    at org.apache.spark.scheduler.Task.run(Task.scala:90)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

res1: Long = 7945
I got the same error today, and the important part is the line right before the exception:
15/11/30 12:13:02 INFO jdbc.JDBCRDD: closed connection
15/11/30 12:13:02 WARN jdbc.JDBCRDD: Exception closing statement
java.sql.SQLException: [SQLITE_ERROR] SQL error or missing database (Connection is closed)
    at org.sqlite.core.DB.newSQLException(DB.java:890)
    at org.sqlite.core.CoreStatement.internalClose(CoreStatement.java:109)
    at org.sqlite.jdbc3.JDBC3Statement.close(JDBC3Statement.java:35)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$$anon$1.org$apache$spark$sql$execution$datasources$jdbc$JDBCRDD$$anon$$close(JDBCRDD.scala:454)
So Spark succeeded in closing the JDBC connection, and then it fails while closing the JDBC statement.
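You can reproduce this behaviour outside Spark with plain JDBC. The sketch below uses the JDBC URL and table name from the question (treat them as placeholders): it performs an orderly close first and then closes the statement a second time after the connection is gone, which is the same order of events the duplicated close() in JDBCRDD ends up producing:

import java.sql.DriverManager

object ClosedStatementRepro {
  def main(args: Array[String]): Unit = {
    val conn = DriverManager.getConnection("jdbc:sqlite:../data/clinton.sqlite")
    val stmt = conn.createStatement()
    val rs = stmt.executeQuery("SELECT count(*) FROM emails")
    while (rs.next()) println(rs.getLong(1))

    // Orderly shutdown, like the first close() call in JDBCRDD:
    rs.close()
    stmt.close()
    conn.close()

    // Second statement close, like the close() triggered by the task-completion
    // listener. With sqlite-jdbc 3.8.11.1 this is expected to raise
    // "[SQLITE_ERROR] SQL error or missing database (Connection is closed)".
    stmt.close()
  }
}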
Looking at the source, close() is called twice:
Line 358 (org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD, Spark 1.5.1):
context.addTaskCompletionListener{ context => close() }
Line 469:
override def hasNext: Boolean = {
  if (!finished) {
    if (!gotNext) {
      nextValue = getNext()
      if (finished) {
        close()
      }
      gotNext = true
    }
  }
  !finished
}
And if you look at the close() method itself (line 443):

def close() {
  if (closed) return
you can see that it checks the variable closed, but its value is never set to true.
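Presumably the missing piece is simply setting closed = true at the end of close(), so that the second invocation returns immediately instead of touching the already-closed statement. A simplified sketch of that shape (field names rs, stmt and conn as in JDBCRDD; the real method wraps each close in its own error handling), not the actual patch:

def close() {
  if (closed) return            // the second call should stop here
  if (rs != null) rs.close()
  if (stmt != null) stmt.close()
  if (conn != null) conn.close()
  logInfo("closed connection")
  closed = true                 // missing in 1.5.1, so the guard above never triggers
}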
If I see it correctly, this bug is still present in master. I have filed a bug report.
Source: JDBCRDD.scala (line numbers differ slightly there)