java - Sequence File in Spark - Read And Write Custom
We had a data pipeline that was originally built on Hadoop, and we are trying to port the application to Spark. In our pipeline, each stage wrote its output to a sequence file, which was then passed on to the next stage. We have custom classes, written for Hadoop, that implement the Writable interface to store this data.

If I try to use one of these classes in Spark by creating an object and saving it as a sequence file, I get errors like:

Text/IntWritable or other Writable class is not serializable.

Is there a way to save sequence files in Spark using these custom classes? The classes already exist in Java and I don't want to modify them. A sample example:
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.io.Serializable;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;

public class ABC implements Writable, Serializable {

    private Text requestId;
    private Text requestType;

    // constructor and other methods

    @Override
    public void write(DataOutput out) throws IOException {
        requestId.write(out);
        requestType.write(out);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        requestId.readFields(in);
        requestType.readFields(in);
    }
}
It gives the error that the Text object is not serializable.
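For illustration, here is a minimal Scala sketch of the kind of call that surfaces the error. It assumes a SparkContext named sc and a hypothetical two-argument constructor on ABC; the output path is a placeholder.

import org.apache.hadoop.io.Text

val records = Seq(
  (new Text("1"), new ABC(new Text("1"), new Text("GET")))  // hypothetical constructor
)
// parallelize() ships the partition data to the executors using Java
// serialization, and Text does not implement java.io.Serializable, so this
// fails with something like:
//   java.io.NotSerializableException: org.apache.hadoop.io.Text
sc.parallelize(records).saveAsSequenceFile("/out/stage1")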
You need to make your custom class both Writable and Serializable, e.g.:
import java.io.{DataInput, DataOutput}
import org.apache.hadoop.io.{Text, Writable}

class MyText(var string: String) extends Writable with Serializable {

  // Writables need a no-arg constructor so Hadoop can instantiate them on read
  def this() = this("empty")

  override def write(out: DataOutput): Unit = {
    Text.writeString(out, string)
  }

  override def readFields(in: DataInput): Unit = {
    string = Text.readString(in)
  }
}
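As a rough usage sketch (assuming a SparkContext named sc and placeholder paths), writing and reading the class back could look like this. Note that the Writables are constructed inside map(), i.e. on the executors, so nothing non-serializable has to cross the driver/executor boundary:

import org.apache.hadoop.io.NullWritable

val out = sc.parallelize(Seq("a", "b", "c"))
  .map(s => (NullWritable.get(), new MyText(s)))   // build Writables on the executors
out.saveAsSequenceFile("/tmp/mytext-seq")

val back = sc.sequenceFile("/tmp/mytext-seq", classOf[NullWritable], classOf[MyText])
  .map { case (_, v) => v.string }   // copy the value out of the reused instance
  .collect()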
If possible, move away from sequence files altogether and switch to Parquet, for example. Sequence files have their issues; in Scala, for instance, records read from a sequence file are not immutable, so you can end up with the same value object repeated throughout the result of a collect(). See the related JIRA ticket.
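As a hedged sketch of the Parquet route (assuming a SparkSession named spark; RequestRecord is a hypothetical case class mirroring the Writable's fields, and the path is a placeholder):

case class RequestRecord(requestId: String, requestType: String)

import spark.implicits._

Seq(RequestRecord("1", "GET"), RequestRecord("2", "PUT"))
  .toDF()
  .write
  .parquet("/tmp/requests.parquet")

val df = spark.read.parquet("/tmp/requests.parquet")

Because Parquet rows are decoded into plain JVM objects on read, the mutable-Writable reuse problem described above does not apply.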
If you have a Java class that implements Writable, try creating a new class that inherits from your custom class and implements Serializable, e.g.:
// The primary constructor implicitly invokes the Java superclass's
// no-arg constructor, so no explicit constructor is required.
class MyWritableAndSerializable extends MyCustomJavaWritable with Serializable
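A hypothetical usage sketch, assuming sc is a SparkContext and that MyCustomJavaWritable has a no-arg constructor and a setRequestId setter (both placeholders for whatever the real Java class exposes):

import org.apache.hadoop.io.{NullWritable, Text}

val rdd = sc.parallelize(Seq("req-1", "req-2")).map { id =>
  val w = new MyWritableAndSerializable()
  w.setRequestId(new Text(id))   // placeholder setter on the Java class
  (NullWritable.get(), w)
}
rdd.saveAsSequenceFile("/tmp/wrapped-seq")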