java - Sequence File in Spark - Read And Write Custom -


We had a data pipeline built earlier on Hadoop, and we are trying to port our application to Spark.

In our data pipeline, each stage wrote its output as a sequence file that was passed to the next stage. We have custom classes, written for Hadoop, that implement the Writable interface to store this data.

If we try to use them in Spark, creating objects of the class and saving them as a sequence file, we get errors like:

Text/IntWritable or other Writable class is not serializable. Is there a way to save a sequence file in Spark using these custom classes?

The classes are written in Java and we don't want to modify them. A sample example:

public class ABC implements Writable, Serializable {
    private Text requestId;
    private Text requestType;

    // constructor and other methods

    @Override
    public void write(DataOutput out) throws IOException {
        requestId.write(out);
        requestType.write(out);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        requestId.readFields(in);
        requestType.readFields(in);
    }
}
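For context, the write path that produces the error looks roughly like this. This is a hedged sketch, not the asker's actual code: the RDD contents, the two-argument ABC constructor, the key type, and the output path are all assumptions.

```scala
import org.apache.hadoop.io.Text
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical driver code: build a pair RDD of (Text, ABC) and save it
// as a Hadoop sequence file. Spark ships the ABC instances between driver
// and executors using Java serialization, which is where the
// "object not serializable" error is thrown if ABC, or the Text fields
// inside it, are not Serializable.
object WriteSeqFile {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("seqfile-demo"))

    val records = sc.parallelize(Seq(
      (new Text("req-1"), new ABC(new Text("req-1"), new Text("GET"))) // assumed constructor
    ))

    // Fails with java.io.NotSerializableException: org.apache.hadoop.io.Text
    // unless every class in the record graph is Serializable.
    records.saveAsSequenceFile("hdfs:///tmp/stage1-output") // illustrative path
  }
}
```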

It gives the error that the Text object is not serializable.

You need to make your custom class both Writable and Serializable, e.g.:

class MyText(var string: String) extends Writable with Serializable {

  def this() = this("empty")

  override def write(out: DataOutput): Unit = {
    Text.writeString(out, string)
  }

  override def readFields(in: DataInput): Unit = {
    string = Text.readString(in)
  }
}
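With both traits in place, an RDD of MyText values can be written and read back as a sequence file. A minimal sketch, assuming a local SparkContext and an illustrative output path:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object MyTextDemo {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("mytext-demo"))

    // MyText is Serializable, so Spark can ship it between driver and
    // executors, and Writable, so the SequenceFile format can encode it.
    val rdd = sc.parallelize(Seq(("a", new MyText("first")), ("b", new MyText("second"))))
    rdd.saveAsSequenceFile("/tmp/mytext-seq") // illustrative path

    // Reading back: sequenceFile materializes keys and values as Writables.
    val loaded = sc.sequenceFile[String, MyText]("/tmp/mytext-seq")
    loaded.foreach(pair => println(pair._2.string))
  }
}
```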

If possible, move away from sequence files and switch to Parquet, for example. There are known issues with sequence files; in Scala, for instance, the value objects read from a sequence file are reused rather than immutable, so you can end up with the same value object repeated in the results of collect. See the JIRA ticket.
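A minimal sketch of the Parquet route, assuming Spark's DataFrame/Dataset API (2.0+) and a case class standing in for the Writable-based record; the Request type and paths here are illustrative, not from the original post:

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical record type replacing the Writable-based class.
case class Request(requestId: String, requestType: String)

object ParquetDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("parquet-demo").getOrCreate()
    import spark.implicits._

    val ds = Seq(Request("req-1", "GET"), Request("req-2", "PUT")).toDS()

    // Parquet stores the schema alongside the data, and rows come back as
    // plain immutable case class instances -- no Writable object reuse.
    ds.write.parquet("/tmp/requests.parquet") // illustrative path
    val back = spark.read.parquet("/tmp/requests.parquet").as[Request]
  }
}
```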

If you have a Java class that implements Writable, try creating a new class that inherits from your custom class and implements Serializable, e.g.:

class MyWritableAndSerializable extends MyCustomJavaWritable with Serializable {
  // The primary constructor implicitly invokes MyCustomJavaWritable's
  // no-arg constructor, which Writables are required to have.
}
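Usage would then mirror the first answer. A sketch under the assumption that MyCustomJavaWritable has a no-arg constructor (as Writables must) and some setters to populate its fields; the setters and path below are hypothetical:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object WrapperDemo {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("wrapper-demo"))

    // Build records with the wrapper so they are both Writable (for the
    // sequence file format) and Serializable (for Spark's shuffle/collect).
    val rdd = sc.parallelize(1 to 3).map { i =>
      val w = new MyWritableAndSerializable()
      // ...populate w via the parent class's setters (hypothetical)...
      ("key-" + i, w)
    }
    rdd.saveAsSequenceFile("/tmp/wrapped-seq") // illustrative path
  }
}
```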
