Apache Spark: class registration order in Kryo
According to the Kryo documentation, classes registered with Kryo must have the same identifiers assigned during serialization and deserialization:
https://github.com/esotericsoftware/kryo#registration "During deserialization, the registered classes must have the exact same IDs they had during serialization."
As far as I know, both the classes registered internally by Spark and the classes registered using the SparkConf.registerKryoClasses method have their identifiers assigned automatically according to registration order, so a change in registration order would break deserialization.
Could you please help me understand how this issue is handled in Apache Spark?
From the Kryo documentation:
"During deserialization, the registered classes must have the exact same IDs they had during serialization. The register method shown above assigns the next available, lowest integer ID, which means the order classes are registered is important."
In other words, to get the same IDs when deserializing, you need to have the classes registered in the same order. The order in which Spark registers its own classes is stable.
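To make the point about ordering concrete, here is a minimal sketch of order-based ID assignment. It is a toy model, not the Kryo API: the ToyRegistry class, the build_registry helper and the com.example class names are all made up for illustration, and unlike real Kryo it starts numbering at 0 with no pre-registered types.

```python
# Toy model of "next available, lowest integer ID" assignment.
# This is NOT Kryo; every name here is hypothetical.
class ToyRegistry:
    """Hands out the next free integer ID to each newly registered name."""

    def __init__(self):
        self._id_by_name = {}
        self._name_by_id = {}

    def register(self, name):
        # A name that is already known keeps its existing ID.
        if name not in self._id_by_name:
            next_id = len(self._id_by_name)
            self._id_by_name[name] = next_id
            self._name_by_id[next_id] = name
        return self._id_by_name[name]

    def id_of(self, name):
        return self._id_by_name[name]

    def name_of(self, class_id):
        return self._name_by_id[class_id]


def build_registry(names):
    """Register the given names in exactly the order supplied."""
    registry = ToyRegistry()
    for name in names:
        registry.register(name)
    return registry


# The writer and the first reader register in the same order;
# the second reader registers the same classes in swapped order.
writer = build_registry(["com.example.Point", "com.example.Line"])
same_order_reader = build_registry(["com.example.Point", "com.example.Line"])
swapped_reader = build_registry(["com.example.Line", "com.example.Point"])

# The ID that would be written into the byte stream for a Point.
written_id = writer.id_of("com.example.Point")

# Same order: the ID resolves back to the class that was written.
resolved_ok = same_order_reader.name_of(written_id)
# Swapped order: the same ID now resolves to a different class.
resolved_wrong = swapped_reader.name_of(written_id)
```

With the swapped order, the reader would try to decode the bytes of a Point as a Line, which is the kind of failure the Kryo documentation warns about.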
You need to make sure that you register your custom classes using SparkConf.registerKryoClasses() in a stable order.
You can check the order of class registration in the Spark source code on GitHub.
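One possible way to keep the order of custom classes stable is to derive it from something deterministic, such as the class names themselves. The sketch below is an assumption on my part rather than something Spark requires: it builds a sorted, duplicate-free value for the spark.kryo.classesToRegister property, which Spark documents as a comma-separated list of class names to register with Kryo. The stable_registration_list helper and the com.example names are hypothetical.

```python
def stable_registration_list(class_names):
    """Return a duplicate-free, alphabetically sorted, comma-separated list."""
    # Strip whitespace and drop empty entries before sorting.
    cleaned = {name.strip() for name in class_names if name.strip()}
    return ",".join(sorted(cleaned))


# A set has no guaranteed iteration order, so passing it on unsorted
# could yield a different registration order from run to run.
discovered = {"com.example.Point", "com.example.Line", "com.example.Circle"}
classes_to_register = stable_registration_list(discovered)

# With PySpark installed, the value could be passed on like this
# (left as a comment so the sketch runs without Spark):
#   from pyspark import SparkConf
#   conf = SparkConf().set("spark.serializer",
#                          "org.apache.spark.serializer.KryoSerializer")
#   conf.set("spark.kryo.classesToRegister", classes_to_register)
```

Note that with alphabetical ordering, adding a new class shifts the IDs of the classes sorted after it. That is harmless as long as every process reads the same configuration, but it matters if serialized bytes are meant to outlive a single application version.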