R - How to load a CSV file into SparkR on RStudio?


How do I load a CSV file into SparkR on RStudio? Below are the steps I had to perform to run SparkR on RStudio. I have used read.df to read the .csv file, but I am not sure how else to write this. I am also not sure whether this step is considered to create RDDs.

# Set system environment variables
Sys.setenv(SPARK_HOME = "c:/users/desktop/spark/spark-1.4.1-bin-hadoop2.6")
.libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))

#Sys.setenv('SPARKR_SUBMIT_ARGS'='"--packages" "com.databricks:spark-csv_2.10:1.0.3" "sparkr-shell"')

# Load libraries
library(SparkR)
library(magrittr)

sc <- sparkR.init(master="local")
sc <- sparkR.init()
sc <- sparkR.init(sparkPackages="com.databricks:spark-csv_2.11:1.0.3")
sqlContext <- sparkRSQL.init(sc)

data <- read.df(sqlContext, "c:/users/desktop/datasets/hello_world.csv", "com.databricks.spark.csv", header="true")

I am getting this error:

Error in writeJobj(con, object) : invalid jobj 1

Spark 2.0.0+:

You can use the csv data source:

loadDF(sqlContext, path="some_path", source="csv", header="true")

without loading spark-csv.
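
For a fuller picture of the Spark 2.0.0+ route, here is a minimal sketch that wraps session start-up and the CSV read in one function. It assumes SparkR from a Spark 2.0.0+ distribution is already on the library path; the function name load_csv, the master value "local[*]" and the inferSchema option are illustrative choices, not something taken from the question.

```r
# Hypothetical helper: read a CSV file with the built-in csv data source.
# Assumes a Spark 2.0.0+ SparkR installation is visible via .libPaths().
load_csv <- function(path, master = "local[*]") {
  if (!requireNamespace("SparkR", quietly = TRUE)) {
    stop("SparkR is not on the library path; add file.path(SPARK_HOME, 'R', 'lib') first")
  }
  # In Spark 2.0.0+, sparkR.session() replaces sparkR.init() and sparkRSQL.init().
  SparkR::sparkR.session(master = master)
  # No sqlContext argument and no spark-csv package are needed here.
  SparkR::read.df(path, source = "csv", header = "true", inferSchema = "true")
}
```

Calling load_csv("some_path") would then return a SparkDataFrame, provided the path exists and Spark can start locally.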

Original answer:

As far as I can tell, you're using the wrong version of spark-csv. Pre-built versions of Spark use Scala 2.10, but you're using spark-csv built for Scala 2.11. Try this instead:

sc <- sparkR.init(sparkPackages="com.databricks:spark-csv_2.10:1.2.0")
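
The Scala version is encoded in the artifact name of the package coordinate (the part after the underscore), so a mismatch can be spotted before starting Spark. The following base R sketch extracts that suffix; the helper name scala_suffix is hypothetical and it assumes the usual group:artifact_scalaVersion:version layout.

```r
# Hypothetical helper: pull the Scala binary version out of a Maven
# coordinate such as "com.databricks:spark-csv_2.10:1.2.0".
scala_suffix <- function(coordinate) {
  parts <- strsplit(coordinate, ":", fixed = TRUE)[[1]]
  if (length(parts) != 3) {
    stop("expected a coordinate of the form group:artifact:version")
  }
  artifact <- parts[2]
  # Match a trailing "_<major>.<minor>" on the artifact name.
  m <- regmatches(artifact, regexpr("_[0-9]+\\.[0-9]+$", artifact))
  if (length(m) == 0) {
    return(NA_character_)  # artifact carries no Scala suffix
  }
  sub("^_", "", m)
}

scala_suffix("com.databricks:spark-csv_2.11:1.0.3")  # "2.11", the version used in the question
scala_suffix("com.databricks:spark-csv_2.10:1.2.0")  # "2.10", the version suggested above
```

If the result differs from the Scala version your Spark build was compiled against, the package is likely to fail to load.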
