scala - Spark ClosedChannelException exception during parquet write


We have a massive legacy SQL table that we need to extract data out of and push to S3. Below is how I'm querying a portion of the data and writing the output.

  def writeTableInParts(tableName: String, numIdsPerParquet: Long, numPartitionsAtATime: Int,
                        startFrom: Long = -1, endTo: Long = -1, filePrefix: String = s3Prefix) = {
    val minId: Long = if (startFrom > 0) startFrom else findMinCol(tableName, "id")
    val maxId: Long = if (endTo > 0) endTo else findMaxCol(tableName, "id")

    (minId until maxId by numIdsPerParquet).toList
      .sliding(numPartitionsAtATime, numPartitionsAtATime).toList
      .foreach(list => {
        list.map(start => {
            val end = math.min(start + numIdsPerParquet, maxId)

            sqlContext.read.jdbc(mysqlConStr,
              s"(SELECT * FROM $tableName WHERE id >= $start AND id < $end) AS tmpTable",
              new java.util.Properties())
          })
          .reduce((left, right) => left.unionAll(right))
          .write
          .parquet(s"$filePrefix/$tableName/${list.head}-${list.last + numIdsPerParquet}")
      })
  }
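For context, a typical invocation looks something like this (the table name and slice sizes here are made up; findMinCol, findMaxCol, and s3Prefix are helpers from elsewhere in my code):

  // Hypothetical call: export `orders` in slices of 1,000,000 ids,
  // unioning 10 slices into each parquet output directory.
  writeTableInParts("orders", numIdsPerParquet = 1000000L, numPartitionsAtATime = 10)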

This has worked for many different tables, but for whatever reason this table keeps throwing java.nio.channels.ClosedChannelException no matter how much I reduce the scanning window or size.

Based on this answer I'm guessing I have an exception somewhere in my code, but I'm not sure where, since it's rather simple code. How can I further debug this exception? The logs weren't helpful and don't reveal the cause.
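One approach that can surface the real error (a sketch, reusing the mysqlConStr JDBC URL from above; the probeRange helper is something I added for debugging, not part of the original job) is to bypass Spark entirely and walk the failing id range with plain JDBC, so the driver's underlying exception is thrown with its original message instead of being swallowed:

  import java.sql.DriverManager

  // Read the suspect id range directly with JDBC; any row the driver
  // cannot materialize (e.g. a bad timestamp value) throws here with
  // a clear message pointing at the offending column.
  def probeRange(tableName: String, start: Long, end: Long): Unit = {
    val conn = DriverManager.getConnection(mysqlConStr)
    try {
      val stmt = conn.createStatement()
      val rs = stmt.executeQuery(
        s"SELECT * FROM $tableName WHERE id >= $start AND id < $end")
      val cols = rs.getMetaData.getColumnCount
      while (rs.next()) {
        // Force every column to be materialized, as Spark's JDBC reader would.
        (1 to cols).foreach(i => rs.getObject(i))
      }
    } finally {
      conn.close()
    }
  }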

The problem was due to the below error and was not Spark related at all... It was cumbersome to chase down because Spark isn't great at displaying errors. Darn...

'0000-00-00 00:00:00' can not be represented as java.sql.Timestamp error
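For anyone hitting the same thing: assuming the MySQL Connector/J driver, appending zeroDateTimeBehavior=convertToNull to the JDBC URL makes the driver return NULL for zero dates instead of throwing (host, database, and credentials below are placeholders):

  // Tell Connector/J to map '0000-00-00 00:00:00' values to NULL rather
  // than failing when converting them to java.sql.Timestamp.
  val mysqlConStr =
    "jdbc:mysql://db-host:3306/mydb?user=me&password=secret" +
      "&zeroDateTimeBehavior=convertToNull"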

