python - Parsing all-zero sparse vectors with PySpark SparseVectors


In PySpark, if I generate a sparse vector that represents an all-zero vector, stringifying it works as expected:

>>> res = Vectors.stringify(SparseVector(4, [], []))
>>> res
'(4,[],[])'

But the parse method fails to load it back:

>>> SparseVector.parse(res)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../spark-1.5.2-bin-hadoop2.4/python/pyspark/mllib/linalg/__init__.py", line 545, in parse
    raise ValueError("Unable to parse indices from %s." % new_s)
ValueError: Unable to parse indices from .

Does anyone know of a way to solve this?

This bug is described in SPARK-14739. The simplest workaround is to use the ast module instead:

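import ast
from pyspark.mllib.linalg import SparseVector

def parse_sparse(s):
    # literal_eval safely turns "(size, [indices], [values])" into a tuple,
    # which is unpacked into the SparseVector constructor.
    return SparseVector(*ast.literal_eval(s.strip()))

parse_sparse("(1, [], [])")
## SparseVector(1, {})

parse_sparse("(5, [1, 3], [0.4, -0.1])")
## SparseVector(5, {1: 0.4, 3: -0.1})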
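If the strings come from an untrusted or hand-edited source, it may help to validate the parsed tuple before building the vector. The following is only a sketch of that idea: parse_sparse_fields is a hypothetical helper name, not part of PySpark, and it assumes the input uses the "(size,[indices],[values])" layout that Vectors.stringify produced above.

```python
import ast


def parse_sparse_fields(s):
    """Parse '(size,[indices],[values])' into a (size, indices, values) tuple.

    Hypothetical helper (not a PySpark API): it only checks the shape of the
    literal and does not construct a SparseVector itself.
    """
    parsed = ast.literal_eval(s.strip())
    if not isinstance(parsed, tuple) or len(parsed) != 3:
        raise ValueError("Expected '(size,[indices],[values])', got %r" % s)
    size, indices, values = parsed
    if len(indices) != len(values):
        raise ValueError("indices and values differ in length: %r" % s)
    # Normalise types so the result can be passed on unchanged.
    return int(size), [int(i) for i in indices], [float(v) for v in values]


# The all-zero case that trips SparseVector.parse is handled like any other.
print(parse_sparse_fields("(4,[],[])"))
```

The returned tuple can then be unpacked into the constructor, for example SparseVector(*parse_sparse_fields(res)), in the same way as the workaround above.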
