python - Parsing all-zero sparse vectors with PySpark SparseVectors
In PySpark, if I generate a sparse vector that represents the zero vector, stringifying it works as expected:
>>> res = Vectors.stringify(SparseVector(4, [], []))
>>> res
'(4,[],[])'
But the parse method fails to load it back:
>>> SparseVector.parse(res)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../spark-1.5.2-bin-hadoop2.4/python/pyspark/mllib/linalg/__init__.py", line 545, in parse
    raise ValueError("Unable to parse indices from %s." % new_s)
ValueError: Unable to parse indices from .
Does anyone know of a way to solve this?
This bug is described in SPARK-14739. The simplest workaround is to use the ast module instead:
import ast
from pyspark.mllib.linalg import SparseVector

def parse_sparse(s):
    # The stringified form is a valid Python tuple literal,
    # so literal_eval yields (size, indices, values) directly.
    return SparseVector(*ast.literal_eval(s.strip()))

parse_sparse("(1, [], [])")
## SparseVector(1, {})

parse_sparse("(5, [1, 3], [0.4, -0.1])")
## SparseVector(5, {1: 0.4, 3: -0.1})
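
To check the idea without a Spark installation, here is a minimal sketch that uses only the standard library. ast.literal_eval reads the stringified form as an ordinary Python tuple, so the empty index and value lists need no special handling. The helper name parse_sparse_triple and its validation checks are illustrative assumptions, not part of PySpark.

```python
import ast

def parse_sparse_triple(s):
    """Hypothetical helper: parse '(size,[indices],[values])' into a plain tuple."""
    # literal_eval only accepts Python literals, so arbitrary code is not executed.
    size, indices, values = ast.literal_eval(s.strip())
    if len(indices) != len(values):
        raise ValueError("indices and values differ in length: %r" % s)
    return int(size), [int(i) for i in indices], [float(v) for v in values]

print(parse_sparse_triple("(4,[],[])"))             # (4, [], [])
print(parse_sparse_triple("(5,[1,3],[0.4,-0.1])"))  # (5, [1, 3], [0.4, -0.1])
```

With PySpark available, the resulting triple can be passed on as SparseVector(*parse_sparse_triple(s)).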