Convert Matrix to RowMatrix in Apache Spark using Scala
I'd like to convert an org.apache.spark.mllib.linalg.Matrix to an org.apache.spark.mllib.linalg.distributed.RowMatrix.
I can do it like this:
    val xx = x.computeGramianMatrix() // xx has type org.apache.spark.mllib.linalg.Matrix
    val xxs = xx.toString()
    // Collapse runs of spaces, turn the remaining spaces into commas, then split each row.
    val xxr = xxs.split("\n").map(row => row.replace("   ", " ").replace("  ", " ").replace("  ", " ").replace("  ", " ").replace(" ", ",").split(","))
    val xxp = sc.parallelize(xxr)
    val xxd = xxp.map(ar => Vectors.dense(ar.map(elm => elm.toDouble)))
    val xxrm: RowMatrix = new RowMatrix(xxd)

However, this is really gross and a total hack. Can someone show me a better way?
Note: I am using Spark version 1.3.0.
I suggest that you convert your Matrix to an RDD[Vector], which you can then pass directly to the RowMatrix constructor.
Let's consider the following example:
    import org.apache.spark.rdd._
    import org.apache.spark.mllib.linalg._

    val denseData = Seq(
      Vectors.dense(0.0, 1.0, 2.0),
      Vectors.dense(3.0, 4.0, 5.0),
      Vectors.dense(6.0, 7.0, 8.0),
      Vectors.dense(9.0, 0.0, 1.0)
    )

    // A 3 x 2 local matrix; the values are given in column-major order.
    val dm: Matrix = Matrices.dense(3, 2, Array(1.0, 3.0, 5.0, 2.0, 4.0, 6.0))

You'll need to define a method that converts the Matrix into an RDD[Vector]:
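    // Assumes sc is an existing SparkContext (as in spark-shell).
    def matrixToRDD(m: Matrix): RDD[Vector] = {
      // toArray returns the values in column-major order, so each group is one column.
      val columns = m.toArray.grouped(m.numRows)
      val rows = columns.toSeq.transpose // Skip this if you want a column-major RDD.
      val vectors = rows.map(row => new DenseVector(row.toArray))
      sc.parallelize(vectors)
    }

And now you can apply that conversion to the matrix: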
    import org.apache.spark.mllib.linalg.distributed.RowMatrix

    val rows = matrixToRDD(dm)
    val mat = new RowMatrix(rows)

I hope this helps!
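If you want to check the regrouping step on its own, without a SparkContext, something like the following plain-Scala sketch should do. It only mirrors the group-by-column-then-transpose idea from the conversion method; the names ColumnMajorRows and rowsOf are made up for this illustration and are not part of Spark.

```scala
// Plain-Scala sketch of the regrouping step; no Spark required.
// ColumnMajorRows and rowsOf are hypothetical names used only for illustration.
object ColumnMajorRows {
  // Split a column-major array of values into its rows.
  // numRows is the number of rows of the matrix the values came from.
  def rowsOf(values: Array[Double], numRows: Int): Seq[Seq[Double]] = {
    // Each group of numRows consecutive values is one column.
    val columns = values.grouped(numRows).map(_.toSeq).toVector
    // Transposing the sequence of columns yields the sequence of rows.
    columns.transpose
  }
}
```

For the 3 x 2 example matrix above, ColumnMajorRows.rowsOf(Array(1.0, 3.0, 5.0, 2.0, 4.0, 6.0), 3) gives the three rows (1.0, 2.0), (3.0, 4.0) and (5.0, 6.0), which are the rows that end up in the RowMatrix.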