Reading from a sequence file placed in DistributedCache (Hadoop)
How can I read sequence files from the distributed cache?
I have tried a few things, but I keep getting a FileNotFoundException.
I'm adding the file to the distributed cache like this:
    DistributedCache.addCacheFile(new URI(currentMedoids), conf);
and reading it in the mapper's setup method:
    Configuration conf = context.getConfiguration();
    FileSystem fs = FileSystem.get(conf);
    Path[] paths = DistributedCache.getLocalCacheFiles(conf);
    List<Element> sketch = new ArrayList<Element>();
    SequenceFile.Reader medoidsReader = new SequenceFile.Reader(fs, paths[0], conf);
    Writable medoidKey = (Writable) medoidsReader.getKeyClass().newInstance();
    Writable medoidValue = (Writable) medoidsReader.getValueClass().newInstance();
    while (medoidsReader.next(medoidKey, medoidValue)) {
        ElementWritable medoidWritable = (ElementWritable) medoidValue;
        sketch.add(medoidWritable.getElement());
    }
It seems I should have used getCacheFiles(), which returns URI[], instead of getLocalCacheFiles(), which returns Path[].
It works now after making that change.
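A likely reason the change helps: FileSystem.get(conf) returns the cluster's default file system, while getLocalCacheFiles() returns paths on the task node's local disk. Those paths carry no scheme, so they get resolved against the default file system, where the file does not exist. The URIs from getCacheFiles() keep their scheme. The sketch below illustrates only that scheme fallback using plain java.net.URI; it does not call Hadoop, and the class name, DEFAULT_SCHEME, host, and file paths are all hypothetical.

```java
import java.net.URI;

// Illustrative only: mimics how a location without a scheme falls back
// to a default file system. This is not Hadoop code.
class CacheUriSketch {
    // Hypothetical stand-in for the scheme of the cluster's default file system.
    static final String DEFAULT_SCHEME = "hdfs";

    // Returns the scheme that would be used to resolve the given location:
    // its own scheme if present, otherwise the default.
    static String resolveScheme(URI location) {
        String scheme = location.getScheme();
        return scheme != null ? scheme : DEFAULT_SCHEME;
    }

    public static void main(String[] args) {
        // Shaped like an entry from getCacheFiles(): scheme and authority included.
        URI cacheUri = URI.create("hdfs://namenode:8020/user/me/medoids.seq");
        // Shaped like an entry from getLocalCacheFiles(): a bare local path.
        URI localPath = URI.create("/tmp/hadoop/local/medoids.seq");

        System.out.println(cacheUri + " resolves with scheme " + resolveScheme(cacheUri));
        System.out.println(localPath + " resolves with scheme " + resolveScheme(localPath));
    }
}
```

In this sketch both locations end up resolved with the default scheme, which is fine for the cache URI but wrong for the local path. If the local copy is what you want to read, one option would be to open it through the local file system rather than the default one.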