python - Retrieving sentence strings from NLTK corpus


I have this dataset:

from nltk.corpus import gutenberg
emma = gutenberg.sents('austen-emma.txt')

It gives me the sentences as lists of words:

[[u'she', u'was', u'happy'], [u'it', u'was', u'her', u'own', u'good']]

but I want to get:

['she was happy', 'it was her own good']

You appear to be getting the correct output, according to the NLTK docs:

sents(fileids=None)
Returns: the given file(s) as a list of sentences or utterances, each encoded as a list of word strings.

So you need to turn each list of word strings into a space-separated sentence:

sentences = [" ".join(list_of_words) for list_of_words in emma]
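
To see the join step on its own, here is a minimal sketch that runs without downloading the corpus. The `sample` list is a hand-written stand-in for what `gutenberg.sents(...)` returns, and `join_sentences` is a hypothetical helper name, not part of NLTK.

```python
def join_sentences(tokenized_sentences):
    """Join each list of word tokens into one space-separated string."""
    return [" ".join(words) for words in tokenized_sentences]


# Hand-written stand-in for the list-of-lists that gutenberg.sents() returns.
sample = [["she", "was", "happy"], ["it", "was", "her", "own", "good"]]

sentences = join_sentences(sample)
print(sentences)  # ['she was happy', 'it was her own good']
```

Note that the real corpus keeps punctuation as separate tokens, so a plain join leaves a space before each punctuation mark (for example "happy ."). If that matters, you would need an extra cleanup step.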

