postgresql - Postgres - best design for recording up to 8000 ints? -
newbie postgres here - have corpus of 8000+ documents (each labelled number/index), , i'm trying create table in each row single word , record index of each document in mentioned. primary goal able list of document ids each word mentioned in.
i've thought of using array, or long binary string, none of seem great solutions. suggested me make table each term, considering i'd have quite few terms, feel that'd rather memory intensive also.
what's best way, if any, accomplish this?
based on information provided, i'd looking @ following schema start with:
words: id integer type, auto-increment, primary key, indexed word varchar(something), unique constraint, indexed documents: id integer type, auto-increment, primary key, indexed details whatever needed words_in_document: word_id integer type, foreign key words(id) doc_id integer type, foreign key documents(id) primary key on word_id, doc_id possible second index on doc_id, word_id
that's standard way of efficiently representing many-to-many relationship.
i wouldn't too concerned 8,000 documents or millions of words, that's databases built for. in shops, 8,000 rows considered small configuration table.
with schema, finding list of documents containing given word like:
select d.details documents d, words w, words_in_document wid w.word = 'desired_word' , w.id = wid.word_id , d.id = wid.doc_id
(that's implicit join syntax, there's explicit join variant well).
Comments
Post a Comment