java - Lucene Search with MultiThreads
I need to execute more than 3000 queries against a 3 GB index in a machine learning project.
To speed things up, I create 4 threads (my MacBook Pro has 4 cores) and give each one an equal share of the queries: with n queries in total, each thread gets n/4.
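For illustration, the n/4 split described above could be computed as in the sketch below. The class and method names (QueryPartitioner, partition) are hypothetical and not part of the original code; the sketch only assumes that the queries are held in a List and that the number of parts is positive.

```java
import java.util.ArrayList;
import java.util.List;

public class QueryPartitioner {

    /**
     * Splits items into exactly `parts` contiguous sublists whose sizes
     * differ by at most one (some may be empty if items.size() < parts).
     * Hypothetical helper; assumes parts > 0.
     */
    public static <T> List<List<T>> partition(List<T> items, int parts) {
        List<List<T>> chunks = new ArrayList<>();
        int n = items.size();
        int base = n / parts;   // minimum chunk size
        int extra = n % parts;  // the first `extra` chunks get one more item
        int fromIndex = 0;
        for (int i = 0; i < parts; i++) {
            int toIndex = fromIndex + base + (i < extra ? 1 : 0);
            // subList returns a view, so no queries are copied
            chunks.add(items.subList(fromIndex, toIndex));
            fromIndex = toIndex;
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<String> queries = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            queries.add("query" + i);
        }
        // Print the size and contents of each thread's share
        for (List<String> chunk : partition(queries, 4)) {
            System.out.println(chunk.size() + " " + chunk);
        }
    }
}
```

Spreading the remainder over the first chunks avoids losing the last few queries when n is not divisible by the thread count.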
I open the index via FSDirectory.open(file) and share a single IndexSearcher between all the threads.
The problem is that I don't see any performance improvement, and CPU usage doesn't increase either. I tried different numbers of threads, but nothing changed.
Loading the whole index into RAM is not possible!
I saw in other threads the suggestion to open the index read-only, but I use Lucene 4.3, where the write options were removed from the reader, so read-only mode is no longer a concern.
I am aware of this page, but the tips it gives look quite out of date.
So the question is: how can I parallelize the index search so that performance really improves in Lucene?
Below is the example code I am using:
List<String> queryList = new ArrayList<String>();
List<Thread> threads = new ArrayList<Thread>();

for (int i = 0; i < NUMBER_THREADS; i++) {
    // fromIndex and toIndex delimit thread i's share of the queries
    List<String> querySubList = queryList.subList(fromIndex, toIndex);
    // one parser per thread
    QueryParser ngramIndexQueryParser =
            new QueryParser(Version.LUCENE_43, "ngram", new KeywordAnalyzer());
    startWorker(querySubList, threads, ngramIndexQueryParser, ngramSearcher);
}

public static void startWorker(List<String> querySubList, List<Thread> threads,
        QueryParser ngramIndexQueryParser, IndexSearcher ngramSearcher) {
    // pass this thread's sublist, not the full query list
    NgramIndexSearch task =
            new NgramIndexSearch(querySubList, ngramIndexQueryParser, ngramSearcher);
    Thread worker = new Thread(task);
    worker.start();
    threads.add(worker);
}

public class NgramIndexSearch implements Runnable {

    public NgramIndexSearch(List<String> queryList, QueryParser queryParser,
            IndexSearcher searcher) {
        // initialization
    }

    public void run() {
        try {
            for (String q : queryList) {
                Query query = queryParser.parse(q);
                TopDocs topDocs = searcher.search(query, nrOfDocsToReturn);
            }
        } catch (ParseException | IOException e) {
            // parse() and search() throw checked exceptions
            throw new RuntimeException(e);
        }
    }
}
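As an alternative to handing each thread a fixed slice, a fixed-size thread pool can take queries from a shared queue, so no thread sits idle while another still has work. The following is a minimal sketch under assumptions, not a verified fix for the problem described here: PooledSearch, searchAll and the search function are hypothetical names, and the function stands in for parsing a query and calling the shared IndexSearcher.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

public class PooledSearch {

    /**
     * Runs every query on a fixed-size pool and returns the results in
     * query order. `search` is a hypothetical stand-in for "parse the
     * query and run it against the shared searcher"; it must be safe to
     * call from several threads at once.
     */
    public static <R> List<R> searchAll(List<String> queries, int threads,
            Function<String, R> search) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<R>> tasks = new ArrayList<>();
            for (String q : queries) {
                tasks.add(() -> search.apply(q)); // one task per query
            }
            List<R> results = new ArrayList<>();
            // invokeAll blocks until all tasks are done and returns the
            // futures in the same order as the submitted tasks
            for (Future<R> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } finally {
            pool.shutdown(); // always release the worker threads
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> queries = Arrays.asList("a", "bb", "ccc");
        // Stand-in "search": returns the query length instead of hitting an index
        List<Integer> hits = searchAll(queries, 4, String::length);
        System.out.println(hits);
    }
}
```

In a real run the stand-in would create or reuse a per-thread QueryParser, as the code above does, since the sketch only assumes the searcher itself is shared.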