Elvin Efendi's personal website


coursework: parsing and indexing Wikipedia articles

06 May 2014

Last winter semester I took an Information Retrieval course at the university. During the course we implemented several projects. One of them was to use Apache Lucene to index, rank and query Wikipedia articles. A small sample of the Wikipedia data can be found here. The application consists of four main modules:

Click to see the source code of the application

In the repository I’ve also put a ready-made Lucene index in the folder called “small_lucene_index”. You can pass the path to this folder as an argument to the program and start searching right away.
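
To give a feel for what "index, rank and query" means without running the program, here is a minimal sketch in plain Python. It is not the coursework code and it does not use Lucene: the function names (`tokenize`, `build_index`, `search`), the three sample documents and the simple TF-IDF scoring are all assumptions made up for illustration. Lucene's actual analysis and scoring are more elaborate.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Build an inverted index: term -> {doc_id: term frequency}.

    `docs` is a dict {doc_id: text}.
    """
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, freq in Counter(tokenize(text)).items():
            index[term][doc_id] = freq
    return index


def search(index, n_docs, query, top_k=10):
    """Rank documents by a plain TF-IDF sum over the query terms."""
    scores = defaultdict(float)
    for term in tokenize(query):
        postings = index.get(term, {})
        if not postings:
            continue  # term does not occur in any document
        idf = math.log(1 + n_docs / len(postings))
        for doc_id, freq in postings.items():
            scores[doc_id] += freq * idf
    # Highest score first; ties are broken by document id.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


# Hypothetical sample data standing in for parsed Wikipedia articles.
docs = {
    "Apache Lucene": "Lucene is a search library written in Java",
    "Information retrieval": "Information retrieval finds documents for a query",
    "Java": "Java is a programming language",
}
index = build_index(docs)
results = search(index, len(docs), "java search library")
for title, score in results:
    print(f"{score:.3f}  {title}")
```

In this toy data the "Apache Lucene" document matches all three query terms and is ranked first, "Java" matches one term, and "Information retrieval" matches none and is left out of the results.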
