Built a search engine over 15,000 news web pages. Results retrieved from Solr are ranked using PageRank and TF-IDF weighting, and the engine supports spell checking, autocomplete, and snippets.
Technologies Used: Java, PHP, JavaScript, Python, jQuery, Apache Solr, Lucene, jsoup
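To illustrate the PageRank part of the ranking, here is a minimal power-iteration sketch in plain Python. It is not the project's actual code: the link graph, the file names, the damping factor of 0.85, and the fixed iteration count are all assumptions chosen for the example.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Approximate PageRank by power iteration.

    graph maps each page to the list of pages it links to.
    The damping factor and iteration count are illustrative defaults.
    """
    pages = set(graph)
    for targets in graph.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Rank held by pages with no outgoing links is spread evenly.
        dangling = sum(rank[p] for p in pages if not graph.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for page, targets in graph.items():
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical link graph between four crawled pages.
links = {
    "a.html": ["b.html", "c.html"],
    "b.html": ["c.html"],
    "c.html": ["a.html"],
    "d.html": ["c.html"],
}
scores = pagerank(links)
for page, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{page}\t{score:.4f}")
```

Scores computed this way could be written to a file and supplied to Solr as an external ranking signal, although the exact wiring depends on the Solr configuration.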
Using a web server, I created a web page with a text box that a user can load and use to enter a query. A program on the web server formats the user's query and sends it to Solr. Solr processes the query and returns the results in JSON format. The program on the web server then reformats the results and presents them to the user as a typical search engine would. Each result is clickable and opens the actual web page on the internet.
Technologies Used: PHP, NetworkX library, Solr, Lucene, PageRank
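The query flow described above can be sketched as follows. The project itself lists PHP for this layer; the sketch uses Python only for illustration. The host, the core name `news`, and the `title` and `url` field names are hypothetical, and the sample response is hand-written to mirror the general shape of a Solr JSON reply rather than taken from a real index.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical Solr host and core name.
SOLR_SELECT = "http://localhost:8983/solr/news/select"


def build_query_url(user_query, rows=10):
    """Format the user's query as a Solr select request."""
    params = {"q": user_query, "rows": rows, "wt": "json"}
    return SOLR_SELECT + "?" + urlencode(params)


def extract_results(raw_json):
    """Pull display fields out of Solr's JSON reply.

    The 'title' and 'url' field names are assumptions about the schema.
    """
    docs = json.loads(raw_json)["response"]["docs"]
    return [
        {"title": d.get("title", "(untitled)"), "url": d.get("url", "")}
        for d in docs
    ]


def search(user_query):
    """Send the query to Solr and return results (needs a running server)."""
    with urlopen(build_query_url(user_query)) as resp:
        return extract_results(resp.read().decode("utf-8"))


# Hand-written sample reply so the sketch runs without a Solr server.
sample = (
    '{"response": {"numFound": 1, "start": 0, "docs": '
    '[{"title": "Example headline", "url": "https://example.com/story"}]}}'
)
url = build_query_url("climate change")
results = extract_results(sample)
print(url)
for r in results:
    print(r["title"], "->", r["url"])
```

Each returned `url` value would be rendered as a link so that the result opens the original page.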
Created an inverted index of the words occurring in a set of web pages, gaining hands-on experience with MapReduce on GCP App Engine.
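As a rough illustration of the MapReduce pattern behind an inverted index, the following single-process sketch separates the map, shuffle, and reduce steps. It is not the job that ran on GCP: the function names, the tokenization rule, and the two sample documents are assumptions made for the example.

```python
import re
from collections import defaultdict


def map_phase(doc_id, text):
    """Emit a (word, doc_id) pair for every word occurrence in a document."""
    for word in re.findall(r"[a-z]+", text.lower()):
        yield word, doc_id


def reduce_phase(word, doc_ids):
    """Collapse all doc ids seen for one word into per-document counts."""
    counts = defaultdict(int)
    for doc_id in doc_ids:
        counts[doc_id] += 1
    return word, dict(counts)


def build_inverted_index(documents):
    # Shuffle step: group the mapper output by word.
    grouped = defaultdict(list)
    for doc_id, text in documents.items():
        for word, emitted_id in map_phase(doc_id, text):
            grouped[word].append(emitted_id)
    return dict(reduce_phase(word, ids) for word, ids in grouped.items())


# Hypothetical input documents.
docs = {
    "page1": "Solr ranks news pages",
    "page2": "News pages link to news",
}
index = build_inverted_index(docs)
for word in sorted(index):
    print(word, index[word])
```

In a real MapReduce framework the grouping step is handled by the runtime, and the mappers and reducers run in parallel across machines.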
This exercise compares the search results from Google and Bing, the two leading US search engines. Many search engine comparison studies have been done, but all of them use samples of data, some small and some large, so no general conclusions can be drawn. Still, it is always instructive to see how the two search engines match up, even on a small data set. I issued a set of queries to both engines and evaluated the returned results for relevance. Studies of this kind do not seek to answer the ultimate question of which search engine is “best”.
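One simple way to compare two result lists for the same query is to count the URLs they share. The sketch below assumes that overlap measure purely for illustration; the original exercise does not state which metric it used, and the URLs are placeholders rather than real search results.

```python
def result_overlap(first_urls, second_urls):
    """Return (shared count, percent of the first list that is shared)."""
    shared = set(first_urls) & set(second_urls)
    if not first_urls:
        return 0, 0.0
    percent = 100.0 * len(shared) / len(first_urls)
    return len(shared), percent


# Placeholder result lists for a single query.
google_results = [
    "https://example.com/a",
    "https://example.com/b",
    "https://example.com/c",
    "https://example.com/d",
]
bing_results = [
    "https://example.com/b",
    "https://example.com/x",
    "https://example.com/a",
    "https://example.com/y",
]
count, percent = result_overlap(google_results, bing_results)
print(f"{count} shared results ({percent:.1f}% overlap)")
```

Overlap ignores rank position, so a rank-aware measure would be needed to tell whether the shared results also appear in a similar order.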