From 3363716124c76e8b478e823e89661e7cf8454873 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Jos=C3=A9=20Carlos=20Cuevas?=
Date: Fri, 3 Jul 2015 10:54:24 +0200
Subject: [PATCH] Pushing a new README.md

---
 README.md | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/README.md b/README.md
index e69de29..cf5059c 100644
--- a/README.md
+++ b/README.md
@@ -0,0 +1,15 @@
+# TfIdf Crawler in Go
+
+A simple website crawler and tfidf parsing engine to implement a simple search engine.
+
+## Command Line Parameters
+
+This utility relies in a couple command line arguments.
+
+* -w: Allows to choose the file where the webs to index will be found. If not present, it assumes a *websites.txt* file in the same directory the command is run in.
+* -s: Chooses a stopword language. It assumes *english1* as default. See the stopword files for choices.
+
+## Known bugs
+
+* Stopwords should be loaded from the lib directory or implement a path search.
+* Probably I messed up some of the tfidf calculations.