chore(README): explain how the top 10 method increases accuracy and F measure

Mahdi Dibaiee 2016-08-21 01:21:42 +04:30
parent 7d0ce29ba8
commit ace0a18653


@@ -43,7 +43,7 @@ stack exec example-xor
 # using Porter stemming, stopword elimination and a few custom techniques.
 # The dataset is imbalanced, which causes the classifier to be biased towards some classes (earn, acq, ...)
 # To work around the imbalanced dataset problem, there is a --top-ten option which classifies only the top 10 most popular
-# classes, with evenly split datasets (100 for each)
+# classes, with evenly split datasets (100 for each); this increases the F-Measure significantly, along with a ~10% improvement in accuracy
 # N-Grams don't seem to help us much here (or maybe my implementation is wrong!); using bigrams increases
 # accuracy, while decreasing the F-Measure slightly.
 stack exec example-naivebayes-doc-classifier -- --verbose
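
The --top-ten balancing described in the diff amounts to: pick the 10 most frequent classes, then cap each class at 100 documents so all of them contribute equally. A minimal Haskell sketch of that idea, assuming labeled documents are plain (class, tokens) pairs; Document and balanceTopN are hypothetical names for illustration, not the repository's actual API:

import Data.List (group, sort, sortBy)
import Data.Ord (Down (..), comparing)

-- Hypothetical labeled document: (class, tokens).
type Document = (String, [String])

-- Keep only the n most frequent classes and cap each at k documents,
-- so every surviving class contributes the same number of samples.
balanceTopN :: Int -> Int -> [Document] -> [Document]
balanceTopN n k docs = concatMap (take k . docsOf) topClasses
  where
    topClasses = take n . map head
               . sortBy (comparing (Down . length))
               . group . sort $ map fst docs
    docsOf c   = filter ((== c) . fst) docs

Calling balanceTopN 10 100 docs mirrors the even 100-per-class split the option describes, which is what removes the bias towards dominant classes like earn and acq.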
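
The bigram remark refers to the standard n-gram construction: pair each token with its successor and treat the pair as a single feature. Again a hedged sketch of the general technique, not the repository's implementation:

-- Pair each token with its successor, e.g.
-- bigrams ["new", "york", "times"] == ["new york", "york times"]
bigrams :: [String] -> [String]
bigrams ts = zipWith (\a b -> a ++ " " ++ b) ts (drop 1 ts)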