feat(cleanDocuments): preprocess documents with stemming and stopword removal to improve accuracy