# sibe

A simple machine learning library.

A simple neural network:

```haskell
module Main where

import Sibe
import Numeric.LinearAlgebra
import Data.List

main :: IO ()
main = do
  let learning_rate = 0.5
      (iterations, epochs) = (2, 1000)
      a = (logistic, logistic') -- activation function and its derivative
      rnetwork = randomNetwork 0 2 [(8, a)] (1, a) -- two inputs, a single hidden layer of 8 nodes, one output

      inputs = [vector [0, 1], vector [1, 0], vector [1, 1], vector [0, 0]] -- training dataset
      labels = [vector [1], vector [1], vector [0], vector [0]] -- training labels

      -- initial cost, computed with the crossEntropy cost function
      initial_cost = zipWith crossEntropy (map (`forward` rnetwork) inputs) labels

      -- train the network
      network = session inputs rnetwork labels learning_rate (iterations, epochs)

      -- run inputs through the trained network
      -- note: we evaluate on the training examples here only to demonstrate the API;
      --       in practice, test on data the network has not seen
      results = map (`forward` network) inputs

      -- compute the new cost
      cost = zipWith crossEntropy (map (`forward` network) inputs) labels

  -- a do block cannot end in a let binding, so print what we computed
  putStrLn $ "initial cost: " ++ show initial_cost
  putStrLn $ "cost after training: " ++ show cost
  putStrLn $ "results: " ++ show results
```
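
To turn the trained network's raw output into a class label, you can threshold the output node. Below is a minimal, hypothetical sketch (the `Network` type name and hmatrix's `atIndex` are assumptions; `forward` is used exactly as above):

```haskell
-- hypothetical helper: threshold the single output node at 0.5
classify :: Network -> Vector Double -> Int
classify net input
  | forward input net `atIndex` 0 > 0.5 = 1
  | otherwise                           = 0

-- usage: classify network (vector [0, 1])
-- should yield 1 once the network has learned the XOR mapping above
```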

See other examples:

```bash
# The simplest case: a neural network learning XOR
stack exec example-xor

# Naive Bayes document classifier using the Reuters dataset; achieves ~62% accuracy
# with Porter stemming, stopword elimination, and a few custom techniques.
# The dataset is imbalanced, which biases the classifier towards some classes (earn, acq, ...).
# To work around the imbalance, the --top-ten option classifies only the ten most popular
# classes, with an evenly split dataset (100 documents per class).
# N-grams don't seem to help much here (or maybe my implementation is wrong!): using bigrams
# increases accuracy while slightly decreasing the F-measure.
stack exec example-naivebayes-doc-classifier -- --verbose
stack exec example-naivebayes-doc-classifier -- --verbose --top-ten
```
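
The F-measure reported by the classifier is the harmonic mean of precision and recall. For reference, a minimal sketch of the formula (not necessarily the library's own implementation):

```haskell
-- F1 score: the harmonic mean of precision and recall
fMeasure :: Double -> Double -> Double
fMeasure precision recall
  | precision + recall == 0 = 0
  | otherwise               = 2 * precision * recall / (precision + recall)
```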