

agibsonccc on Jan 29, 2014 | on: Mistakes Programmers Make when Starting in Machine...

1. Use the features that correlate best with the outcomes. Look into feature selection and principal component analysis for this. Smaller feature vectors carry less noise and also give more digestible results. I would also highly recommend visualization. Weka is great if you want plug and play; otherwise there's the more traditional R/MATLAB. It really depends on what you're comfortable with.
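To make "use what correlates best with the outcomes" concrete, here is a minimal sketch that ranks feature columns by absolute Pearson correlation with the outcome and keeps the top k. It is only one simple form of feature selection (it is not PCA), the helper names `pearson` and `top_k_features` are made up for illustration, and the toy data is invented.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def top_k_features(rows, outcomes, k):
    """Return the indices of the k columns with the largest |correlation| to outcomes."""
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        scores.append((abs(pearson(column, outcomes)), j))
    # Highest score first; ties broken by column index for determinism.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [j for _, j in scores[:k]]

# Invented toy data: column 0 tracks the outcome, column 1 is weakly related,
# column 2 is constant and therefore carries no information.
rows = [
    [1.0, 5.0, 3.0],
    [2.0, 1.0, 3.0],
    [3.0, 4.0, 3.0],
    [4.0, 2.0, 3.0],
]
outcomes = [1.1, 1.9, 3.2, 3.9]
selected = top_k_features(rows, outcomes, k=2)
print(selected)
```

Ranking columns one at a time like this ignores interactions between features, so treat it as a first pass rather than a full selection procedure.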

2. It depends on what kind of learning you're doing. For supervised classification with more than two classes, I would look into multinomial logistic regression for most applications. Then there's also k-means if you're looking to understand trends in your data. Keep in mind this is my off-the-shelf/simple recommendation.
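As a rough illustration of the supervised half of that recommendation, here is a minimal sketch of multinomial (softmax) logistic regression trained with plain batch gradient descent. The function names, learning rate, epoch count, and toy data are all assumptions chosen for the example; a real project would more likely reach for an existing library.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def class_scores(weights, x):
    """Linear score per class; the last weight in each row is the bias."""
    xb = list(x) + [1.0]
    return [sum(w * v for w, v in zip(wc, xb)) for wc in weights]

def train(rows, labels, n_classes, lr=0.5, epochs=2000):
    """Fit one weight vector (plus bias) per class by minimizing cross-entropy."""
    n_features = len(rows[0])
    weights = [[0.0] * (n_features + 1) for _ in range(n_classes)]
    for _ in range(epochs):
        grads = [[0.0] * (n_features + 1) for _ in range(n_classes)]
        for x, y in zip(rows, labels):
            xb = list(x) + [1.0]
            probs = softmax(class_scores(weights, x))
            for c in range(n_classes):
                # Gradient of cross-entropy w.r.t. the class score: p_c - 1[y == c]
                err = probs[c] - (1.0 if c == y else 0.0)
                for j, v in enumerate(xb):
                    grads[c][j] += err * v
        for c in range(n_classes):
            for j in range(n_features + 1):
                weights[c][j] -= lr * grads[c][j] / len(rows)
    return weights

def predict(weights, x):
    """Return the index of the highest-scoring class."""
    scores = class_scores(weights, x)
    return max(range(len(scores)), key=scores.__getitem__)

# Invented toy data: three well-separated classes in two dimensions.
rows = [[0.0, 0.0], [0.1, 0.1], [1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = [0, 0, 1, 1, 2, 2]
weights = train(rows, labels, n_classes=3)
predictions = [predict(weights, x) for x in rows]
print(predictions)
```

k-means, the unsupervised suggestion, is a separate algorithm and is not covered by this sketch.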

I would love input on a plug and play machine learning CLI. I planned on building out my current project into a full-blown command line app. Since it can handle most features, including automatic visualization/debugging via matplotlib, I figure that with some documentation it might be a neat tool for people who don't want to deal with feature selection but still want things simple. It's definitely a problem that there's really no clear way to build simple models. Domain knowledge is also an expensive problem.

metrix on Jan 29, 2014

Do you have it on a website or GitHub? I would be interested in taking a look at it.

agibsonccc on Jan 29, 2014

https://github.com/agibsonccc/java-deeplearning/

Keep in mind that documentation is one of the things I need to work on the most right now. For the most part, I have it built and ready to go.
