Hacker News | new | past | comments | ask | show | jobs | submit | MLnick's comments

For me at least, sparse vector support means you can do elementwise operations (on the non-sparse elements) and in particular linear algebra like vector dot-products and matrix-vector multiply.
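To make that concrete, here is a minimal sketch using plain Python dicts as a hypothetical `{index: value}` sparse representation (real libraries use formats like CSR/CSC; the function names here are made up for illustration):

```python
def sparse_dot(a, b):
    """Dot product of two sparse vectors, touching only indices
    that are non-zero in both."""
    # Iterate over the smaller vector for efficiency.
    if len(a) > len(b):
        a, b = b, a
    return sum(v * b[i] for i, v in a.items() if i in b)

def sparse_matvec(rows, x):
    """Multiply a sparse matrix (a list of sparse row vectors) by x."""
    return [sparse_dot(row, x) for row in rows]

x = {0: 1.0, 3: 2.0}
y = {3: 4.0, 7: 5.0}
print(sparse_dot(x, y))  # only index 3 overlaps: 2.0 * 4.0 = 8.0
```

The point is that cost scales with the number of non-zeros, not the nominal dimensionality.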


You may want to check out the Yahoo Cloud Serving Benchmark; it's a pretty standard load-testing tool for this kind of thing. https://github.com/brianfrankcooper/YCSB/wiki


I have been thinking for a while now about applications of bandits to financial markets (not so much the Q-learning and TD-learning approaches, as I am a bit less familiar with them). I will definitely have a detailed look at the thesis, sounds very interesting!
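For illustration, the simplest bandit algorithm is epsilon-greedy; here is a toy sketch (the arm payoffs are invented, and this is of course far simpler than anything you'd point at a real market):

```python
import random

def epsilon_greedy(payoff, n_arms, steps=1000, eps=0.1, seed=0):
    """Epsilon-greedy bandit: with probability eps pick a random arm
    (explore), otherwise pick the arm with the best running average
    reward so far (exploit)."""
    rnd = random.Random(seed)
    counts = [0] * n_arms
    values = [0.0] * n_arms   # running mean reward per arm
    total = 0.0
    for _ in range(steps):
        if rnd.random() < eps:
            arm = rnd.randrange(n_arms)                       # explore
        else:
            arm = max(range(n_arms), key=values.__getitem__)  # exploit
        r = payoff(arm, rnd)
        counts[arm] += 1
        values[arm] += (r - values[arm]) / counts[arm]  # incremental mean
        total += r
    return values, total

# Toy "market" with three arms; arm 2 pays best on average.
payoff = lambda arm, rnd: rnd.gauss([0.0, 0.5, 1.0][arm], 1.0)
values, total = epsilon_greedy(payoff, 3)
```

After enough steps the running means should rank arm 2 highest, and most pulls concentrate there.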


+1 for Langford. He and many others (e.g. Deepak Agarwal) at Y! are among the most prolific publishers on this topic. Check out http://hunch.net/~exploration_learning/ for a good, if pretty technical, overview.


Also worth looking at for linear SVMs:

Sofia-ml, a very fast C++ package for linear SVMs and classification. It supports Pegasos as well as logistic regression, and can also learn rankings. It has no bindings for other languages, which is a bit of a downside, but it's still a useful command-line tool.

http://code.google.com/p/sofia-ml/

It also includes a package for very fast mini-batch k-means (http://code.google.com/p/sofia-ml/wiki/SofiaKMeans). By combining these two approaches, one can effectively learn a "kernelized" model while still being linear and therefore very fast (at least that's the claim; I haven't tried it).
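A rough sketch of the idea as I understand it (my own illustration, not sofia-ml's actual pipeline): run k-means to get centers, then replace each point by its RBF similarities to those centers and train a linear model on that. Here the centers are just random samples to keep the sketch self-contained:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: two Gaussian blobs (a stand-in for a real dataset).
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(4, 1, (50, 2))])

# Step 1: pick k "centers". Here they're random samples; in practice
# mini-batch k-means would find them.
k = 10
centers = X[rng.choice(len(X), k, replace=False)]

# Step 2: map each point to its RBF similarities to the centers.
# A linear model on these features approximates a kernelized model.
gamma = 0.5
dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
features = np.exp(-gamma * dists)   # shape (n_samples, k)

print(features.shape)  # (100, 10)
```

The training step on `features` is then just an ordinary linear SVM or logistic regression, so the speed advantage is kept.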

I've used both the SVM and k-means packages and they work very well. For sparse datasets with >500 dimensions and >10 million rows, file I/O time was <15 sec and training time <3 sec. K-means is slower, but still orders of magnitude faster than standard batch k-means.

Finally, Vowpal Wabbit is a very fast package that also uses stochastic gradient descent as its workhorse. It also has a nice feature-hashing compression scheme that is being widely adopted (e.g. in Mahout, and in sofia-ml above).

https://github.com/JohnLangford/vowpal_wabbit/wiki
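The hashing trick itself fits in a few lines; this is an illustrative sketch, not VW's actual implementation (VW uses a different hash function, and the signed variant shown here is one common way to reduce collision bias):

```python
import zlib

def hash_features(tokens, n_buckets=2 ** 18):
    """Hashing trick: map raw feature names straight to a fixed-size
    sparse vector, with no dictionary to store or look up."""
    vec = {}
    for tok in tokens:
        h = zlib.crc32(tok.encode())             # any stable hash works
        idx = h % n_buckets                      # bucket index
        sign = 1 if (h >> 31) & 1 == 0 else -1   # signed to reduce bias
        vec[idx] = vec.get(idx, 0) + sign
    return vec

v = hash_features(["the", "quick", "fox", "the"])
```

Collisions just add features together, which in high dimensions hurts accuracy surprisingly little, and memory use is bounded by `n_buckets` regardless of vocabulary size.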


Congratulations! I was wondering why Bradford's posts were suddenly becoming so focused on the news and personalisation! :)


I'm curious as to what libraries are available for linear algebra and numerical computation (free / open source ones) and how they compare to e.g. Numpy, mpj or colt on java, etc?


I wonder what results a study into the differences in brain function (e.g. fMRI scans) between recalling normal memories and "fake" memories, perhaps using machine learning, might turn up. I wonder if such techniques could distinguish between real memories and fake repressed memories...

They've had success with "mind reading" already: http://www.scientificamerican.com/article.cfm?id=the-mechani...


I'm pretty sure that there is no actual difference. Reconstruction is how memories work. To find the difference you have to look at the content of the memories.


Live at Rojan Club, Shanghai, China... and Liverpool University. Those are all-time faves!


I think Mark Shuttleworth's Thawte (bought by VeriSign) is the only real example of a startup, in the sense most people on HN would think of it, at least at scale.

It's also worth noting he started Canonical which gave us Ubuntu.

But in South Africa at least, there are plenty of startups in software and the Internet, again at a much smaller scale than in the US or Europe. Few people even in SA know about them, so I wouldn't expect pg to leap in and fund a bunch. Africa needs to develop its own pgs. I actually think the "micro-angel-seed-VC" approach is a good fit for the funding problem.

