Posted on May 5, 2011

Machine Learning cheat sheet

For a recently taken course in Machine Learning, a substantial part involved learning and applying linear classifiers and clustering algorithms on smaller data sets. In order to summarise the most important material, I created a cheat sheet in LaTeX. I figured someone else might appreciate it as well, so why not make it available for more people than myself?

[Image: cheat sheet preview]

The .pdf can be downloaded here.
The .tex file is on GitHub here; feel free to modify it or add information. Please let me know if you find mistakes!

Note that this document was really only created for my own study purposes, and hence might be of limited use to others. Hopefully not, though.

EDIT: Discussion on Hacker News: http://news.ycombinator.com/item?id=2515612

Posted on Apr 12, 2011

Implementing durability for in-memory databases, on SSDs

As the examination for a recently completed course in Database Systems Implementation, students had to implement a durable, high-throughput, in-memory key/value database for strings, coincidentally the same problem as this year’s SIGMOD programming contest. I thought I’d present aspects of my own implementation of durability, focusing on the problems I encountered and how I solved them. I also relate parts of my solution to existing NoSQL databases, and discuss how SSDs can reduce the sophistication needed for a good-enough solution.

Perhaps the main problem of the programming contest lies in implementing the in-memory data structure itself, with issues such as concurrency control and efficient string comparisons. I will, however, assume that some such data structure is already in place, and focus this post on implementing durability.

Disclaimer
This is by no means efficient, beautiful or advanced, and ideas should probably not be copied without first dissecting them with experienced eyes. I learned as I proceeded, and I had essentially no prior knowledge of databases before this course and assignment. This post is written for inexperienced enthusiasts like myself; for others there is probably little new.