Posted on May 5, 2011

Machine Learning cheat sheet

For a recently taken course in Machine Learning, a substantial part involved learning and applying linear classifiers and clustering algorithms on smaller data sets. In order to summarise the most important material, I created a cheat sheet in LaTeX. I figured someone else might appreciate it as well, so why not make it available for more people than myself?

cheat sheet preview

.pdf can be downloaded here.
.tex-file is on Github here; feel free to modify or add information. Please let me know if you find mistakes!

Note that this document was really only created for my own study purposes, and hence might be of limited use for others. Hopefully not, though.

EDIT: Discussion on Hacker News: http://news.ycombinator.com/item?id=2515612

Posted on Apr 12, 2011

Implementing durability for in-memory databases, on SSDs

As the examination for a recently completed course in Database Systems Implementation, students had to implement a durable, high-throughput, in-memory key/value database for strings, coincidentally the same problem as this year’s SIGMOD programming contest. I thought I’d present aspects of my own implementation of durability, focusing on problems I encountered and how I solved them. I also relate parts of my solution to existing NoSQL databases, and discuss how SSDs can reduce the sophistication needed for a good-enough solution.

Perhaps the main problem of the programming contest lies in implementing the in-memory data structure itself, with issues such as concurrency control and efficient string comparisons. I will, however, assume such an arbitrary data structure, and focus this post on implementing durability.

Disclaimer
This is by no means efficient, beautiful or advanced, and ideas should probably not be copied without first dissecting them with experienced eyes. I learned as I proceeded, and I had essentially no prior knowledge of databases before this course and assignment. This post is written for inexperienced enthusiasts like myself; for others there is probably little new.

Posted on Oct 21, 2010

Dumb pipes — a (young) developer’s perspective

The openness of the internet is no doubt the major contributing factor for its success. Through the simplicity of dumb pipes, and with timeless principles such as the end-to-end principle and network neutrality, the internet grew and became what it is today: relevant for everyone, everywhere.

At least this is my opinion. Others, mainly telecom companies, argue that a more regulated and “smart” internet is necessary for its future expansion. Through “smart pipes” a better internet can be constructed, where the idea seems to be that intelligent wiring allows for more features and markets to be exploited, mainly by telecoms.

However, I believe there are important issues to be raised, and relevant arguments to be heard, in favor of the pipes remaining dumb.

First and foremost, the fact that no smart pipes were needed to build today’s state of the art should be a strong argument. On the other hand, the established internet is different today than it was during its build-up phase. The question, then, is whether future expansion demands deep-packet inspection and other such technologies.

This question can only be answered with business cases detailing important end-user problems that would be solved, should these technologies be implemented. Examples where smart pipes are falsely claimed to solve problems are spam protection (already solved for end-customers; I haven’t seen spam in 5+ years), location (already solved by GPS, or location-databases such as Skyhook) or streaming (my YouTube works fine, thank you very much).

Regarding tiered pricing, service-specific connectivity or any such regulatory “innovation”, I’m afraid they will produce walled gardens that are, at best, only slightly cheaper for the end-customer — a discount not worth paying for.

In the future, three scenarios may emerge. Hopefully smart pipes will be used for solving some problem in need of real innovation. Alternatively, they will simply produce software-services by companies whose core-business is not software — harmless but useless. Finally, and most dangerously, they might contribute to a radical change in end-user connectivity. This scenario is frightening because there is nothing I, as a developer, dislike more than walled gardens.

The vast majority of the creative and profitable internet is the product of young programmers’ ideas, and therefore the industry’s continued growth is largely dependent on these people remaining innovative. Dumb openness is the way to go — because creativity does not thrive within walls.

Posted on Sep 14, 2010

That big magical goal

There is a fine concept called “fuck you money” (which proves that it comes from America) that essentially means an amount of money large enough to let you decide completely how you live. It’s enough that you can say “fuck you” to anyone, anywhere, because your wealth no longer depends on anyone else.

While those amounts of money are rather hard to make, there is a much finer achievement of equal or greater significance that anyone can reach. Call it the “fuck you goal”, or simply that big magical goal.

You know it by certain characteristics, and it’s completely personal; only you yourself know what it might be. When you reach it, though, it will define you from there on — because you will have proven something to yourself that no one can take away from you. You will then, only then, realize your full capacity and further potential.

Reaching that goal is, however, really really hard. Today’s modern society, especially in Sweden with all its social safety nets, has almost completely removed the requirement for effort needed to accomplish anything. There are government agencies who find jobs for you. You apply to university through at most three clicks — no letters of recommendation needed, no interviews or even a letter of your own. And you most definitely don’t need to kill your own food each day.

While fantastic from a welfare state point of view, the complete removal of effort from everyday life and the simplicity of survival leaves many people disillusioned and merely half alive. When you don’t have to do much but still have everything you need to go about your day, you never really get to know your human capacity.

Philippe Petit, known as the “Man on Wire”, says it best:

“I am not advocating to live in danger, but at the same time, to force birds to carry a leash is to kill the idea of what a bird is.”

That big magical goal is exactly what will remind you of what you can do. It will create an inner calm, with an added assurance that whatever you do thereafter doesn’t really matter. Not so much because your achievement can’t be topped (on the contrary — it will enable you to go much farther), but because you already hit the upper plateau. Once you’ve reached it, it takes huge effort to fall back down again. And once you’re up there, you no longer have the stress of proving yourself to anybody, because you did something of real meaning — to yourself. That’s worth more than fuck you money.

Posted on Jan 31, 2010

White space time management

The naïve approach to graphic design is usually to include as much information as possible on a given canvas. White space is nothing more than unused space, ready to be filled with more graphics and copy. However, this is obviously a bad strategy since all it does is confuse the viewer and obscure the message. This is why Google became king of search, and why Apple keeps being awesome.

Similarly, the naïve time management strategy is to fit as much work and as many meetings as possible into a schedule. The person with the busiest calendar is clearly very good at managing his time and responsibilities. Or is he?

There is a difference between being busy following a schedule and being busy solving a problem. This is essentially the same as the difference between a design busy presenting information and a design conveying a message.

Strangely, the concept of simplicity is never the natural state but always the result of carefully considered choices. Just as a design process should be about removing clutter until the bare essence is left, time management should be about removing appointments. Not adding new ones where they fit.

White space is a powerful element, necessary for creating dynamics between the essential components of a design. In the same way, free time is essential for effective time management. Leisure time works in your favor when alternated with creative problem solving. Mindlessly adding more work is thus nothing else than adding clutter.

True productiveness is just as much a product of free time, as of hard work. This is why the most creative people always tend to have hobbies, read books, write and travel — while the mediocre majority complain about being busy.

Posted on Nov 30, 2009

The Game 3.0

Yet another game is now complete and over. I’d have to say this was one of the better ones we’ve made! It’s always very difficult to judge the quality and difficulty of tasks beforehand, but based on player response the conclusion is that it was successful. I thought I’d share some of the problems that were in the game. For more about the game Fredrik and I make, see The Game and The Game 2.0.

Posted on Nov 19, 2009

The Game 2.0

Last week I went back to the UK with Fredrik, for another game. There turned out to be fewer participants, but we did our best to create some interesting problems anyway. Here are two of them.

Stage 1, problem 6 — Retrieve me treasure…

…or reel th’ plank!
map
Clues:
Yer booty is t’ be found on “Indoor Island”.
Me treasure is me weap’n!

The key to solving this task is to realize that the map does not depict some South Sea island, but instead a location at the venue where the game took place. To help find the right place, some clues were given: the compass shows the relative orientation and “Indoor Island” indicates that it’s not outside. Through the event’s website, this seating map for one particular floor can be found:

The basic characteristics are similar, and the actual interior design corresponds to objects in the map. For example, “West Port” and “East Pier” indicate entrances, and “Underground Gorge” is the escalator from the floor below. If the participants went to the “X”, they found a poem containing a reference to the password: the poor pirate’s gold-plated dagger.

Stage 2, problem 5 — Jazz

U+100AA

o hrhi ng odvm xutz grfwor nideiyy jiuz’x oxrivhi vm colxh phse, rzh cxxhrbw ux to cartxx crdiq. bl tymx jty tigi ux sujf lnok fvxx gagt yq lnw rojf xux ulu ieef coixh, ctod r tmta vrzoi shx lzhmaz zof xsaz cikt e fbtgcq hexgm. yq qhlz hrhi yhukvp yc tz ae grstsicuee lqy ktvbnmh wdmtazeeurt ekamqw ngj syuzrkkd re lr yuuep autz a xdsgxyqlq xubtg r dsfx os rzh uhc rri xux yuexmtaz wre ycht tyq wptxcvxc pkkakqh tkgsj. m rrp coixh, ztzeiuey potyayg ukies vrtr, wyqvr iuoi slblzs, sdinmnies hexgmj xmxx gii, pvvyzeu rsemaikayfee asayg…eokv flnm gsyqr, stttrexvv lixgvr zriuurt muwrdh ubs tydshzn tyq ezhxpyayf mxeve.

The easiest method to solve this task was to identify the string “U+100AA” as a Unicode code point for the Linear B ideogram “Garment.” By intuition or by evaluating alternative encryption methods, the larger paragraph of text was found to be a Vigenère cipher. “Garment” was thus the keyword. Once deciphered and Googled, the text could be identified as an excerpt from Fitzgerald’s The Great Gatsby. The really literate participants would, of course, recognize the text immediately, and also that “Jazz” refers to the Jazz Age, the period in American history in which The Great Gatsby takes place.
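For anyone who wants to check the solution, here is a minimal sketch of the deciphering step in Python. It assumes the usual convention that spaces and punctuation pass through unchanged and do not consume key letters; the function name is just one I picked for illustration.

```python
def vigenere_decrypt(ciphertext: str, key: str) -> str:
    """Decrypt a Vigenère cipher, leaving non-letters untouched."""
    # Turn each key letter into a shift: a=0, b=1, ..., z=25.
    shifts = [ord(k) - ord("a") for k in key.lower()]
    plain = []
    used = 0  # number of key letters consumed so far
    for ch in ciphertext:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            shift = shifts[used % len(shifts)]
            plain.append(chr((ord(ch) - base - shift) % 26 + base))
            used += 1
        else:
            # Spaces, apostrophes and so on are copied as-is.
            plain.append(ch)
    return "".join(plain)


# The first few words of the puzzle text, with the keyword from the clue.
print(vigenere_decrypt("o hrhi ng odvm xutz grfwor", "garment"))
# prints: i have an idea that gatsby
```

Feeding in the whole paragraph with the same key should give the rest of the excerpt.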

All in all it was lots of fun creating and interacting with the competing teams! Next game is in only 2 weeks, something I’m greatly looking forward to.

Posted on Oct 8, 2009

An idea denied

Why are successful people turned into instruments of unique opportunity? Why are their stories told as circumstances of chance?

There are many examples of this kind of mystification. Isaac Newton suddenly discovered gravity when he observed an apple fall from a tree. A composer found inspiration and then wrote an amazing symphony. An entrepreneur had a brilliant idea and made a fortune through his business.

In countless stories such as these, the triggering factor behind a significant event is always made out to be external, and thus beyond the person’s influence.

But this is not reality: authors write nine to five regardless of any “inspiration”, they rewrite the same passage countless times and they most often produce their best work when they are old. Newton didn’t discover gravity due to some falling apple; he had studied physics, mathematics and astronomy more than most people ever did.

I dare say that the real factor to success is always, and has always been, damn hard work and many failed attempts.

This is, however, a dangerous idea. People go to extreme lengths to deny it. For example the woman playing the lottery, believing it is the only way for her to become rich. Or the failed author blaming his empty pages on not having found the right inspiration — “yet.”

There is really no excuse for not having the capacity to achieve one’s own success. No talent? Talent is not nearly as important as the time you spend with a task, as described by Malcolm Gladwell in his Outliers (in which he also describes the importance of opportunity). No money? University is free, study loans are accessible for anyone and it doesn’t cost anything to start a company. Not “ready”? Get over it.

Why is the idea that anyone can become anything through one’s own ambition not embraced, but instead denied? Because it means complete responsibility for oneself, and the unlimited possibilities are more easily shrugged off than capitalized on. For some reason people are much more likely to confine themselves to a state where they believe they can’t affect their own destiny than to embrace freedom. But still everyone knows that they could do anything, if they only tried. The guilt reminds them of this.

This is why successful people are mystified. Admitting that the causes of success were indeed equally accessible to everybody would make the guilt unbearable. So the choice is simple: deny the possibilities, stop thinking about it and blame everything on lucky circumstances.

Posted on Sep 30, 2009

Do stuff

Here’s a graph illustrating a model of the relationship between doing a lot of stuff, and the amount of fun it results in:

Productivity graph

With stuff I refer to anything productive and rewarding, such as a course at school, a qualified job or a project of your own. And with fun I refer to the feeling of satisfaction and purpose that is the result of doing meaningful things.

So, what can this model tell us?

  • Do more, create something, engage in productiveness and a great feeling will follow.
  • Eliminate wasted time to give room for more personal projects, sports or arts.
  • Being overworked removes all the joy from what you’re doing.
  • Make sure to find out your personal “maximum workload constant”, to know the feeling of when there are simply too many things going on. You’ll never want to end up getting burned again.
  • Remember where your limit is, and carefully balance your workload to stay just below the threshold.
  • The “maximum workload constant” is not constant: it can be extended to allow for an increased capacity.
  • Having too little to do is far better than being overworked.
  • When you have “almost too much to do” it’s really just the right amount of work!
  • If you’re engaged in stuff you like, and you are filling your time with it, you’ll hopefully experience flow.

Or as the Ruby hacker _why said:

when you don’t create things, you become defined by your tastes rather than ability. your tastes only narrow & exclude people. so create.

After all, it’s a thousand times more interesting to talk to someone who fills his time with interesting work and projects of his own than to someone completely defined by his music taste or belief. Experiences come through interaction with the real world, and they don’t create themselves: they need to be obtained through hard work. And a few leaps of faith.

Posted on Aug 13, 2009

The Game

Twice a year Fredrik and I create The Dreamhack Game (DHG), at the Dreamhack computer festival. Earlier this summer we got an email from the Multiplay staff, who arrange the UK’s largest LAN parties, inviting us over to create what is now the i-Hunt. Apparently they knew of what we do in Sweden, and liked it enough to fly us over to their own event. Quite cool indeed, and of course we made the most out of it. I’d say my first international “business” trip was a success.

The game is advertised as a “contest of intellect, lateral thinking and logical skill”. No special skills or knowledge are required, only the ability to figure out what to do and how to obtain necessary and relevant information. The tasks usually include elements of code breaking, alternate reality gaming, geocaching, deciphering, treasure hunting and various puzzles. Advanced Google skills are fundamental to solving the game, and so are endurance and thoroughness. Once a problem is solved, you move on to the next level. You usually compete in teams, but you can never know the current position of your competitors. This makes it a competition of intelligence cloaked in mystery and with a touch of psychological warfare.

There were a few tasks we made for this event that I’m a little extra proud of. I’ll explain them here, as they give a very good picture of what The Game is really about. But if the reader feels like giving the game – and these problems in particular – a try, then head over to the i-Hunt website (which will be up until November 09), and register to play. An answer sheet is also available if you get stuck and just want to try the next problem.

Stage 1, problem 5 — The Shameful One

An outbound coast
Surrounds or embraces? The city
Hamilton’s (NZ) antipode.
The Shameful One

There are two clues here – the haiku poem and the maze image. Each can give the answer on its own, but they also work in combination with one another. The line traced through the maze when solved can be identified by the hunter as the south coast of Spain and part of Portugal. This is what the poem refers to as “An outbound coast”. Furthermore, the location of the square dot indicates “The city, Hamilton’s (NZ) antipode” – which is the city of Córdoba in Spain. An antipode is the point on the exact opposite side of the globe, a rare property for two cities, which Córdoba and Hamilton in New Zealand share. Córdoba is thus the password.
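The antipode claim is easy to check with a little arithmetic: negate the latitude and shift the longitude by 180 degrees. Here is a rough sketch in Python; the coordinates are approximate values I have rounded to two decimals, so treat them as assumptions rather than survey data.

```python
def antipode(lat: float, lon: float) -> tuple[float, float]:
    """Return the point diametrically opposite (lat, lon), in degrees."""
    # Flip the hemisphere, then move halfway around the globe,
    # keeping the longitude within -180..180.
    opposite_lon = lon - 180.0 if lon > 0 else lon + 180.0
    return (-lat, opposite_lon)


# Approximate, rounded coordinates (assumed for illustration).
hamilton_nz = (-37.79, 175.28)
cordoba_es = (37.89, -4.78)

lat, lon = antipode(*hamilton_nz)
print(f"Antipode of Hamilton: {lat:.2f}, {lon:.2f}")
print(f"Cordoba:              {cordoba_es[0]:.2f}, {cordoba_es[1]:.2f}")
```

With these inputs the two points should land within a fraction of a degree of each other, which is about as close as two cities get to being antipodes.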

Stage 2, problem 3 — Who’s coming to visit today?

Have you been paying attention to the local tv-station?

In this problem, the contestants had to realize that “the local tv-station” referred to the event-specific daily YouTube broadcasts, mainly the Saturday one found here. The careful watcher will notice the announcement of a new sponsor – Tentacle Technology. However, this company is not to be found on the internet, nor did they actually show up at the venue. Instead, if the URL http://tentacletechnology.com was thought of and followed, a really fancy website was found. The password “puppet” could be found by downloading the latest press release.

Smell something fishy here? That’s because Fredrik and I made it all up; in two hours we had set up the company website, got fake sponsorship deals, marketed ourselves, staged a 31-year-old corporate history, stolen product descriptions from IBM and even put together a catchy mission statement. This kind of ARG-inspired problem is one of my favourites.

Stage 2, problem 5 — Think inside the box

Cards

In this last problem, the first realization to make is that the cards are actually not part of some unspecified card game. Instead it’s a sudoku, and when solved the numbers revealed in order are 174143214192. For a hunter, this is immediately identified as the IP address 174.143.214.192, and one of the first things to do with an IP is to HTTP it. When done so, a simple website containing the following image was found:

ihunt

Again, the experienced contestant would identify the erect and fallen cans as Morse code. When deciphered, the final password was “wey”.
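The two decoding steps in this problem can be sketched in a few lines of Python. This is only an illustration: the function names are made up, and since I haven’t said above which kind of can stood for a dot and which for a dash, the Morse input is simply written with the usual `.` and `-` symbols.

```python
from itertools import product

# International Morse code for the letters a-z.
MORSE = {
    ".-": "a", "-...": "b", "-.-.": "c", "-..": "d", ".": "e",
    "..-.": "f", "--.": "g", "....": "h", "..": "i", ".---": "j",
    "-.-": "k", ".-..": "l", "--": "m", "-.": "n", "---": "o",
    ".--.": "p", "--.-": "q", ".-.": "r", "...": "s", "-": "t",
    "..-": "u", "...-": "v", ".--": "w", "-..-": "x", "-.--": "y",
    "--..": "z",
}


def split_ipv4(digits: str) -> list[str]:
    """List every way to read a digit string as a dotted IPv4 address."""
    results = []
    # Try all combinations of octet lengths (1 to 3 digits each).
    for lengths in product((1, 2, 3), repeat=4):
        if sum(lengths) != len(digits):
            continue
        parts, pos = [], 0
        for n in lengths:
            parts.append(digits[pos:pos + n])
            pos += n
        # Each octet must be 0..255 with no leading zero (except "0").
        if all(int(p) <= 255 and (p == "0" or p[0] != "0") for p in parts):
            results.append(".".join(parts))
    return results


def decode_morse(code: str) -> str:
    """Decode Morse letters written with '.' and '-', separated by spaces."""
    return "".join(MORSE[symbol] for symbol in code.split())


print(split_ipv4("174143214192"))  # ['174.143.214.192']
print(decode_morse(".-- . -.--"))  # wey
```

A twelve-digit string leaves only one possible split, which is part of why the number reads as an IP address at a glance.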

* * *

At Dreamhack we attract ~600 players, and at the i-Hunt we managed to get 200 registered players, which should be considered good since it was our first game in the UK. However, it looks like I’ll be going back there soon, as I’ll probably be arranging the i-Hunt three times a year in total – which feels great! I’m looking forward to getting to know Britain more, as well as to creating and further evolving The Game.

Posted on Aug 4, 2009

Art repeats itself

I spent this last Saturday at the Louisiana Museum of Modern Art, in Humlebæk, Denmark. It has to be one of my favourite places for those one-day excursions, this time with the very interesting and well-presented current exhibition “Green Architecture for the Future”. Apart from looking at society’s current sustainability trend with new eyes, I once again started pondering what art really is, or what it’s supposed to be.

The way I see it, I “understand” art with the help of three axioms. One claims that everything evolves and has to evolve, the second holds as truth that the human eye is naturally attracted to beauty while the last says that humans are not attracted to something similar to what they have seen before. Also I take for granted that there is nothing more beautiful than something “natural”, be it a human face or a landscape.

Taking off from the second axiom of beauty, I believe that the first forms of art all tried to convey something aesthetic. Every possible effort was made to create artifacts of beauty: artists strove to encapsulate what the human eye was attracted to.

But when I visited Rome and saw the perfected works of Michelangelo and Bernini, I found myself asking: “what other offspring of nature is left to depict as perfectly as this?” Somehow it felt as if art – that imitated nature – simply could not be rendered much better than what was achieved back then in the 16th and 17th centuries. With regard to my first axiom – what were artists supposed to do? Striving for a lifetime to master techniques already perfected would contradict my last axiom – they would only produce something already seen.

My own theory is thus that the modern art that can be seen today (for example at Louisiana), which ranges from the simplest dots and lines to weird and unique styles, is the natural next step after realistic and beautiful art was perfected. New styles that had never been seen had to be invented, new emotions had to be triggered. Where yesterday’s art was a contest of talent and beauty, today it’s more about eccentricity and provocation.

The same trend can be observed within digital art. When software and hardware began to allow an artist to be creative in front of the computer screen, the first benchmarks for “great” digital art were realistic depictions of reality. An example is the rise of as-realistic-as-possible special effects in movies. Some years ago this was achieved, and following the same evolution as “real” art, the next step was to find a more “unique” style. A good example is the human characters in a Pixar movie. While Pixar could very well create the most realistic humans for their movies, they still choose to make them look “cartoony” in their own sense.

Personally I’m hoping for a renaissance of the talented artists who produce true beauty. While much of the modern art is really cool and impressive, nothing beats a perfect – but unique! – depiction of nature.

Posted on Jul 9, 2009

So, I got a blog. Again.

I think this has to be my fifth blog. Hopefully it will be the last in a long chain of unsuccessful attempts to fit into the “blogosphere”. Because yes, dear future employer, I know you want me to have a blog.