Showing posts from April, 2013

Simple statistical language detection

Given one sample news article in English, create a statistical analyzer in 30 minutes that can reliably recognize English text. Before we begin, consider that even though not everyone speaks French, almost all of us are able to recognize it. The rules of a language are visible at every level, from the very lowest up to the culture itself, so creating a simple language detector that deals with the lower levels of the language shouldn’t be too complex. Well, at least in theory.

The problem

One of the projects I’m currently working on requires me to detect whether user input is in English, to make sure that user-generated content is placed in the right category. Unluckily for me, Google has shut down its language detection service, so I had to find something on my own. After considering a couple of libraries I decided to investigate a bit: in theory, every language has very specific phonemes so I should be able to find the typical English sounds and match my incoming user conte
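The excerpt stops before the analyzer itself is described, so the following is only a minimal sketch of one way such a detector could work, not the method the post goes on to use. It assumes that two-letter sequences (bigrams) are a reasonable written stand-in for "typical sounds", builds a frequency profile from a reference text, and compares incoming text against it with cosine similarity. The names bigramProfile and similarity are hypothetical.

```javascript
// Hypothetical sketch: a character-bigram profile, assumed here as a
// written-text approximation of "typical English sounds".
// Only the ASCII letters a-z are considered; everything else is a separator.
function bigramProfile(text) {
  const letters = text.toLowerCase().replace(/[^a-z]+/g, " ").trim();
  const counts = new Map();
  let total = 0;
  for (let i = 0; i < letters.length - 1; i++) {
    const pair = letters.slice(i, i + 2);
    if (pair.includes(" ")) continue; // do not pair letters across word gaps
    counts.set(pair, (counts.get(pair) || 0) + 1);
    total++;
  }
  // Convert raw counts to relative frequencies.
  const profile = new Map();
  for (const [pair, n] of counts) profile.set(pair, n / total);
  return profile;
}

// Cosine similarity between two profiles: 0 means no shared bigrams,
// values close to 1 mean near-identical bigram distributions.
function similarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (const [pair, freq] of a) {
    normA += freq * freq;
    if (b.has(pair)) dot += freq * b.get(pair);
  }
  for (const freq of b.values()) normB += freq * freq;
  return normA && normB ? dot / Math.sqrt(normA * normB) : 0;
}
```

In use, the reference profile would be built once from the sample article, and each piece of user input would be accepted as English when its similarity to that profile exceeds a threshold. The threshold would have to be tuned on real data; none is suggested here.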

Interviewing bad practices

Over the years of looking for jobs and looking for candidates, I’ve seen and made the typical hiring mistakes. To be honest, I have probably made all of them during this learning process, so I collected the most common ones as a reminder. I hope this post, even if just a tiny bit, will help someone learn how not to interview candidates.

Library knowledge

This is probably the most common one that candidates face during interviews. Questions like “What is the difference between X class and Y class in Z framework” are totally meaningless. They can help you see whether the candidate has worked with the technology, but apart from that, every such question has its exact answer in the first hit on Google, probably a Stack Overflow thread. None of that matters. The question is: can the candidate solve a problem I throw at him/her? Library knowledge is not problem solving.

Language knowledge

Unless you try to find someone to start your new thing in a specific technology, focusing too

Auto generate help overlay for web and phone apps using SVG

Reading tutorials and help documents is not any user's favorite activity, so it's getting trendier to create semi-transparent help overlays that highlight the basic features of a website or phone app to speed up the learning process. While these overlays are usually hand drawn, they can be automated using SVG and JavaScript. As SVG allows us to draw Bezier curves, all we need to do is find a place for our help text, draw a nice curvy line from it to the target element, and add an arrowhead to the line, as if it were really pointing to that element. To create the arrowhead, it's practical to define a named marker and then reference it:

    <defs>
      <marker id="head" markerHeight="4" markerWidth="2" orient="auto" refX="0.1" refY="2">
        <path d="M0,0 V4 L2,2 Z" fill="black" id="headpoly"></path>
      </marker>
    </defs>

When drawing a straight l
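As a rough illustration of the curvy line described above, here is a minimal sketch that builds the path data for a quadratic Bezier curve between the help text and the target, then references the "head" marker through the standard marker-end attribute. The function names, the bend parameter, and the stroke styling are assumptions made for this example; the post's own drawing code is cut off in this excerpt and may differ.

```javascript
// Hypothetical helper: path data for a curved line from `from` to `to`.
// The control point sits at the midpoint, pushed sideways (perpendicular
// to the straight line) by `bend` times the line's length.
function curvedArrowPath(from, to, bend = 0.3) {
  const midX = (from.x + to.x) / 2;
  const midY = (from.y + to.y) / 2;
  const dx = to.x - from.x;
  const dy = to.y - from.y;
  const ctrlX = midX - dy * bend;
  const ctrlY = midY + dx * bend;
  return `M${from.x},${from.y} Q${ctrlX},${ctrlY} ${to.x},${to.y}`;
}

// Hypothetical helper: SVG markup for the arrow. marker-end="url(#head)"
// attaches the marker defined in <defs> to the end of the path, and
// orient="auto" on the marker rotates it to follow the curve's direction.
function arrowMarkup(from, to, bend) {
  const d = curvedArrowPath(from, to, bend);
  return `<path d="${d}" fill="none" stroke="black" stroke-width="2" marker-end="url(#head)"></path>`;
}
```

For example, curvedArrowPath({x: 0, y: 0}, {x: 100, y: 0}, 0.5) yields "M0,0 Q50,50 100,0". In a page, the start and end points could come from the bounding boxes of the help text and the target element.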

Compiling JavaScript code to detect errors early

Even though JavaScript is the most platform-independent language we have today, and despite its flexibility, one thing hasn’t really been resolved: as an interpreted language, it hides errors until runtime. As we move away from rich, compiled web client platforms like Java Applets, Flash, and Silverlight, we are filling the gap with complex JavaScript code and HTML5/CSS3 animations. That is all fine until the amount of JavaScript code in the project gets so big that it starts to be scary. Suddenly no one really wants to touch the codebase, as even renaming a variable can be painful and error prone. Development slows down and quality declines. When we talk about compiling the source to detect errors, we are not really referring to the whole compilation process, which would include lexical analysis, preprocessing, parsing, semantic analysis, code generation, and code optimization. Most likely we are only interested in the first four steps, the
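To make the distinction between these steps concrete, here is a small sketch that runs only the parsing stage on a piece of source code, using the built-in Function constructor, which parses its argument without executing it. This is an illustration chosen for this example, not the tool the post goes on to describe; the helper name findSyntaxError is hypothetical.

```javascript
// Hypothetical helper: returns null if `source` parses, otherwise the
// syntax error message. The source is parsed as a function body, so
// module-level `import`/`export` statements would be rejected, and the
// code is never actually run.
function findSyntaxError(source) {
  try {
    new Function(source);
    return null;
  } catch (err) {
    if (err && err.name === "SyntaxError") return err.message;
    throw err; // anything else is unexpected here
  }
}

// A malformed statement is caught without running anything.
const broken = findSyntaxError("var x = ;");

// This parses fine, although calling a method on `undefined` would fail
// at runtime: parsing alone does no semantic analysis.
const hidden = findSyntaxError("let a = 1; a.foo.bar();");
```

Here broken holds an error message while hidden is null, which shows why a parse-only check is not enough: catching the second kind of mistake before runtime needs the semantic analysis step as well.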