St. on IT: January 2010

Sunday, January 31, 2010

Algorithm Complexity Analysis On An Interview

I was digging around trying to find what kind of questions Google interviewers ask potential candidates and found some guy story who was asked the following question:

"What is the complexity of searching for existence of all characters from one string into another?"

That guy answered "n*n", and after being rejected was wondering if it was "n*lg(n)". Poor little thing...

IMHO: Rule #1 when you're having an interview (I guess, especially at Google): "never trust the interviewer", it is their job and trade to mislead you and check how you actually think, rather if you know the correct answer.

So, here is how I would answer to that question. (yeah, I'm showing off, but please consider that as me simply exercising)

Okay. First of all the complexity vastly depends on an implementation. I can see at least three of them.

1. A blunt force solution, you just go through every character from the pattern string (p) and check if each of them exist in the text string (t). Strings are essentially arrays and search of any character in them have linear complexity. So that if we have n = |p| and m = |t|, then the overall complexity of the algorithm will be

O(n * m)

2. There might be another solution. If we pre sort the t string with say quicksort algorithm, then we can use the binary search, which has logarithmic complexity and faster than a blunt cell-by-cell search. In this case, considering that the quick-sort complexity is super-linear, the overall complexity will look about like that:

O(m * lg(m) + n * lg(m))

3. One more version. Considering that both strings consist of the same limited number of symbols, we can create two hash-tables, one will contain unique characters from the p string and another one will have characters from string t. As the alphabet is a limited list, finding of the intersection of those two tables will have a constant complexity. Plus we should add the complexity of building the hash-tables, which will be equal to the lengths of the strings. And the overall complexity will be

O(n + m + C) = O(n + m)

Roughly speaking, we can take a look at the case when n = m, then we can say the following about those three algorithms:

1. O(n*n) - quadratic complexity
2. O(n*lg(n)) - super-linear complexity
3. O(n) - linear complexity

I suppose the last one is the winnar.

--

This is about it.

I still do have that feeling that I'm gonna get my ass kicked in the upcoming interview, but at least I won't give up that easily and will throw couple of punches back.

Monday, January 25, 2010

How to Move Folders Between Git Repositories

Today I needed to move some folders from one git repository to another preserving the history. Evidently it's a tricky business, so here is how do that.

First of all you need to have a clean clone of the source repository so we didn't screw the things up.

git clone git://server.com/my-repo1.git

After that you need to do some preparations on the source repository, nuking all the entries except the folder you need to move. Use the following command

git filter-branch --subdirectory-filter your_dir -- -- all

This will nuke all the other entries and their history, creating a clean git repository that contains only data and history from the directory you need. If you need to move several folders, you have to collect them in a single directory using the git mv command.

You also might need to move all your content into some directory so it didn't conflict with the new repository when you merge it. Use commands like that

mkdir new_directory/

git mv my_stuff new_directory/

Once you've done commit your changes, but don't push!

git commit -m "Collected the data I need to move"

This is all about the source repository preparations.

Now go to your destination repository

cd ../my-repo2/

And here is the trick. You need to connect your source repository as a remote using a local reference.

git remote add repo1 ../my-repo1/

After that simply fetch the remote source, create a branch and merge it with the destination repository in usual way

git fetch repo1
git branch repo1 remotes/repo1/master

git merge repo1

This is pretty much it, all your code and history were moved from one repository to another. All you need is to clean up a bit and push the changes to the server

git remote rm repo1
git branch -d repo1

git push origin master

That's all. After that you can nuke the temporary source repository.

I'm looking for a job

I'm a seasoned web-developer with almost 9 years of professional experience, looking for a position of Ruby developer/software designer.

I'm an early Rails adopter, started to work with it at the very first versions more than three years ago. Since then worked with Rails on a wide range of applications, blogging, social networks, e-commerce, real estate. I also have experience working with Merb and Sinatra, have several rails plugins and a non-rails application of my own. Before Rails I worked with Python, Groovy, did some Java and quite a lot of PHP (object oriented, no crap)

My specialty is multi-lingual, hi-load and ajax applications. I also pretty good in models, databases and workflow design. And certainly I can do some serious JavaScript work.

I will prefer a permanent position and ready to relocate (have experience working overseas), but also can work as an independent contractor/consultant, including telecommuting (have freelance experience).

If you're interested, please check out my CV, it has my contacts in it.

Sunday, January 24, 2010

Terminal Fun, or Why RightJS?

In this article I'd like to share some backstories and ideas that lead me to the development of RightJS. There are some generic texts on the official site, but in this article I'm going to get into some architectural details, thoughts and philosophy that were used as the basis in the development.

It might start to sound a bit philosophical and like RightJS saves the world, so please, be patient, I'm sure you can find some rational ideas behind.

Disclaimer

Despite the fact that I did quite a lot of JavaScript work lately, I'm not really any sort of JavaScript guru. I mostly work as a software developer/designer on the server side and my weapon of choice is Rails.

What I do with RightJS is a mere attempt to build a tool that will fit Ruby/Python developer habits and will be fun to work with.

Philosophy

That might be an arguable point of view, but to me the world is always trying to find a balance. The third low of mechanics says: "every action causes an equal and opposite reaction", and for this reason opposite forces always find a moment of equilibrium, moment where things are balanced.

For example when a body falls though air, the forces of gravity and air friction eventually gets equal and the body stops gaining speed; reaching its terminal velocity. A simple fact, but those two opposite forces combined make things like flight possible.

You might extrapolate and abstract this principle a bit further and take another two opposite things: "fun" and "safety". Abstractly speaking, I guess it is obvious that the more there is safety the less fun it is, and the more fun it is the more it gets dangerous.

Those two follow the same exact principles. And as much it might be honorable and cool to be at one side, with rebels or with the enterprise; the truth and all the magic things always lay somewhere in the middle. And that's what we are trying to do in here.

With RightJS I'm trying to reach that moment of terminal fun.

Back to The Reality

When it comes to JavaScript and safety I'm sure most of developers will think jQuery. On the other side of the scale probably will be Prototype. And I'm sure everyone who ever worked with JavaScript recently, heard lots of talks about the jQuery safe-mode, and how non conflicting it is, and how it lets you easily write plugins and so one.

But the truth is, that wast majority of the people who does those talks, never did write any plugins, never got in really non-solvable conflicting situations, and as the matter of fact they don't know anything but jQuery. They just heard that prototype extensions are dirty, and probably think that hating them makes them look tougher :).

Meanwhile, people were using Prototype for years, there are hundreds of plugins. And yes, big corporations did and still do use it. The reason why people started to switch from Prototype is because it got too fat and slow, not because it is unsafe.

It kinda reminds me the situation with airports. Someone, somewhere did created that shoe-bomb, and now we all have to stand in queues at security checks, take off our shoes and have a walk on a dirty floor.

Is it safer this way? Indeed it is. But if you are not a bad guy, would you like to skip that procedure?

I bet you do.

Between Safety and Fun

So, what about RightJS and that terminal fun thing?

Yes, RightJS does extend prototypes and we do have several global scope units and functions. That's true. But we don't do that blindly creating a crazy mess.

Most of the native unit extensions simply bring methods that already exist in later specifications. Things like String#trim, Array#map, Object.keys, Function#bind. Yes we do add them to the prototype level. But they all behave strictly in the standard way and they _will_ eventually appear in browsers.

I agreed that there might be some desperate people who had a complicated childhood and might implement a non-standard Array#each method and screw the things up. But IMHO, we should let the people learn on their mistakes and that's certainly not the reason to get scared and hide behind some "safe" interfaces. If someone wants he can screw the safe interface just the same way.

On the other side, adding those methods, will give you several advantages, like first of all you can use all those methods right now, without waiting while obsolete browsers will vanish from the face of the Internet. Secondly it will actually prolong your application life, because those methods only going to get into common use, which means your code will be mostly working in future even when RightJS disappear.

Then we do have several other top-level functions and objects. Functions like isString, isArray, isObject, $A they are extremely handy in everyday work and they are real time savers. Those methods all have clean, standard names and behavior, and you need to be a crazy person who don't know what he does, to re-implement them for some radically different purpose.

Yes we do unsafe things with RightJS, but all of them are strictly managed, organized and helpful. This way we balance between total safety and fun/comfort in work.

Deeper Down The Rabbit Hole

Any security comes with its cost, which is usually freedom and flexibility. Yes, I do agreed that having a non-conflicting interface is safe, but on the other side, it hooks your whole application on that interface. And I'm not talking about ridiculous situations, when you use the safe interface only and there are no potential threads for your application. Yes you are safe, but you're hooked to that interface.

We are living in a pretty dynamic environment, and things do get faster and faster. New superior solutions appear every year. So consider this.

Most of Prototype/Mootools code will work on RightJS after trivial and easily programmatically processable fixes. Same for RightJS, because it sticks to the standards in method names and behaviors, it will be really easy to transform RightJS code into Prototype/Mootools, pure DOM or even jQuery if you will.

In case of a safe interface on the other hand, you don't have to change anything and you won't. 99.9% of applications will stuck with it, continuing to drag it around creating a mess of frameworks and interfaces inside of a single application. Take for example Dojo. An excellent framework, but most of applications are trapped with it, because it is too expensive to migrate.

Okay, tomorrow might be a long shot and many applications don't live that long. What about today? There are also things to think about.

RightJS unlike jQuery is not just a DOM-wrapper, it is an actual framework. RightJS does not leaves you alone with pure DOM/JavaScript when you disagree with what's going on. It supports multiple paradigms of development, OOP, FP, procedural style and at any moment you're free to choose how to do your job.

Then, in similarity with languages like Ruby or Python, the whole structure of the framework is open. Unlike solutions with a safe interface RightJS doesn't hide anything inside of closed functions with spaghetti code. Most of the methods are organized in proper OOP structures and can be easily transformed and adjusted one by one.

Another point is that RightJS helps you utilize things like options, events, dom-manipulations, etc. in a single uniformed way. And that's too the part of the terminal fun idea. There are might be dozens of ways of dealing with things outside of the framework and if there are no regulations those things might cause collisions. RightJS provides you a conventional way of dealing with those things, which at one side reduces the likelihood of troubles, and on the other brings consistency across all the applications, making it easier to work with.

--

There are lots of things that RightJS takes care of, but I think it's all out of this article theme and then you can find all of them on the official site.

So for now I guess that's all. Hope I gave you some sense of perspective and food for thoughts on the topic.

Take care,
Nikolay

Friday, January 15, 2010

So, is really jQuery 1.4 30'000 times faster?

Okay, I admit that, when it comes to RightJS promotion, sometimes I do behave like a bastard and have no mercy over women and foreigners. But this jQuery 1.4 celebration is well out of my league! :)

I probably should just shut up about them publishing the release saying it weights 24k, and I've seen people today screaming "oh, it like the good old times!", and then they realize that it says in breakers "(gzipped)" and find out that the actual script weight is 70k. For a comparison, minified the same way Prototype weights 72k.

But why I will never get tired of mocking jQuery users is that performance optimization graph John posted over there http://jquery14.com/day-01/jquery-14. In case they will fix it, I've saved a copy over here.

Basically what it says is that jQuery 1.4 makes 30'000 (oh yeah baby, thirty thousands) times less calls than 1.3.2. Well to be honest in just a few operations. Seeing people making things like that I couldn't resist it and fired up my trusty Shakker, which bluntly compares frameworks performance feature by feature.

So what do we see? They did optimized it awright. Certainly not 30'000 times, more like 30% in average. At least it looks okay in FF

But when you run the test in Safari and Opera, you can see that it actually getting slower in some operations

And that's not the end. This test performs comparison on collections handling. And if you take a look inside of the new "fast" jQuery methods, you'll find out that they simply put collections processing in plain spaghetti JavaScript code. That means that it will and is working fast on collections, but when you process single elements (which is the majority of the cases), those methods will work slower.

This is about it. You can find all my tests code over here Shakker

But good luck with the 2 weeks orgy though 8)