Wednesday, September 28, 2011

Sun-jdk on ubuntu

I've been unable to login to topcoder arena for several months simply because jdk on my machine wasn't set up properly. In particular Iced tea( open-jdk) was installed and it seems to be buggy yet. I couldn't find sun-jdk on repositories that I used (mirrored on local server of college). So I went looking for ways of manually updating to sun-jdk instead of open-jdk. I am proudly presenting you this super excellent tutorial on ubuntu forms on how to do it.

It works! At least for me it did :)

PS : I've now created a launcher which opens up arena thanks to all the searching I did today. Just keep this in mind you might be better off giving the path of sun-jre's javaws directly in launcher.

Saturday, July 2, 2011

Python !

I have not published much on this coding blog for sometime primarily because I myself have not coded enough in past some months. That however, is not the only reason - the other reason is that I have tremendously reduced blogging as a whole.

Title of this post is a give away. I am starting to learn python now and I am really really excited about it. I am also learning Django , a Python based web framework very much like Ruby on Rails( except it is Ruby based) that I used last year. I hope to learn python by solving some Spoj problems.

Any tips for writing better/faster/cleaner code in python ?

Friday, February 25, 2011

Prefix sums & Fenwick trees

Today I am going to post about prefix sums & an associated topic Fenwick trees ( aka Binary Indexed Trees or BIT in short ). This is one data structure I absolutely love and just for information BIT is so powerful that I use it in almost every third problem.

In this whole post I'll use Q for the number of queries to be answered  & N to denote input size.

Problem : We are given an array of numbers, say A.  A query consists of two indices i & j we need the sum A[i] + A[i+1]...A[j] quickly. An update consists of two numbers i & val such that you want A[i] to be incremented by val.

Solution 1 : Store a 2D table - Sums. Sums[i][j] stores the sum of A[i] + .. A[j].
This takes O(N^2) space, O(N^2) pre-process time and O(1) per query. Space requirement usually makes it virtually useless. Also we would shortly consider dynamic arrays as well.  Then for every update made to array, a lot of entries (O(N^2) infact ) are to be recomputed which makes it very slow.

Observation : If we can answer a simpler version of this problem where i = 1, we an answer the general version as well. A[1] + A[2] ... A[i] is called ith prefix sum of A, which I denote with Prefix[i].  Our query is just Prefix[j] - Prefix[i-1].

Solution 2 : So create a new array for prefix sums. Compute prefix sums for all i in O(N) time total using Prefix[i] = Prefix[i-1] + A[i].And now we can answer each query in constant time . But again if array is made dynamic, for every update we may have to change O(N) prefix sums. Too bad !

Solution 3 : Partition the array A in sqrt(N) segments each of size ( N / sqrt(N) ) = sqrt(N) and store sum for each segment. Now when you query an A[i] + ... + A[j] , first break it into prefix sum queries. Say now you want to answer Prefix[k]. So keep moving over segments which are completely in [1...k] adding their sums. There is utmost 1 incomplete segment. Move over each element of this which in included in range [1..k] . There are O(sqrt N) full segments you moved over and also O(sqrt N) elements you moved over in last incomplete segment. So query time is O(sqrt N). Also update time is O(sqrt N) as we have to update 1 segment only.
Easy to code, fast enough usually.

Solution 4 BIT. Note in prefix sums we store sum of a segment at each point in prefix array. That gives a good query time ( O(1) ) but large number of segments contains a given point which makes them bad for updates. As we saw if we make segments of smaller sizes such that each point in involved in only 1 segment, we can speed up the updates on cost of queries.  We need something in middle. Every element should be in more then 1 segment but not in too many either. Atleast  we do get a feel that a good solution would have to partition input array in segments and store sum for each.  So question to ask is what size (or form) segments ?

Solution to this is instead of using a fixed size segments , use a variable scheme . Suppose you have array A given. I will construct a new array called BIT which holds some segment sums.
In particular, if i has k trailing zeroes in binary representation,
BIT[i] = A[i] + A[i -1]... A[i - 2^k + 1].

i.e it stores sum of last 2^k entries of A. This looks random for sure. But its powerful. Lets see how.

i ) Querying using BIT : As before break query in prefix queries. Say we want to answer prefix query [1...i]. Now look at BIT[i]. It contains sum of a segment of A in particular of length 2^k where there are k trailing zeroes in binary representation of i. We can add these 2^k entries in O(1) time : BIT[i] is this sum by definition. What about other entries ? Well that sum is nothing but prefix query i - 2^k which we can find recursively. Question to ask is how fast is this ? Well it takes no more then O(log N) time. How ? Every recursive calls removes the right most 1 from binary representation of i. There are atmost log(N) digits in binary of N. Hence the time.

2) Updating BIT : Almost same . If you change A[i], then what all BIT values change ? BIT[i] changes for sure. What is the next j such that segment of BIT[j] contains i ? It would be one such element st it has k trailing zeroes and j - i <= 2^k.  Also we want smallest such j so that we can iterate over all in some order.
Convince yourself that this j is nothing but i + 2^k where k is number of trailing zeroes in i. Once you understand this, you are done as now when you change BIT[j] you propagate the update further exactly like this.

Now only 2 more things and you know it all :
1) Convince yourself that you actually dont need to store A anymore. Storing BIT array is sufficient for all purposes.
2) How to find 2^k where k is number of trailing zeros ? Again convince yourself that this nothing but : i & (-i)

Now lets throw in some code here :

// Gives prefix sum of segment [1...i]

int query(int i)   
 int ans = 0;
 while( i > 0)
   ans += BIT[i];
   i -=  i & (-i);
 return ans;

// Add val to A[i].

void update(int i, int val)
 while(i <= N)
   BIT[i] += val;
   i += i & (-i);

You see ? Under 10 lines of code you have both O(log N) updates and O(log N) queries. Power of this data structure is huge  and only more thought and practice can help you understand that.  This in practice is lightening fast with such a low constant involved in that O(log N).

I should link up some problems soon. Till then happy Binary indexing :)

Tuesday, December 14, 2010

Rolling Indices

I am almost on my way back to home after attending Amritapuri regionals of ICPC. My team Proof has confirmed a berth in world finals to be held in Egypt 2011.

As promised I am going to revive this blog of mine. Though I don't have anything specific in mind, I plan to just discuss some cool tricks & problems which I find challenging. While I post stuff here, I expect all the readers to contribute too - in terms of problems and/or solutions.

I am today starting with a technique I call as Rolling Indices. 

Sample Problem :

You are given an array of numbers. You want to find if there are two numbers a,b in array such that a + b = S for some constant S.  If you don't know the technique yet, you may like to give this problem a little thought.

Naive Brute Force Solution :

Run over  three counters say i,j & check if A[i] + A[j] = S. This takes O(N^2) time & is clearly costly.

A better solution :

Sorting is cheap here compared to time we already have. But after sorting we can do a binary search. So just run over one counter say i & do a binary search for S - A[i]  in array. This takes time O ( N * log N) in sorting & same in processing. Little fast ! Fast enough yet ? No !

Rolling Indices technique :

Lets sort the array. & run on two indices say i & j ( yes these would be our rolling indices ) . We want to check as before if A[i] + A[j] = S or not. If A[i] + A[j] = S then you are really done. What if its not the case ? Maybe A[i] + A[j] > S. But pause here & ask yourself is this current value of i of any use at all ? All values of i & j you are going to consider hence after would give only a higher sum of A[i] + A[j] ( Array is sorted, remember ! ) which would not match S & so its no use to consider A[i] ever. Just forget this i & increment.

What if A[i] + A[j] < S  ? Simple. We have not yet considered all possibilities of i yet. Just increment j.

Analysis : We took O(N * log N ) time in sorting. Lets forget that & look at processing phase.  In each step we move i or j. We can't take more then 2 * N steps before we can't do anything . So processing is O(N)

This looks like an innocent & simple technique. In fact its as costly as binary search one with sorting being the bottle neck. But this was a sample problem just to introduce to concept of rolling ,  I can assure you that in "real" problems this is amazingly good. In particular sorted sequences arise naturally usually ( like prefix sums of positive numbered arrays ). Anycase tougher problems take more then N * log N time anyway so soring becomes cheap. Also this is trivially easy to implement & admits very neat & short code.

If you want to try more : try solving these :

1) Find if there are 3 indices i, j, k such that A[i] + A[j] = A[k] . Try for O(N^2).
2) You are given a rectangle with each entry being 0 or 1. Find if there is a sub matrix with sum exactly S. Try for O(N^3) algorithm.
3) If you really want an advanced problem try Hungry Bear from codechef.

I welcome any readers post any problem link or code for any problem solvable by this technique.

  • I realized only mid way , a post on prefix sums should ideally be there before this one . So give me some time & I should be posting on prefix sums soon.

Friday, August 27, 2010

String concatecation in java

So I came across a few articles claiming that string concatenation using + or (concat function is slower then using a buffered approach (StringBuffer or StringBuilder). I came across this post on the topic & ran the tests on my machine.
Results I got were something like this:
On avg concat was taking 4 sec & append was taking 1 ms. ( Number of iterations- 20000).

So I kept wondering why is it so slow really ? Is it because there is excessive copying or mutliple times space allocation ! So I modified Paul's code to add some extra testing function which you can find here at pastebin. Running the results on my & my friends machine gives very very different results:
                              Concat          Append         Alloc         Set Value
My machine          4000              2ms               800           80
Friends machine   8000              2ms               400           1200

Large differnce in concat times could possibly be explained due to speed of machines. But how one explains a fuzzy pattern ? We found later that he was on Java 1.6 & I was on java 1.5 . But then why would my setval function be so fast & his alloc be so slow ? What are the changes done behind the curtain ?

If anyone of you does the same test,(which is easy- copy the file & run it!) do log ur results & reply back to me...

Update: Also we thought about it later- I was on Linux & he was on windows. Is memory allocation dependent on OS too ?

Monday, August 9, 2010

P != NP !

Vinay Deolalikar, an extremely common & dull Indian name as such.
But that is what I would have said a few hours back on this name. What do I say now ? Well - this might well be the name of the most famous computer scientist of all times. Yes you read it right - of all times.

Why ? A guy with this name who works as a researcher  at HP labs, Palo Alto,  has claimed that he has a proof to establish that P!=NP.  What it means is that if this is true, not only he takes home 1 million $, THE most difficult & important problem of computer science would be solved .

His paper is awful 103 pages long & a preliminary version can be found here.This only brings one name to my mind- Andrew Wiles ! If you don't know why- well you don't deserve to be reading this blog !

You know what is startling ? This guy was not even a computer scientist. He was an electric engineering major from IITB only. Even his PhD was in electrical engineering. Though on his homepage, he describes himself as a mathematician.
It would be a long stick in the ass of all computer scientists of this world if an inter-disciplinary researcher solves the hardest nut.

Plus its cool that he was born & brought up in Delhi & he passed out of a university which is the nearest kin to mine.(IIT Delhi).
I am so excited.

Y0 baby. C0de monkey is jumping restlessly. :)

Tuesday, July 6, 2010

Raining Parabolas

See the problem statement here

Problem: (In abstract) You have n length strip. You at any time choose a segment of strip [x0,x1] & apply paint on it such that paint applied on ith square is a*i^2 +b*i + c for constant a, b, c given with every paint trial.
Given any segment [x0,x1] we are to find the sum of paint amounts applied to this segment .

Constraints: We need an algorithm where both query for a given segment & updating a paint attemp take atmost O(logn) time.

Hunch: My first hunch is to go for segment trees ofcourse. What is problematic here is that sum amount to be applied is not uniform across all squares. I can think of only storing with each segment its total sum. So that query becomes O(logn) only. I dont know how to update yet.