Tuesday, November 30, 2010

Parallel Search

Problem:
Write a Java class that allows parallel search in an array of integers. It provides the following static method:
public static int parallelSearch(int x, int[] a, int numThreads)
This method creates as many threads as specified by numThreads, divides the array a into that many parts, and gives each thread a part of the array to search sequentially for x. If any thread finds x, then the method returns an index i such that a[i] = x. Otherwise, the method returns -1.
Solution:


package org.zero.concurrent.chap01;

import java.util.Arrays;

public class ParallelSearch {
    // shared by all worker threads; volatile so that a write is visible to the caller
    private static volatile int index = -1;

    public static int parallelSearch(int search, int[] in, int numThreads) {
        // reset the shared result so an earlier call cannot leak into this one
        index = -1;
        // partition
        int partitionSize = in.length / numThreads;
        Thread[] threads = new Thread[numThreads];
        int end = 0;
        int begin = 0;
        // search
        for (int i = 0; i < numThreads; i++) {
            end = begin + partitionSize;
            if (i == numThreads - 1 || end > in.length) {
                end = in.length;
            }
            Search target = new Search(begin, end, in, search);
            System.out.println(target);
            threads[i] = new Thread(target);
            threads[i].start();
            System.out.println(threads[i].getName());
            begin = end;
        }
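        // wait for every worker to finish before reading the shared result
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return index;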
    }

    public static void main(String[] args) {
        int[] a = new int[100];
        for (int i = 0; i < a.length; i++) {
            a[i] = (int) (Math.random() * 10);
        }
        int parallelSearch = parallelSearch(2, a, 7);
        if (-1 == parallelSearch) {
            System.out.println("not found");
        } else {
            System.out.println("found at: " + parallelSearch);
        }
    }

    private static class Search implements Runnable {
        int begin;
        int end;
        int[] a;
        int x;

        public Search(int begin, int end, int[] a, int x) {
            super();
            if (end < begin) {
                throw new IllegalStateException();
            }
            this.begin = begin;
            this.end = end;
            this.a = a;
            this.x = x;
        }

        @Override
        public void run() {
            for (int i = begin; i < end; i++) {
                if (x == a[i]) {
                    index = i;
                    System.out.println(i + " "
                            + Thread.currentThread().getName());
                    // break;
                }
            }
        }

        @Override
        public String toString() {
            return "Search [a=" + Arrays.toString(a) + ", length=" + a.length
                    + ", begin=" + begin + ", end=" + end + ", x=" + x + "]";
        }

    }
}
P.S.: Any suggestions to improve the code are welcome.
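One possible direction, offered as a sketch rather than a definitive answer: let each worker return its own result through a Future instead of writing to a shared static field. The class name ParallelSearchSketch and everything inside it are my own illustration and not part of the exercise; only java.util.concurrent from the standard library is used.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelSearchSketch {

    // Hypothetical variant: every worker reports its own hit (or -1).
    public static int parallelSearch(final int x, final int[] a, int numThreads) {
        if (a.length == 0 || numThreads <= 0) {
            return -1;
        }
        // never start more workers than there are elements
        int workers = Math.min(numThreads, a.length);
        int partitionSize = a.length / workers;
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Integer>> results = new ArrayList<Future<Integer>>();
            for (int i = 0; i < workers; i++) {
                final int begin = i * partitionSize;
                // the last worker also takes the remainder
                final int end = (i == workers - 1) ? a.length : begin + partitionSize;
                results.add(pool.submit(new Callable<Integer>() {
                    @Override
                    public Integer call() {
                        for (int j = begin; j < end; j++) {
                            if (a[j] == x) {
                                return j; // stop at the first hit in this part
                            }
                        }
                        return -1;
                    }
                }));
            }
            // Future.get() blocks until the corresponding worker is done
            for (Future<Integer> result : results) {
                int found = result.get();
                if (found != -1) {
                    return found;
                }
            }
            return -1;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        int[] a = {5, 3, 8, 2, 9, 2, 7};
        System.out.println("found at: " + parallelSearch(2, a, 3));
    }
}
```

Because the caller only reads values handed back by get(), no shared mutable state is needed, and the method cannot return before the workers have finished.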

Saturday, November 27, 2010

Product Versioning: Embedding packaging information

Almost serendipitously, I discovered the Java Package class today. One might wonder how this could possibly make a difference (aka increase one's geek quotient or coolness factor). Well, the beauty lies in the details.

Problem Statement: You commit your code to CVS, and after going through the complete lifecycle your code finally sees the light of day. Then, much to your chagrin, users discover a bug, even after all that unit testing and pragmatic ranting ;) But you are the rock star developer who had already discovered the problem and fixed it :D So how do you know whether a user is still running some older version of your package?

Solution: You can embed the information in the manifest file at build time, which can be read by exploding the jar. Simple!! No: most of your users wouldn't (and shouldn't) know how to do that. It would be really nice if you could print this star-studded information at the beginning of code execution. You can make it the first line of your log file, or maybe a separate file used for bug reporting; the options are open. How do you do it?

Step 1: Create an annotation.

package org.zero;

import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

@Target(ElementType.PACKAGE)
@Retention(RetentionPolicy.RUNTIME)
public @interface MyVersionAnnotation {

    String version();

    String revision();

    String date();

    String user();

    String url();
}

Step 2: Create the file package-info.java in that package:
@MyVersionAnnotation(date = "2010-11-26", revision = "11", url = "http://onjava.com/pub/a/onjava/2004/04/21/declarative.html?page=3", user = "nitin", version = "123")
package org.zero;

Step 3: Create a class to access this information:



package org.zero;

/**
 * This class finds the package info for mypackage and the MyVersionAnnotation
 * information.
 */
public class PackageDemo {
    private static Package myPackage;
    private static MyVersionAnnotation version;

    static {
        myPackage = MyVersionAnnotation.class.getPackage();
        version = myPackage.getAnnotation(MyVersionAnnotation.class);
    }

    /**
     * Get the meta-data for the mypackage package.
     *
     * @return the Package object for mypackage
     */
    static Package getPackage() {
        return myPackage;
    }

    /**
     * Get the mypackage version.
     *
     * @return the mypackage version string, eg. "0.6.3-dev"
     */
    public static String getVersion() {
        return version != null ? version.version() : "Unknown";
    }

    /**
     * Get the subversion revision number for the root directory
     *
     * @return the revision number, eg. "451451"
     */
    public static String getRevision() {
        return version != null ? version.revision() : "Unknown";
    }

    /**
     * The date that mypackage was compiled.
     *
     * @return the compilation date in unix date format
     */
    public static String getDate() {
        return version != null ? version.date() : "Unknown";
    }

    /**
     * The user that compiled mypackage.
     *
     * @return the username of the user
     */
    public static String getUser() {
        return version != null ? version.user() : "Unknown";
    }

    /**
     * Get the subversion URL for the root mypackage directory.
     */
    public static String getUrl() {
        return version != null ? version.url() : "Unknown";
    }

    /**
     * Returns the buildVersion which includes version, revision, user and date.
     */
    public static String getBuildVersion() {
        return PackageDemo.getVersion() + " from " + PackageDemo.getRevision()
                + " by " + PackageDemo.getUser() + " on "
                + PackageDemo.getDate();
    }

    public static void main(String[] args) {
        System.out.println("mypackage " + getVersion());
        System.out.println("Subversion " + getUrl() + " -r " + getRevision());
        System.out.println("Compiled by " + getUser() + " on " + getDate());
    }
}
PS: Source code courtesy org.apache.hadoop.util.VersionInfo
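For comparison, the manifest route mentioned at the start can also be read at runtime without exploding the jar. This is a minimal sketch, assuming the build writes Implementation-Version and Implementation-Vendor entries into META-INF/MANIFEST.MF (that build step is my assumption, and the class name ManifestVersionDemo is hypothetical). When the classes are not loaded from such a jar, these Package methods return null, hence the fallback.

```java
public class ManifestVersionDemo {

    // Falls back to "Unknown", mirroring PackageDemo above.
    public static String orUnknown(String value) {
        return value != null ? value : "Unknown";
    }

    // Reads the version details recorded in the manifest of the jar
    // that the given class was loaded from, if any.
    public static String describe(Class<?> anchor) {
        Package p = anchor.getPackage();
        String version = p != null ? p.getImplementationVersion() : null;
        String vendor = p != null ? p.getImplementationVendor() : null;
        return orUnknown(version) + " by " + orUnknown(vendor);
    }

    public static void main(String[] args) {
        System.out.println("mypackage " + describe(ManifestVersionDemo.class));
    }
}
```

The annotation approach carries richer fields (revision, URL, user); the manifest approach needs no extra class but only covers what the manifest specification defines.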

Friday, September 24, 2010

Friday, July 23, 2010

Talking common sense

'Common sense is quite uncommon' is what I was told whenever I failed to cut my way through the chaos. As frustrated as I was with my failure to learn from my failures, I equally wondered whether there is a way to master the technique. Today, I wish to share what I have learned with my readers.

'Common sense is a combination of experience, training, humility, wit and intelligence'. Wow! Now that sounds like a well-balanced equation with known variables. Doesn't that excite you to take control of the participating variables and work on your weaker areas to improve your gross wisdom quotient? Let's work on that too:
  • Experience is nothing but the accumulation of knowledge or skill that results from direct participation in events or activities. Simply put, 'just do it': doing alone will help you learn things in small, digestible chunks. Fear of failure is the greatest impediment to learning. Make it your friend!! Adopt a fail-fast approach. Trust me, it always helps, and at the very least you end up enriching your experience.
  • Training refers to the acquisition of knowledge, skills, and competencies as a result of teaching. But where do I start? Look around yourself. Be relevant to your ecosystem. Learning something which is immediately useful keeps you motivated and provides avenues to 'Apply Yourself'. Remember, no one learns or appreciates swimming by reading a book!!
  • Pride eats up your brain. Being humble means taking the good and the bad in your stride and accepting them as part of life. It always helps to balance your emotions [I know I'm being preachy, and I need a lot of groundwork here.. but then whose blog is this ;) ]
  • When in trouble, use your humor! It sheds the unnecessary weight from your shoulders and helps you relax. It's very important. Enjoy every moment!
  • Intelligence is your ability to comprehend and to profit from your experience, which now looks like the product of a healthy concoction of the above ingredients.. the 'executable knowledge'!!
There you have it! The prescription to work on your common sense. Happy learning. You may now join your hands for a big round of applause... Have fun!!

Tuesday, June 15, 2010

Ride to Kudremukha

It's been so long since I last went out on a ride. Then there was this set of really enthusiastic guys who infected me with the travel bug. 'Get Off Your Ass!!' was our war cry, and we simply picked up our backpacks and hit the road straight away. Late-night driving after a really long week was proving physically taxing, so we decided to call it a day at Arasikere. Starting Saturday morning, we visited Halebidu and Belur on our way, then took a detour to Chikmagalur, and then through the ghats we reached Horanadu. Little did we know that there is a majestic temple of Goddess Annapurna there. The place was really peaceful; one may just go there and stay over the weekend. It was blissful to walk in the clouds, and funny to watch the clouds enter our rooms and pour water on our belongings. Trust me, we were drenched to the core, so much so that even our spirits must have quenched their thirst. We slept like dogs, with no idea when dawn broke on us, and after one quick shower we headed towards the temple. After seeking blessings, we hit the road again and, after losing our way to the place, went uphill through the ghats to Kudremukh, the KIOCL colony and then Lakya Dam. It was just plain fun to drive in the rain and take in nature's beauty along the coffee estates, much of which just could not be captured, lest we lose our camera to the heavy rains.

On our return trip, we reached Hassan town and then took the road to Bangalore. The journey was quite eventful and we have a whole lot of fun stories to share.. I miss my old gang.. God knows whose evil eye fell on us :|

Keep walking..

Find the pictures here

Maps: To, From

Monday, June 7, 2010

Training for long distance running

Suddenly something! No way.

I was bitten by this bug during my school days. It is just that I lack discipline, and often my initial enthusiasm causes more harm than good. Often my strong will to complete a certain distance overpowers my muscular strength :)

Basically, this marathon thingy is about more than just adding miles to my legs. I must admit it is highly rejuvenating; somehow I feel very good after half an hour of running. It even helps me fight the emotional voids that build up in me.

I pledge to complete my Sunfeast half marathon next year.

You may take tips from the references below. Do share your tips.

References:
  • http://www.runnersworld.com/
  • http://marathontraining.com/

Tuesday, May 25, 2010

Hadoop application packaging

The job jar must be packaged as below:

job.jar
|--META-INF
|----MANIFEST.MF
|------Main-Class: x.y.z.Main
|--lib
|----commons-lang.jar
|--org/zero
|----application classes here

Note: Place your dependent jars inside the lib directory.

Archiving a large number of small files into a small number of large files

A small file is one which is significantly smaller than the HDFS block size (default 64MB).

We have a lot of data feeds in the range of 2MB per day; storing each as a separate file is non-optimal.

The problem is that HDFS can't handle lots of files, because every file, directory and block in HDFS is represented as an object in the namenode's memory, each of which occupies about 150 bytes. So 10 million files, each using one block, means about 20 million objects (one per file plus one per block), which would use about 3 gigabytes of memory. Scaling up much beyond this level is a problem with current hardware. Certainly a billion files is not feasible.
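The estimate above can be reproduced with a few lines of arithmetic. This is only a sketch: it assumes one file object plus one object per block, ignores directories, and treats 150 bytes as the rule-of-thumb figure from the text rather than a measured value. The class and method names are made up for illustration.

```java
public class NamenodeMemoryEstimate {

    // Rule-of-thumb size of one namenode object, as quoted in the text.
    public static final long BYTES_PER_OBJECT = 150L;

    // Each file contributes one file object plus one object per block.
    public static long estimateBytes(long files, long blocksPerFile) {
        long objects = files + files * blocksPerFile;
        return objects * BYTES_PER_OBJECT;
    }

    public static void main(String[] args) {
        long bytes = estimateBytes(10000000L, 1L);
        System.out.println(bytes + " bytes, about " + bytes / 1e9 + " GB");
    }
}
```

Plugging in a billion files with the same assumptions gives roughly 300 GB of namenode memory, which is why that scale is called infeasible.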

Furthermore, HDFS is not geared up for efficiently accessing small files: it is primarily designed for streaming access to large files. Reading through small files normally causes lots of seeks and lots of hopping from datanode to datanode to retrieve each small file, all of which is an inefficient data access pattern.

Also, HDFS does not support appends (see http://www.cloudera.com/blog/2009/07/file-appends-in-hdfs/).

Known options are:
  1. Load the data into an HBase table and periodically export it to files for long-term storage. Something like: a row per product log for a particular date/timestamp, holding the content of the file as plain text in the HBase table.
  2. Alternatively, we can treat these files as pieces of a larger logical file and incrementally consolidate additions into a newer file. That is, file x was archived on day zero, and the next day new records are available to be archived. We rename the existing file to, let's say, x.bkp and then execute a mapreduce job that reads the content of the backup file and the new file and writes it to file x.
  3. Apache Chukwa solves a similar problem of distributed data collection and archival for log processing. We can also take inspiration from there and build a custom solution to suit our requirements, if needed.
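Option 2 can be illustrated on a local filesystem. This is only a sketch of the rename-then-rewrite idea: it uses java.nio.file in place of HDFS and a plain copy in place of the mapreduce job, and the names ConsolidateSketch, consolidate and demo are illustrative, not from any Hadoop API.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class ConsolidateSketch {

    // Renames archive to archive.bkp, then rewrites archive as
    // (old content + new records). The backup is left in place.
    public static void consolidate(Path archive, Path newRecords) throws IOException {
        Path backup = archive.resolveSibling(archive.getFileName() + ".bkp");
        Files.move(archive, backup, StandardCopyOption.REPLACE_EXISTING);
        try (OutputStream out = Files.newOutputStream(archive)) {
            Files.copy(backup, out);
            Files.copy(newRecords, out);
        }
    }

    // Runs the sketch in a temp directory and returns the content of x.
    public static String demo() {
        try {
            Path dir = Files.createTempDirectory("consolidate");
            Path x = dir.resolve("x");
            Path day1 = dir.resolve("day1");
            Files.write(x, "day0 record\n".getBytes(StandardCharsets.UTF_8));
            Files.write(day1, "day1 record\n".getBytes(StandardCharsets.UTF_8));
            consolidate(x, day1);
            return new String(Files.readAllBytes(x), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(demo());
    }
}
```

Keeping x.bkp until the rewrite has succeeded means a failed run can be recovered by renaming the backup back to x.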

Saturday, May 22, 2010

One wish

If someone sang, I would fall asleep

On the vast ocean of the world,
inside the boat of dreams,
rising and falling on the waves of sorrow and joy,
I would drift along, I would fall asleep.

With undying love filling their eyes,
with blessings filling their palm,
someone would lay my head in their lap
and stroke it, and I would fall asleep.

The brine of my life,
the poison of my life,
someone would sweeten with their voice
and rain it down, and I would fall asleep.

If someone sang, I would fall asleep
I would fall asleep
I would fall asleep

 - Harivansh Rai Bachchan

May the drop become a pearl again ...


A Drop
------------
Just as a drop, leaving the lap of the clouds,
had moved a little way ahead,
it began to think in its heart, again and again:
Alas, why did I leave home and set out like this?
Will I survive, or be lost in the dust?
Or will I drip onto a lotus flower?
Just then such a wind blew
that she drifted, reluctant, toward the sea.
A beautiful oyster lay with its mouth open;
she fell into it and became a pearl.

People hesitate and brood in just this way
when they have to leave home,
yet leaving home often
makes of them, like the drop, something else entirely!

- Ayodhya Singh Upadhyay 'Hariaudh'

A Dedication

What are we madcaps, after all

What are we madcaps, after all:
here today, gone elsewhere tomorrow.
A mood of revelry travelled with us
wherever we went, kicking up dust.

We came at times as joy,
and just now flowed away as tears.
Everyone was left saying,
"Hey, how did you come, where are you going?"
Which way are we headed? Do not ask;
we must keep moving, so we move on.

We took a little from the world,
gave the world a little of our own,
said a word or two, heard a word or two,
laughed a little and then wept a little.
Drinking our fill of joy and sorrow,
we drank them both alike.

In a world of beggars
we went about freely lavishing love.
On our hearts we carry one mark,
bearing the weight of failure.

Beyond honour, beyond insult,
we have played freely to our heart's content.
Laughing, here today,
we leave having lost the wager of our lives.

What is 'ours' and what is 'others'' now?
May those who stay behind prosper.
We bound ourselves, and by ourselves
we break our bonds and go.

- Bhagwati Charan Verma

The above poem pretty much summarizes the way I wish I could live my life... trying each moment... keep walking

Sunday, May 9, 2010

How about teaching to learn?


Since my childhood days I have always been told that sharing knowledge improves your learning. However, our social conditioning is such that we are not really comfortable sharing, primarily because of:
  • Fear of being dispensable, because there are others who may replace you.. this is the single most important reason for people to 'hoard' knowledge.
  • Fear of being exposed, as sharing might expose your ignorance.
  • ..
  • ....
There could be many more. The idea here is not to create an exhaustive list of reasons for people to escape ... it is to make a public assertion that I no longer want to be a passive consumer of information. I want to create information, distil it so that the people around me find it easy to consume, and add greater value to make the world a better place to live.

I hope that makes a true dedication to my Mother on this day. I seek your blessings.

Wednesday, April 28, 2010

Ubuntu 9.10: Wifi connection fix

If you have trouble connecting to the wireless network on Ubuntu 9.10, try this:
sudo apt-get install --reinstall bcmwl-kernel-source
Hope that helps. It works for me now.

Sunday, March 28, 2010

Cascading: How does Cascading decide which fields should go to a column family?

I was playing with the Cascading code sample given here.

Problem Statement: Let's say we have these fields in a tuple, e.g.
line_num, lower, upper, double
1, a, A, AA

and I wish to add double to its own column family, or let's say club it with an existing column family 'right'. How do I do that?

Solution:
String tableName = "DataLoadTable";
Fields keyFields = new Fields("line_num");
// add a new family name
String[] familyNames = new String[] { "left", "right", "double" };
// group your fields together in the order in which you would like them to be
// added to column families
Fields[] valueFields = new Fields[] { new Fields("lower"),
        new Fields("upper"), new Fields("double") };
HBaseScheme hbaseScheme = new HBaseScheme(keyFields, familyNames,
        valueFields);
Tap sink = new HBaseTap(tableName, hbaseScheme, SinkMode.REPLACE);
// describe your tuple entry: add the new field
Fields fieldDeclaration = new Fields("line_num", "lower", "upper",
        "double");
Function function = new RegexSplitter(fieldDeclaration, ", ");

The rest of the code remains the same as given in the example.

Either the above was so obvious that the authors didn't talk about it in the user guide, or I did not know how to describe the problem and hence was not able to find it.

Let me know if I'm wrong.

Thursday, March 11, 2010

Push Button Automation for the Humans

Have you ever been responsible for installing software that must be distributed to and installed on a myriad of execution environments, with equally diverse configurations, in a department where machines are constantly being cleaned and re-imaged? Picture a situation where you are supporting various configurations and levels of your module. Rather than spending hours installing by hand, wouldn't it be great to have a way to automate the installation process so that you could just kick it off, go and get coffee, come back, and have it all installed and ready? Call it 'Push Button' automation :) (Please bear with me for throwing around new phrases.)

Strange as it might seem, this hand-made configuration is proving to be a nightmare of sorts.. one might feel exhausted without really having any sense of accomplishment, solving petty issues which should not really have come up. It is high time one set up:
  • a standardized installation setup
  • application diagnostics which enable the operations team to solve small issues in time.
Time and again, Apache Ant comes to my rescue.

Thursday, March 4, 2010

How does the data flow when a job is submitted to Hadoop?

Based on the discussion here, typically the data flow is like this:
  1. Client submits a job description to the JobTracker. 
  2. JobTracker figures out block locations for the input file(s) by talking to HDFS NameNode. 
  3. JobTracker creates a job description file in HDFS which will be read by the nodes to copy over the job's code etc. 
  4. JobTracker starts map tasks on the slaves (TaskTrackers) with the appropriate data blocks. 
  5. After running, maps create intermediate output files on those slaves. These are not in HDFS, they're in some temporary storage used by MapReduce. 
  6. JobTracker starts reduces on a series of slaves, which copy over the appropriate map outputs, apply the reduce function, and write the outputs to HDFS (one output file per reducer). 
  7. Some logs for the job may also be put into HDFS by the JobTracker.
However, there is a big caveat, which is that the map and reduce tasks run arbitrary code. It is not unusual to have a map that opens a second HDFS file to read some information (e.g. for doing a join of a small table against a big file). If you use Hadoop Streaming or Pipes to write a job in Python, Ruby, C, etc, then you are launching arbitrary processes which may also access external resources in this manner. Some people also read/write to DBs (e.g. MySQL) from their tasks.
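The small-table join mentioned in the caveat can be sketched in plain Java. To keep it self-contained this does not use the Hadoop API at all: loadSmallTable stands in for reading the second HDFS file once when the task starts, and map stands in for the per-record map call. All names and the sample data are invented for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SmallTableJoinSketch {

    // Stand-in for the side file a map task would read once at start-up.
    // Each line has the form "key,value".
    public static Map<String, String> loadSmallTable(List<String> lines) {
        Map<String, String> table = new HashMap<String, String>();
        for (String line : lines) {
            String[] parts = line.split(",", 2);
            table.put(parts[0], parts[1]);
        }
        return table;
    }

    // Stand-in for map(): called once per record of the big input.
    // Returns the joined record, or null when the key has no match.
    public static String map(String record, Map<String, String> smallTable) {
        String[] parts = record.split(",", 2);
        String match = smallTable.get(parts[0]);
        return match == null ? null : parts[0] + "," + parts[1] + "," + match;
    }

    public static void main(String[] args) {
        Map<String, String> countries =
                loadSmallTable(Arrays.asList("IN,India", "US,United States"));
        for (String record : Arrays.asList("IN,order-1", "FR,order-2", "US,order-3")) {
            String joined = map(record, countries);
            if (joined != null) {
                System.out.println(joined);
            }
        }
    }
}
```

The point is that the small table is held in memory by every map task, so the big file never has to be shuffled just to perform the join.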

Sunday, February 21, 2010

Rampyari ki khoj (In search of Rampyari)

What's the big deal, one might ask? Yes, I'm looking for a car, and while hunting around I found that a wealth of information is available about the brands and products in the market, but one finds zero information about the dealers in the local market.

The market is full of inexperienced, uneducated, incompetent salesmen who are ready to hawk 'the best' car to you in no time without understanding your needs or expectations. They simply seem to be oblivious of the competition and live in a monopolistic universe of their own brand, to such an extent that a mere mention of the competition makes them turn blue.

Further, talking to old-timers, I realized that your dream car may soon become a nightmare of sorts depending on your service providers, about which you may not have any reliable information. After all, any machine needs skilled maintenance, and I need someone whom I can trust for a pre-defined level of quality of service in return for the amount I spend on it. The need for rating dealers and service stations is both timely and important.

In the meanwhile, in order to live in the more 'real' world, I will soon have to make my choices...

Tuesday, February 16, 2010

Evolving programming paradigms

The free lunch is over, and it simply means that:
  • Our applications may no longer automatically benefit from hardware upgrades; we will have to plan, design and make room for concurrency.
  • As developers, we need to hone our skills to make good use of hyper-threading, multi-core CPUs and caching.
  • Do not get confused by marketing slogans about new-found technologies. Seldom does a technology mature fast enough to become mainstream; generally it is old technologies, which become increasingly useful with time and with their developers' better understanding, that can gainfully be used to solve important problems.
  • Because our applications do not automatically gain performance with newer hardware, solving performance problems and performance tuning will become important activities.
  • Start preparing now.

Urban Area: Community or a Marketplace

'Mumbai for Mumbaikars' might bring a bad taste to your mouth or a sense of ownership, depending upon your political affiliations. Whatever your reaction, it is increasingly important for us to delve deeper and try to understand the core issues, rather than reject them as pure nuisance.
From a purely theoretical perspective, the urban population can be broadly divided into two sections:
  • Community dwellers, who inhabit a particular place and encourage symbiotic relationships between the various members of the community. They largely like to take ownership of the place, for its general well-being and the associated identity. Such people have a high tolerance for problems that result from scarcity of resources.
  • Market dwellers, who look at an urban area as a place where a market operates and who themselves form links in the demand and supply chains. Such people have a high preference for quality of life.
The stated difference in their value systems creates competition, resulting in clashes.

What can be done about it? Well .. I'm not too sure. Will it help if market dwellers contribute back to the community of which they are a part, to avoid being tagged as parasites, and the community ensures that the market operates in safety, thus creating an environment conducive to attracting better talent?

Part of the answer also requires us to understand the larger picture and avoid such a local focus. Will it not help if we have more balanced growth in .. sounds wishful.

Share your mind; who knows, our ideas may be heard and we may reach a resolution.. let us start from ground zero again..

Tuesday, February 9, 2010

Mar Java, Mit Java..

... and with the European Commission's nod, the last hope of keeping the Sun shining went away too. While the whole software and business community is still evaluating the takeover deal, I must say that I feel bad about the whole thing.. there is something which is not generating positive vibes... Although some of you might discount me for being emotional, I have at least one objective reason to support my case in point. Old-timers will clearly be able to spot the difference: post acquisition, all the Sun domains have been distastefully defaced, which simply reflects the tyranny, greed and callous attitude of the raiders.

I seriously hope a similar fate is not meted out to the open source initiatives.

Sunday, February 7, 2010

In bad taste...

Reality television is truly a reality today and one can no longer ignore it. Its popularity can be attributed primarily to the fact that it also involves its viewers. I used to love 'Boogie-Woogie' for its plain ingenuity and honesty in bringing recognition to dancing talent. The show was conducted wonderfully well.

With time, however, the production value of reality shows has increased dramatically, but I'm sorry to say that they leave everyone demoralized. Shows like Roadies and Splitsvilla in particular are truly disgusting.

My whole point is that such shows are increasingly creating 'public acceptance' of foul and abusive language in public, which is a very dangerous trend. They must mend their ways, or else criminal action should be taken against them.

Thursday, February 4, 2010

Running Java in debug mode using Apache Ant

Configure your Java task as given below:



<java classname="org.zero.Main" fork="true" failonerror="true">
    <sysproperty key="DEBUG" value="true" />
    <jvmarg value="-Xdebug" />
    <jvmarg value="-Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8999" />
    <jvmarg value="-Xnoagent" />
    <jvmarg value="-Djava.compiler=none" />
    <arg value="${project.root}/conf/preferences.xml" />
    <classpath refid="project.classpath" />
</java>

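Further, configure an Eclipse 'Remote Java Application' debug configuration to connect to the specified port (8999 in this case). With server=y it is the JVM that listens; Eclipse attaches to it.

Also, add Thread.sleep(1000); as the first line in your main method, so that the debugger has time to attach before the program gets going.

Execute the Java task and try connecting from Eclipse; you may need to increase the thread sleep time.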
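A small sketch of that sleep, guarded so that it only happens in debug runs. It relies on the DEBUG system property that the sysproperty element in the Ant task sets; the class name DebugPause and the helper pauseMillis are hypothetical. As far as I know, setting suspend=y in the jdwp options is an alternative: the JVM then waits for the debugger to attach, so no sleep is needed at all.

```java
public class DebugPause {

    // Returns the pause to apply: the requested time when the JVM was
    // started with -DDEBUG=true, otherwise zero.
    public static long pauseMillis(long requestedMillis) {
        return Boolean.getBoolean("DEBUG") ? requestedMillis : 0L;
    }

    public static void main(String[] args) throws InterruptedException {
        // Give the debugger a moment to attach before real work starts.
        Thread.sleep(pauseMillis(1000L));
        System.out.println("application started");
    }
}
```

This way the same main method can ship to production without the artificial delay.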

Saturday, January 30, 2010

Bug fix: Eclipse SpringSource on Ubuntu 9.10

...so you have upgraded to Ubuntu 9.10, you are really impressed with its new features, and you cannot control your excitement to check out your development environment. You launch Eclipse .. but oops! You are not able to create a new project.. none of the buttons seem to work. Chances are you are suffering from the same bug; follow the link for the fix.

This issue is resolved in the Eclipse 3.5.2 M2 release. Hope that helps.