Wednesday, July 16, 2014

Big Data Hadoop Hive SQL Query Hello World

Prerequisites
  • Big Data 
  • Hadoop 
  • SQL

If you are reading this blog, you should already know about Big Data and Hadoop.

Big Data is a technology revolution for people coming from the RDBMS world. However, data in the Hadoop Distributed File System (HDFS) can be stored as flat files in different formats such as CSV or tab-delimited text.

To process this data, you normally need to be proficient in Java in order to write a MapReduce program.

To make Big Data usable for non-Java users such as data analysts, a feature for querying the flat files with SQL was introduced. This is Apache Hive: https://hive.apache.org/

http://en.wikipedia.org/wiki/Apache_Hive

Hive was introduced by Facebook and is now also used by Netflix. It is a powerful querying tool for Big Data on Hadoop.

Basically, Hive transforms your SQL queries into MapReduce programs.


The following are the steps to be done:

1. Create a Hive table with metadata information
2. Load data into the Hive table (2 types)
      a. Loading data from the local file system into Hadoop & Hive
      b. Loading data from the Hadoop file system into Hive
3. Query the table


My test data is shown below (it has 2 fields: ID and PHONE NAME).

1,iphone
2,blackberry
3,nokia
4,sony
5,samsung
6,htc
7,micromax


To get started, you need to have Hive installed already and the Hadoop file system configured with Name Node, Job Tracker, Data Node, Task Tracker, etc.


Step 1:  Launch the Hive console from the command line / terminal by running the hive command



Step 2:  Create the table with the metadata information

CREATE TABLE PHONE ( ID INT, PHONE_NAME STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE;
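To make the table definition above more concrete, here is a minimal Java sketch of how a comma-delimited text line lines up with the two declared columns (ID INT, PHONE_NAME STRING). The class and method names (PhoneRowParser, PhoneRow, parse) are hypothetical and only for illustration; this is not Hive's actual reader. It imitates the way Hive shows NULL for a field it cannot convert to the declared type.

```java
// Hypothetical helper: imitates how one line of a comma-delimited text file
// maps onto the columns declared in CREATE TABLE PHONE (ID INT, PHONE_NAME STRING).
class PhoneRowParser {

    // One parsed row. id is null when the first field is not a valid integer,
    // phoneName is null when the second field is missing.
    static final class PhoneRow {
        final Integer id;
        final String phoneName;

        PhoneRow(Integer id, String phoneName) {
            this.id = id;
            this.phoneName = phoneName;
        }

        @Override
        public String toString() {
            return id + "\t" + phoneName;
        }
    }

    // Split on the field terminator ',' and convert each field to its column type.
    static PhoneRow parse(String line) {
        String[] fields = line.split(",", -1);
        Integer id;
        try {
            id = Integer.valueOf(fields[0]);
        } catch (NumberFormatException e) {
            id = null; // unparseable INT field is treated as NULL in this sketch
        }
        String phoneName = fields.length > 1 ? fields[1] : null;
        return new PhoneRow(id, phoneName);
    }

    public static void main(String[] args) {
        // A few rows from the test data in this post
        String[] lines = {"1,iphone", "2,blackberry", "3,nokia"};
        for (String line : lines) {
            System.out.println(parse(line));
        }
    }
}
```

The point of the sketch is that the file itself stays a plain text file; the table definition only tells the reader how to split each line and which type to expect per field.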



Step 3:  Load data from the local file system into the table

LOAD DATA LOCAL INPATH '/home/training/PHONE.txt' OVERWRITE INTO TABLE PHONE;


Step 4:  Query the table that was created and loaded in Hive

select * from PHONE;



You should now be good with local mode.


Let's have a quick peek at Server Mode (Type 2). To load data from a Hadoop file into Hive, we first need to send the file from the local file system to the Hadoop file system.


Step 1: Place the file from local file system to HDFS (Hadoop Distributed File System)

hadoop fs -put PHONE.txt /user/training/PHONE.txt



Step 2: Verify that the file has been placed in HDFS

hadoop fs -ls PHONE*



Step 3:  Create the table with the metadata information

CREATE TABLE PHONE_SERVER ( ID INT, PHONE_NAME STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE;



Step 4:  Load the data from HDFS into the Hive table (this moves the file from its HDFS location into Hive's warehouse directory)

LOAD DATA INPATH '/user/training/PHONE.txt' OVERWRITE INTO TABLE PHONE_SERVER;

Step 5: Verify by performing a SQL Query and check the results

select * from PHONE_SERVER;





Tuesday, July 15, 2014

Big Data Hadoop Hive Getting Max of a Count


This example shows a query to get the max of a count.

Prerequisites
- Basic Hadoop Knowledge
- Basic Hive Knowledge
- Basic SQL Knowledge


My input data looks like this:

1,iphone,2000,abc
2,iphone,3000,abc1
3,nokia,4000,abc2
4,sony,5000,abc3
5,nokia,6000,abc4
6,iphone,7000,abc5
7,nokia,8500,abc6

Problem:
In Hive we can perform a GROUP BY as we do in ANSI SQL. For example:

select d.phnName, count(*) from phnDetails d group by d.phnName;

The output of the above query is as below:

iphone 3
nokia 3
sony 1
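Since Hive turns a query like this into MapReduce work, it can help to see the same GROUP BY count written as explicit map and reduce phases. The following is a conceptual sketch in plain Java, assuming the sample rows above; the class and method names (GroupByAsMapReduce, mapPhase, reducePhase) are hypothetical, and this is not the code Hive actually generates.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Conceptual sketch only: a GROUP BY count expressed as map and reduce phases.
class GroupByAsMapReduce {

    // Map phase: for every input row, emit the grouping key (second field, phnName).
    static List<String> mapPhase(List<String> rows) {
        List<String> keys = new ArrayList<>();
        for (String row : rows) {
            keys.add(row.split(",")[1]);
        }
        return keys;
    }

    // Shuffle and reduce phase: bring equal keys together and count them.
    // A TreeMap keeps the keys sorted, similar to the sorted shuffle output.
    static Map<String, Integer> reducePhase(List<String> keys) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String key : keys) {
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList(
                "1,iphone,2000,abc",
                "2,iphone,3000,abc1",
                "3,nokia,4000,abc2",
                "4,sony,5000,abc3",
                "5,nokia,6000,abc4",
                "6,iphone,7000,abc5",
                "7,nokia,8500,abc6");
        // Prints one "name count" line per phone name
        reducePhase(mapPhase(rows))
                .forEach((name, count) -> System.out.println(name + " " + count));
    }
}
```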

You might have a scenario where you need to retrieve only the rows whose count equals the maximum count.

For example, if you need output like this:

iphone 3
nokia 3

Resolution:

We need to use multiple subqueries to perform this operation:

select c.phnName, c.counter 
from 
(select d.phnName as phnName, count(*) as counter from phnDetails d group by d.phnName ) c 
join 
(select max(f.counter) as countmax from
(select cnt.phnName as phnName, count(*) as counter from phnDetails cnt group by cnt.phnName ) f) g 
where c.counter = g.countmax;

The output matches the desired result shown above:

iphone 3
nokia 3



The queries were built up in multiple iterations, as below:

CREATE TABLE phnDetails ( id INT, phnName STRING, price INT, details STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE;

LOAD DATA LOCAL INPATH '/home/training/Phone/phones.txt' OVERWRITE INTO TABLE phnDetails;

select * from phnDetails;


select d.phnName, count(*) from phnDetails d group by d.phnName;


select c.phnName, c.counter from 
(select d.phnName as phnName, count(*) as counter from phnDetails d group by d.phnName ) c ;

select max(f.counter) as countmax from
(select cnt.phnName as phnName, count(*) as counter from phnDetails cnt group by cnt.phnName ) f ;

select c.phnName, c.counter 
from 
(select d.phnName as phnName, count(*) as counter from phnDetails d group by d.phnName ) c 
join 
(select max(f.counter) as countmax from
(select cnt.phnName as phnName, count(*) as counter from phnDetails cnt group by cnt.phnName ) f) g 
where c.counter = g.countmax;
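
As a cross-check of the final query's logic, the same three steps (count per name, take the max of those counts, keep the names whose count equals the max) can be sketched in plain Java. This is only an illustration under the assumption that the phone names have already been read into a list; the names MaxOfCount, countByName and namesWithMaxCount are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

// Sketch of the "max of a count" logic, mirroring the subqueries c, f and g.
class MaxOfCount {

    // Like subquery c (and f): count the rows per phone name.
    static Map<String, Long> countByName(List<String> names) {
        return names.stream().collect(
                Collectors.groupingBy(Function.identity(), TreeMap::new, Collectors.counting()));
    }

    // Like subquery g plus the WHERE clause: find the max count,
    // then keep only the names whose count equals it.
    static Map<String, Long> namesWithMaxCount(List<String> names) {
        Map<String, Long> counts = countByName(names);
        Map<String, Long> result = new TreeMap<>();
        if (counts.isEmpty()) {
            return result;
        }
        long max = Collections.max(counts.values());
        counts.forEach((name, count) -> {
            if (count == max) {
                result.put(name, count);
            }
        });
        return result;
    }

    public static void main(String[] args) {
        // phnName column of the sample input data
        List<String> names = Arrays.asList(
                "iphone", "iphone", "nokia", "sony", "nokia", "iphone", "nokia");
        namesWithMaxCount(names)
                .forEach((name, count) -> System.out.println(name + " " + count));
    }
}
```

Ties are kept on purpose: every name that reaches the maximum count is returned, which is the same behaviour as the join on c.counter = g.countmax.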

Android Read and Write PDF File using iText


This tutorial shows how to read and write a PDF file on Android using the iText API.

Prerequisites
- Basic Android Knowledge
- Basic Java Knowledge
- Basic iText Knowledge


Sample Screenshots for the demo






activity_main.xml
<LinearLayout xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:tools="http://schemas.android.com/tools"
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    android:orientation="vertical"
    android:layout_gravity="center"
    tools:context=".MainActivity" >
    <TextView
        android:layout_width="fill_parent"
        android:layout_height="wrap_content"
        android:gravity="center"
        android:textAlignment="center"
        android:text="Android Read/Write File" />

    <EditText
        android:id="@+id/fname"
        android:layout_width="fill_parent"
        android:layout_height="wrap_content"
        android:hint="File Name"
        android:text="sample_pdf_file" />

    <EditText
        android:id="@+id/ftext"
        android:layout_width="fill_parent"
        android:layout_height="100px"
        android:hint="File Text"
        android:text="Hello World" />

    <Button
        android:layout_width="fill_parent"
        android:layout_height="wrap_content"
        android:id="@+id/btnwrite"
        android:text="Write File" />

    <EditText
        android:id="@+id/fnameread"
        android:layout_width="fill_parent"
        android:layout_height="wrap_content"
        android:hint="File Name"
        android:text="sample_pdf_file" />

       <Button
        android:layout_width="fill_parent"
        android:layout_height="wrap_content"
        android:id="@+id/btnread"
        android:text="Read File" />
       <TextView
        android:layout_width="fill_parent"
        android:layout_height="wrap_content"
        android:id="@+id/filecon" />
</LinearLayout>

FileOperations.java
package com.example.readwrite;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.FileReader;
import java.io.IOException;
import java.io.StringWriter;

import android.util.Log;

import com.itextpdf.text.Document;
import com.itextpdf.text.DocumentException;
import com.itextpdf.text.Paragraph;
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.PdfWriter;
import com.itextpdf.text.pdf.parser.PdfReaderContentParser;
import com.itextpdf.text.pdf.parser.SimpleTextExtractionStrategy;
import com.itextpdf.text.pdf.parser.TextExtractionStrategy;

public class FileOperations {
    public FileOperations() {
    }

    public Boolean write(String fname, String fcontent) {
        try {
            String fpath = "/sdcard/" + fname + ".pdf";
            File file = new File(fpath);
            // If the file does not exist, create it
            if (!file.exists()) {
                file.createNewFile();
            }

            // step 1: create the document
            Document document = new Document();
            // step 2: bind a PdfWriter to the output file
            PdfWriter.getInstance(document,
                    new FileOutputStream(file.getAbsoluteFile()));
            // step 3: open the document
            document.open();
            // step 4: add the text passed in by the caller
            // (the original hard-coded "Hello World!" and ignored fcontent)
            document.add(new Paragraph(fcontent));
            // step 5: close the document
            document.close();

            Log.d("Success", "Success");
            return true;
        } catch (IOException e) {
            e.printStackTrace();
            return false;
        } catch (DocumentException e) {
            e.printStackTrace();
            return false;
        }
    }

    public String read(String fname) {
        String response = null;
        PdfReader reader = null;
        try {
            String fpath = "/sdcard/" + fname + ".pdf";

            reader = new PdfReader(new FileInputStream(fpath));
            PdfReaderContentParser parser = new PdfReaderContentParser(reader);

            StringWriter strW = new StringWriter();

            TextExtractionStrategy strategy;
            // Page numbers in iText start at 1
            for (int i = 1; i <= reader.getNumberOfPages(); i++) {
                strategy = parser.processContent(i,
                        new SimpleTextExtractionStrategy());

                strW.write(strategy.getResultantText());
            }

            response = strW.toString();

        } catch (IOException e) {
            e.printStackTrace();
            return null;
        } finally {
            // Release the reader whether or not extraction succeeded
            if (reader != null) {
                reader.close();
            }
        }
        return response;
    }
}

MainActivity.java
package com.example.readwrite;

import android.app.Activity;
import android.os.Bundle;
import android.view.View;
import android.widget.Button;
import android.widget.EditText;
import android.widget.TextView;
import android.widget.Toast;

public class MainActivity extends Activity {
    EditText fname, fcontent, fnameread;
    Button write, read;
    TextView filecon;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);
        fname = (EditText) findViewById(R.id.fname);
        fcontent = (EditText) findViewById(R.id.ftext);
        fnameread = (EditText) findViewById(R.id.fnameread);
        write = (Button) findViewById(R.id.btnwrite);
        read = (Button) findViewById(R.id.btnread);
        filecon = (TextView) findViewById(R.id.filecon);
        write.setOnClickListener(new View.OnClickListener() {
            @Override
            public void onClick(View arg0) {
                // TODO Auto-generated method stub
                String filename = fname.getText().toString();
                String filecontent = fcontent.getText().toString();
                FileOperations fop = new FileOperations();
                // Write the file once and branch on the result
                if (fop.write(filename, filecontent)) {
                    Toast.makeText(getApplicationContext(),
                            filename + ".pdf created", Toast.LENGTH_SHORT)
                            .show();
                } else {
                    Toast.makeText(getApplicationContext(), "I/O error",
                            Toast.LENGTH_SHORT).show();
                }
            }
        });
        read.setOnClickListener(new View.OnClickListener() {
            @Override
            public void onClick(View arg0) {
                // TODO Auto-generated method stub
                String readfilename = fnameread.getText().toString();
                FileOperations fop = new FileOperations();
                String text = fop.read(readfilename);
                if (text != null) {
                    filecon.setText(text);
                } else {
                    Toast.makeText(getApplicationContext(), "File not Found",
                            Toast.LENGTH_SHORT).show();
                    filecon.setText(null);
                }
            }
        });
    }
}

AndroidManifest.xml
<?xml version="1.0" encoding="utf-8"?>
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
    package="com.example.readwrite"
    android:versionCode="1"
    android:versionName="1.0" >

    <uses-sdk
        android:minSdkVersion="14"
        android:targetSdkVersion="18" />
    <uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE"/>
    <uses-permission android:name="android.permission.READ_EXTERNAL_STORAGE"/>

    <application
        android:allowBackup="true"
        android:icon="@drawable/ic_launcher"
        android:label="@string/app_name"
        android:theme="@style/AppTheme" >
        <activity
            android:name="com.example.readwrite.MainActivity"
            android:label="@string/app_name" >
            <intent-filter>
                <action android:name="android.intent.action.MAIN" />

                <category android:name="android.intent.category.LAUNCHER" />
            </intent-filter>
        </activity>
    </application>

</manifest>


Make sure you have the itextpdf-5.5.1.jar in the right location as below,