Object recognition

Problem Statement:

Data Exploration:

Importing required libraries for Object Recognition

In [1]:

import pandas as pd
import sklearn
import numpy
import scipy
import scipy.stats  # loaded explicitly; scipy.stats.skew and scipy.stats.kurtosis are used later
import statsmodels
import math
import matplotlib as matlab

Importing the dataset

In [2]:

train_lab = pd.read_csv("\\train\\Labels.csv")

The Dimension of the dataset is

In [3]:

train_lab.shape

Out[3]:

(50000, 2)

The names of the variables in this dataset are

In [4]:

train_lab.columns.values

Out[4]:

array(['id', 'label'], dtype=object)

Let us see the top 10 observations of the dataset

In [5]:

train_lab.head(10)

Out[5]:

	id	label
0	1	frog
1	2	truck
2	3	truck
3	4	deer
4	5	automobile
5	6	automobile
6	7	bird
7	8	horse
8	9	ship
9	10	cat

As we can see, the label file gives us only the image ids and labels, not the pixel intensity values, so we have to read the images and extract the intensities ourselves. Before that, we store the file locations of the images in a variable.

In [6]:

a = []
for i in range(1,50001):
    a.append("\\train\\train"+str(i)+".png") 
a[1]

Out[6]:

'\\train\\train2.png'

Pixel values of the images are extracted and stored in the variable t.

In [7]:

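len(a)

# scipy.misc.imread no longer exists in current SciPy releases; Pillow is used here instead to read each PNG
from PIL import Image

t = []
for i in range(0,50000):
  t.append(numpy.array(Image.open(a[i])))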

t[1]

Out[7]:

array([[[154, 177, 187],
        [126, 137, 136],
        [105, 104,  95],
        ..., 
        [ 91,  95,  71],
        [ 87,  90,  71],
        [ 79,  81,  70]],

       [[140, 160, 169],
        [145, 153, 154],
        [125, 125, 118],
        ..., 
        [ 96,  99,  78],
        [ 77,  80,  62],
        [ 71,  73,  61]],

       [[140, 155, 164],
        [139, 146, 149],
        [115, 115, 112],
        ..., 
        [ 79,  82,  64],
        [ 68,  70,  55],
        [ 67,  69,  55]],

       ..., 
       [[175, 167, 166],
        [156, 154, 160],
        [154, 160, 170],
        ..., 
        [ 42,  34,  36],
        [ 61,  53,  57],
        [ 93,  83,  91]],

       [[165, 154, 128],
        [156, 152, 130],
        [159, 161, 142],
        ..., 
        [103,  93,  96],
        [123, 114, 120],
        [131, 121, 131]],

       [[163, 148, 120],
        [158, 148, 122],
        [163, 156, 133],
        ..., 
        [143, 133, 139],
        [143, 134, 142],
        [143, 133, 144]]], dtype=uint8)

In [11]:

import matplotlib.pyplot as matplot
matplot.imshow(t[1])

Out[11]:

<matplotlib.image.AxesImage at 0xe58e3f66d8>

As these are colour images, we convert them to grayscale for further processing. A function is defined here to do the conversion, and the converted images are stored in a variable called 'gray'.

In [12]:

def rgb2gray(rgb):
    return numpy.dot(rgb[...,:3], [0.299, 0.587, 0.114])

gray = []

for i in range(0,50000):
 gray.append(rgb2gray(t[i]))
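
As a small worked example of the conversion above, the sketch below applies the same weighted sum to a tiny 1x2 image built from the first two pixels printed in Out[7]. The weights 0.299, 0.587 and 0.114 are the standard ITU-R BT.601 luma coefficients; the names `tiny` and `gray_tiny` are made up for this illustration.

```python
import numpy as np

def rgb2gray(rgb):
    """Weighted sum of the R, G and B channels (BT.601 luma weights)."""
    return np.dot(rgb[..., :3], [0.299, 0.587, 0.114])

# Hypothetical 1x2 RGB image using the first two pixels shown in Out[7].
tiny = np.array([[[154, 177, 187], [126, 137, 136]]], dtype=np.uint8)

gray_tiny = rgb2gray(tiny)

# The channel axis is gone: (1, 2, 3) becomes (1, 2), and values are floats.
print(gray_tiny.shape)
print(gray_tiny)
```

The first pixel works out to 0.299*154 + 0.587*177 + 0.114*187, which is about 171.26. Note that the result is a float array, not uint8.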

As each image is 32 x 32, it has 1024 pixels, but we need them in a single row. So we reshape each array from [32,32] to [1,1024] for further processing.

In [13]:

im_data = []
for i in range(0,50000):
  # Flatten each 32x32 grayscale image into a single 1x1024 row
  im_data.append(gray[i].reshape(1,1024))
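
The same flattening can probably be done in one step by stacking the images into a single array first. The sketch below uses five random 32x32 arrays as a stand-in for the grayscale images; `gray_demo` and `features` are hypothetical names, not variables from this notebook.

```python
import numpy as np

# Stand-in for the list of grayscale images: five random 32x32 arrays.
rng = np.random.default_rng(0)
gray_demo = [rng.random((32, 32)) for _ in range(5)]

# Stack into shape (5, 32, 32), then flatten each image into one row.
features = np.stack(gray_demo).reshape(len(gray_demo), -1)

print(features.shape)  # (5, 1024)
```

Each row of `features` holds the pixels of one image in row-major order, which is the same order `reshape(1,1024)` produces.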

Importing the required libraries for Model building

In [14]:

import statsmodels.formula.api as sm
from sklearn import svm
from sklearn.metrics import confusion_matrix 
newvie=numpy.array(im_data)
a = newvie.reshape(50000,1024)

If we take each pixel as a feature, we have 1024 features and 50000 observations, which takes a very long time to process. There are two ways around this. One is to extract features from the images: from the 1024 pixels we compute four intensity-based features, namely mean, variance, skewness and kurtosis. The other is to apply PCA and keep the 50 most important principal components (each a linear combination of the 1024 pixels) for further analysis. Here, we extract the intensity-based features.

In [15]:

mean_val = []
for i in range(0,50000):
    mean_val.append(numpy.mean(a[i]))

variance = []
for i in range(0,50000):
    variance.append(numpy.var(a[i]))

skewness = []
for i in range(0,50000):
    skewness.append(scipy.stats.skew(a[i]))
skewness = pd.DataFrame(skewness)

kurtosis = []
for i in range(0,50000):
    kurtosis.append(scipy.stats.kurtosis(a[i]))
kurtosis = pd.DataFrame(kurtosis)
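
For reference, the four statistics can also be computed for all rows at once. The sketch below is an illustration in plain NumPy, not the notebook's own code: `intensity_features` is a hypothetical helper, and it uses the biased moment definitions that `scipy.stats.skew` and `scipy.stats.kurtosis` apply by default (kurtosis as excess kurtosis, i.e. minus 3).

```python
import numpy as np

def intensity_features(x):
    """Return an (n_samples, 4) array: mean, variance, skewness, excess kurtosis.

    Rows with zero variance would produce a division by zero in the last
    two columns; this sketch does not handle that case.
    """
    x = np.asarray(x, dtype=float)
    mean = x.mean(axis=1)
    centered = x - mean[:, None]
    m2 = (centered ** 2).mean(axis=1)  # biased variance, same as numpy.var
    m3 = (centered ** 3).mean(axis=1)
    m4 = (centered ** 4).mean(axis=1)
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2 - 3.0
    return np.column_stack([mean, m2, skew, kurt])

# Small check on a symmetric row: mean 3, variance 2, skewness 0.
check = intensity_features(np.array([[1, 2, 3, 4, 5]]))
print(check)
```

Stacking the four columns this way gives one candidate for the feature matrix passed to the classifier, assuming all four features are meant to be used together.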

Model Building

Support Vector Machines:

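In [ ]:

import time
start_time = time.time()
# `variable` is not defined anywhere in this notebook; it presumably holds the intensity-based features
numbersvm1 = svm.SVC(kernel='rbf', C=1).fit(variable,train_lab.label)
print("--- %s seconds ---" % (time.time() - start_time))
len(numbersvm1.support_)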

Here we compute the confusion matrix and run shuffle-split cross-validation (10 random 80/20 splits) for the SVM model.

In [ ]:

predict = numbersvm1.predict(variable)
conf_mat = confusion_matrix(train_lab.label,predict)
# Training accuracy: correctly classified samples (diagonal) over all samples
numpy.trace(conf_mat)/conf_mat.sum()
# sklearn.cross_validation was removed in scikit-learn 0.20; sklearn.model_selection replaces it
from sklearn.model_selection import ShuffleSplit, cross_val_score
####cross-validation
cv = ShuffleSplit(n_splits=10, test_size=0.2, random_state=None)
scores = cross_val_score(numbersvm1,variable,train_lab.label,cv = cv)
score_mean = numpy.mean(scores)
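
To make the accuracy computation in the cell above concrete, here is a minimal sketch on six made-up labels. `confusion_counts` is a hypothetical helper written for this illustration; it follows the same convention as `sklearn.metrics.confusion_matrix`, with rows for true labels and columns for predicted labels.

```python
import numpy as np

def confusion_counts(y_true, y_pred, labels):
    """Count (true, predicted) pairs; rows are true labels, columns predicted."""
    index = {label: i for i, label in enumerate(labels)}
    mat = np.zeros((len(labels), len(labels)), dtype=int)
    for t, p in zip(y_true, y_pred):
        mat[index[t], index[p]] += 1
    return mat

# Made-up example data, not taken from the dataset.
labels = ["cat", "frog", "truck"]
y_true = ["cat", "cat", "frog", "truck", "truck", "frog"]
y_pred = ["cat", "frog", "frog", "truck", "cat", "frog"]

mat = confusion_counts(y_true, y_pred, labels)

# Correct predictions sit on the diagonal, so accuracy = trace / total.
accuracy = np.trace(mat) / mat.sum()
print(mat)
print(accuracy)
```

Four of the six predictions are correct here, so the accuracy is 4/6. Because the notebook predicts on the same data the model was fitted on, that figure is a training accuracy.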

The model accuracy after cross-validation is approximately 87%.
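
The splitting scheme behind this cross-validation can be sketched in a few lines. The code below is an illustration of the idea rather than scikit-learn's implementation: `shuffle_split` is a hypothetical function, and the rounding of the test-set size is an assumption. Each iteration shuffles the sample indices and holds out 20% of them, without replacement.

```python
import numpy as np

def shuffle_split(n_samples, n_iter=10, test_size=0.2, seed=0):
    """Yield (train_idx, test_idx) pairs from independent random permutations."""
    rng = np.random.default_rng(seed)
    n_test = int(round(test_size * n_samples))
    for _ in range(n_iter):
        perm = rng.permutation(n_samples)
        # The first n_test shuffled indices are held out, the rest are used for training.
        yield perm[n_test:], perm[:n_test]

splits = list(shuffle_split(50, n_iter=10, test_size=0.2))

for train_idx, test_idx in splits[:2]:
    print(len(train_idx), len(test_idx))
```

Within one iteration the training and test indices never overlap, but test sets from different iterations can share samples, which is what distinguishes this scheme from k-fold cross-validation.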

In [ ]:

anni = a[:40000]
tt = train_lab.label[:40000]
mean_val = pd.DataFrame(mean_val)
numbersvm = svm.SVC(kernel='rbf', C=1).fit(mean_val,train_lab.label)
konni = a[49000:]
at = train_lab.label[49000:]

Model Building 2

Decision Trees:

With decision trees, the four intensity-based features are too few for the tree to classify properly. So we take all 1024 pixels as variables and train the model on them. Cross-validation is also done for this model.

In [ ]:

from sklearn import tree
# sklearn.cross_validation was removed in scikit-learn 0.20; sklearn.model_selection replaces it
from sklearn.model_selection import ShuffleSplit, cross_val_score
clf = tree.DecisionTreeClassifier()
clf.fit(a,train_lab.label)
tr_score = clf.score(a,train_lab.label)
#ts_score = clf.score(variable,train_lab.label)

# 10 random 80/20 splits over all 50000 rows
cv = ShuffleSplit(n_splits=10, test_size=0.2, random_state=None)
scoresTree = cross_val_score(clf,a,train_lab.label,cv = cv)
score_mean = numpy.mean(scoresTree)
score_mean

The accuracy of this model is only around 20%, as the raw pixel values are highly redundant and many of them are irrelevant.

Model Building 3

Principal Component Analysis:

Here, instead of using all 1024 pixel values, we reduce them to 50 principal components with Principal Component Analysis. We then build models on these components using a decision tree as well as a random forest.

In [16]:

# RandomizedPCA was removed in scikit-learn 0.20; PCA with svd_solver='randomized' replaces it
from sklearn.decomposition import PCA
n_components = 50
pca = PCA(n_components=n_components, svd_solver='randomized', whiten=True).fit(a)
print("Projecting the input data on the eigenfaces orthonormal basis")
x_train_pca = pca.transform(a)

Projecting the input data on the eigenfaces orthonormal basis
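
To show what the projection with whitening computes, here is a minimal sketch using an exact SVD in plain NumPy on synthetic data. It illustrates the technique and is not the randomized solver used above; `pca_whiten` and the demo array are hypothetical.

```python
import numpy as np

def pca_whiten(x, n_components):
    """Project x onto its top principal axes and scale each to unit variance."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing singular value.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    explained_var = s[:n_components] ** 2 / (x.shape[0] - 1)
    projected = centered @ components.T
    # Whitening divides each component by its standard deviation.
    return projected / np.sqrt(explained_var), explained_var

# Synthetic data: 200 samples, 20 features with decreasing spread.
rng = np.random.default_rng(0)
demo = rng.normal(size=(200, 20)) * np.linspace(5.0, 0.5, 20)

z, var = pca_whiten(demo, n_components=5)
print(z.shape)  # (200, 5)
print(var)
```

Each of the resulting columns is a linear combination of all input features, so the 50 values kept in the notebook are combinations of all 1024 pixels rather than a subset of 50 of them.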

Decision Tree:

In [19]:

from sklearn import tree
clf1 = tree.DecisionTreeClassifier()
clf1.fit(x_train_pca,train_lab.label)
tr_score1 = clf1.score(x_train_pca,train_lab.label)
#ts_score = clf.score(variable,train_lab.label)
# sklearn.cross_validation was removed in scikit-learn 0.20; sklearn.model_selection replaces it
from sklearn.model_selection import ShuffleSplit, cross_val_score
cv = ShuffleSplit(n_splits=10, test_size=0.2, random_state=None)
scoresTree1 = cross_val_score(clf1,x_train_pca,train_lab.label,cv = cv)
score_mean1 = numpy.mean(scoresTree1)
score_mean1

Out[19]:

0.23079999999999998

Random Forest:

In [22]:

###############Building a Randomforest classifier ########
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import ShuffleSplit, cross_val_score
# max_features='sqrt' is what 'auto' meant for classifiers; recent scikit-learn no longer accepts 'auto'
forest=RandomForestClassifier(n_estimators=50, criterion='gini', max_depth=None,
                              min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_features='sqrt',
                              max_leaf_nodes=None, bootstrap=True, oob_score=False, n_jobs=1, random_state=None, verbose=0,
                              warm_start=False, class_weight=None).fit(x_train_pca,train_lab.label)

cv = ShuffleSplit(n_splits=10, test_size=0.2, random_state=None)
scoresTree2 = cross_val_score(forest,x_train_pca,train_lab.label,cv = cv)
score_mean2 = numpy.mean(scoresTree2)
score_mean2

Out[22]:

0.37836000000000003

Conclusion:

Among all the models built, the support vector machine gives by far the best result. However, the SVM also takes a considerable amount of time to train.
