Skillset

A brief description of my technical skills

Data Science and Machine Learning skills.

For the past 3 years I have been working with Python, R and MATLAB, building tools and models to analyse data and uncover the insights hidden in it. Below are the statistical, data mining and machine learning techniques that I have implemented on various datasets; these programs are available on my GitHub profile.

Statistical techniques that I use in my day-to-day data science practice:

1. Regression Analysis.

2. Correlation Analysis.

3. Hypothesis Testing.

4. Probability Distributions.

5. Sampling Techniques.

6. Dimensionality Reduction.

Machine learning algorithms that I use in my day-to-day data science practice:

Supervised Learning - Classification Algorithms

1. Logistic Regression.

2. Naive Bayes.

3. Decision Trees.

4. K-Nearest Neighbours.

5. Random Forest.

6. Support Vector Machines.

7. Gradient Boosting.

8. Multi-label Classification.

Unsupervised Learning :-

1. Clustering algorithms - K-means, Fuzzy C-means, GKFCM.

2. Association Analysis - Apriori.

Neural Networks :-

1. TensorFlow Keras - CNNs and RNNs.

2. Transfer Learning, GANs.

Each of the machine learning and data mining techniques above has been implemented in at least one of my academic projects, self-initiated projects or online courses during my master's degree. Some examples of my implementations can be found in my GitHub repository.

Artificial Intelligence and Computational Intelligence

1. Natural Language Processing: Tokenization, Lemmatization & Stemming, Stop-word Removal, Relationship Extraction, Text Feature Extraction - TF-IDF.

2. Genetic Algorithm.

3. Particle swarm optimization.

Please refer to the home page for projects related to the above skills, and to the quick links menu for my GitHub profile, where most of these techniques are implemented.

Things that I am constantly learning

Most importantly, I am practicing building machine learning pipelines, data engineering concepts, SAS programming, Java and JavaScript. I have recently started building recommendation engines in IBM Watson Studio, which will be out shortly.


Disaster Pipeline - ML and NLP

The message dataset contains pre-labeled messages from real-life disaster events. This project builds a Natural Language Processing (NLP) pipeline with a multi-output classifier to categorize the messages.
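To give a feel for the idea, here is a minimal sketch of an NLP pipeline with a multi-output classifier in scikit-learn. It assumes TF-IDF features and one random forest per category, which may differ from the actual project; the helper name `build_pipeline`, the example messages and the two categories are made up for illustration and are not taken from the dataset.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import Pipeline


def build_pipeline():
    """Hypothetical helper: text -> TF-IDF features -> one classifier per category."""
    return Pipeline([
        # Tokenize, lowercase, drop English stop words and weight terms by TF-IDF.
        ("tfidf", TfidfVectorizer(lowercase=True, stop_words="english")),
        # Fit an independent random forest for every output column.
        ("clf", MultiOutputClassifier(
            RandomForestClassifier(n_estimators=50, random_state=0))),
    ])


# Toy data, invented for this sketch only.
messages = [
    "we need clean drinking water",
    "no water in the village since the storm",
    "people are injured and need a doctor",
    "send medical supplies and medicine",
]
# One row per message, one column per category: [water, medical].
labels = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])

model = build_pipeline()
model.fit(messages, labels)

# One 0/1 prediction per category for each new message.
predictions = model.predict(["we have no drinking water"])
print(predictions.shape)
```

Wrapping the vectorizer and the classifier in a single `Pipeline` keeps the text preprocessing and the model together, so the same steps are applied at training and prediction time.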

Airbnb Seattle Analysis

This project presents key insights from the Airbnb Seattle dataset using the CRISP-DM approach.

Open source contribution

This package creates dummy columns for a column with multiple values. The output is the original dataframe with all the dummy columns attached and the original multi-value column dropped.
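The behaviour described above can be sketched with plain pandas. This is only an illustration, not the package's actual API: the function name `add_dummy_columns`, the comma separator and the sample dataframe are assumptions made for this example.

```python
import pandas as pd


def add_dummy_columns(df, column, sep=","):
    """Hypothetical helper: expand a multi-value column into 0/1 dummy columns."""
    # One indicator column per distinct value found in the separated strings.
    dummies = df[column].str.get_dummies(sep=sep)
    # Drop the original multi-value column and attach the dummy columns.
    return pd.concat([df.drop(columns=[column]), dummies], axis=1)


# Sample data, invented for this sketch only.
listings = pd.DataFrame({
    "id": [1, 2, 3],
    "amenities": ["wifi,kitchen", "parking", "wifi,parking"],
})

result = add_dummy_columns(listings, "amenities")
print(result)
```

Each distinct value in the `amenities` column becomes its own indicator column, and the original column no longer appears in `result`.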