• Main
  • Machine Learning for Data Informatics (INF-552)
  • Generative Model for Text (LSTM) on Bertrand Russell's Work
      • Built a generative model to mimic the writing style of Bertrand Russell, the prominent British mathematician, philosopher, prolific writer, and political activist. Trained an LSTM on his work to reproduce his style and ideas.

      • Dataset
      • Code
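A minimal sketch of how a corpus is commonly prepared for this kind of model: the text is cut into fixed-length character windows, and each window is paired with the character that follows it. The character-level setup, the helper name `make_windows`, and the window size are assumptions for illustration; the project's actual preprocessing may differ.

```python
def make_windows(text, window=5, step=1):
    """Slice text into (input window, next character) training pairs.

    Hypothetical helper: characters are mapped to integer ids so the pairs
    could later be fed to a sequence model such as an LSTM.
    """
    chars = sorted(set(text))
    char_to_idx = {c: i for i, c in enumerate(chars)}
    inputs, targets = [], []
    for start in range(0, len(text) - window, step):
        chunk = text[start:start + window]
        inputs.append([char_to_idx[c] for c in chunk])
        # The target is the single character right after the window.
        targets.append(char_to_idx[text[start + window]])
    return inputs, targets, char_to_idx


inputs, targets, vocab = make_windows("abcabc", window=3)
print(inputs, targets)
```

A model trained on such pairs can then generate text one character at a time by repeatedly predicting the next character and sliding the window forward.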
  • (Deep) CNNs for Image Colorization
      • Used a convolutional neural network for image colorization, which turns a grayscale image into a color image. Converting an image to grayscale loses its color information, so converting a grayscale image back to a color version is not an easy job. I used the CIFAR-10 dataset.

      • Tools Used

        scikit-learn, pandas, Keras, TensorFlow

      • Dataset
      • Code
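To make the loss of color information concrete, the sketch below converts RGB pixels to gray and shows two different colors collapsing to the same gray value, which is why the inverse mapping is ambiguous. The ITU-R BT.601 luma weights and the function name `to_gray` are assumptions; the source does not say how the grayscale images were produced.

```python
# ITU-R BT.601 luma weights, scaled by 1000 so the arithmetic stays exact.
# Using these particular weights is an assumption, not taken from the project.
WEIGHTS = (299, 587, 114)


def to_gray(rgb):
    """Collapse an (R, G, B) pixel with 0-255 channels to one gray value."""
    r, g, b = rgb
    return (WEIGHTS[0] * r + WEIGHTS[1] * g + WEIGHTS[2] * b) / 1000


dark_blue = (1, 0, 157)
dark_green = (0, 31, 0)
# Both pixels map to the same gray level, so gray alone cannot tell them apart.
print(to_gray(dark_blue), to_gray(dark_green))
```

A colorization network therefore has to infer plausible colors from context (shapes, textures, neighboring pixels) rather than invert a formula.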
  • Multi-class and Multi-label Classification Using Support Vector Machines
      • Multi-class and multi-label classification using support vector machines on the Anuran Calls dataset, and K-means clustering on a multi-class and multi-label data set
        Approaches: Monte Carlo simulation, SVMs with linear and Gaussian kernels, L1-penalized SVMs, SMOTE, CH or Gap statistics or scree plots

      • Dataset
      • Code
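SMOTE appears in several of these projects as a remedy for class imbalance. The sketch below illustrates its core idea only: a synthetic minority sample is placed at a random position on the segment between a minority point and one of its nearest minority neighbours. The function name `smote_like` and its parameters are hypothetical, and it assumes at least two minority samples; in practice a library implementation would be used instead.

```python
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (illustrative only)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


minority = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
print(smote_like(minority, n_new=3))
```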
  • Supervised, Semi-supervised and unsupervised learning on Breast Cancer Dataset
      • Supervised, Semi-supervised and unsupervised learning on Breast Cancer Dataset
        Approaches: Monte Carlo simulation, supervised learning, clustering, spectral clustering, self-training, L1-penalized SVMs, SMOTE, CH or Gap statistics or scree plots

      • Dataset
      • Code
  • Active Learning Using Support Vector Machines
      • Binary classification problem on banknote authentication dataset

        Approaches:

        Active and passive learning, Monte Carlo simulation, SVMs with linear and Gaussian kernels, L1-penalized SVMs, SMOTE, CH or Gap statistics or scree plots
      • Dataset
      • Code
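One common way to implement the active learner with a linear SVM is uncertainty sampling: at each round, the unlabeled points closest to the current separating hyperplane are chosen for labeling, whereas a passive learner picks points at random. The sketch below shows only that selection step. The function names, the weight vector `w`, and the bias `b` are hypothetical, and whether the project used exactly this query strategy is an assumption.

```python
def decision_value(x, w, b):
    """Signed score w.x + b of point x under a linear classifier."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def most_uncertain(pool, w, b, k):
    """Return indices of the k pool points closest to the hyperplane.

    |w.x + b| is proportional to the distance from the hyperplane
    (the factor 1/||w|| is the same for every point), so ranking by it
    is enough.
    """
    ranked = sorted(
        range(len(pool)), key=lambda i: abs(decision_value(pool[i], w, b))
    )
    return ranked[:k]


pool = [(3.0, 0.0), (0.5, 0.0), (-0.1, 2.0), (-4.0, 1.0)]
print(most_uncertain(pool, w=(1.0, 0.0), b=0.0, k=2))
```

After the selected points are labeled and added to the training set, the SVM is refit and the selection repeats.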
  • The LASSO and Boosting for Regression, Tree-Based Methods
      • Communities and Crime dataset, APS Failure dataset
        Approaches: Data imputation techniques, linear and ridge regression, PCR models, boosting trees, multivariate regression trees, L1-penalized gradient boosting trees, XGBoost, random forest, out-of-bag error estimate, ROC, AUC, Weka, SMOTE

      • Dataset (Crime)
      • Dataset (APS Failure)
      • Code
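As a small illustration of one data imputation technique, the sketch below fills missing entries with the mean of the observed values in the same column. Which imputation methods the project actually used is not stated, so mean imputation, the function name `impute_column_means`, and the use of `None` for missing values are assumptions. It also assumes every column has at least one observed value.

```python
def impute_column_means(rows):
    """Replace None entries with the mean of the observed values per column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        means.append(sum(observed) / len(observed))
    # Build a new table; the input rows are left unmodified.
    return [
        [means[j] if value is None else value for j, value in enumerate(row)]
        for row in rows
    ]


table = [[1.0, None], [3.0, 4.0], [None, 8.0]]
print(impute_column_means(table))
```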
  • Time Series Classification
      • An interesting task in machine learning is the classification of time series. In this problem, I tried to classify human activities based on time series obtained from a wireless sensor network.
        Approaches: Time-domain features, bootstrap confidence intervals, binary classification using logistic regression, p-values, backward selection using sklearn.feature_selection, stratified cross-validation, scikit-learn's Recursive Feature Elimination, ROC, AUC, L1-penalized logistic regression, L1 regularization, L1-penalized multinomial regression, Naive Bayes classifiers using both Gaussian and multinomial priors

      • Dataset
      • Code
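The first two approaches can be sketched with the standard library alone: summarizing each series by a few time-domain statistics, then putting a percentile bootstrap confidence interval around a statistic of interest. The particular feature set, the function names, and the 90% level are assumptions for illustration, not the project's exact choices.

```python
import random
import statistics


def time_domain_features(series):
    """Summarize one time series with simple time-domain statistics."""
    return {
        "min": min(series),
        "max": max(series),
        "mean": statistics.mean(series),
        "median": statistics.median(series),
        "std": statistics.pstdev(series),  # population standard deviation
    }


def bootstrap_ci(values, stat=statistics.mean, n_boot=1000, alpha=0.10, seed=0):
    """Percentile bootstrap confidence interval for stat(values)."""
    rng = random.Random(seed)
    n = len(values)
    # Recompute the statistic on n_boot resamples drawn with replacement.
    stats = sorted(
        stat([rng.choice(values) for _ in range(n)]) for _ in range(n_boot)
    )
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


signal = [1.0, 2.0, 3.0, 4.0, 5.0]
print(time_domain_features(signal))
print(bootstrap_ci(signal))
```

The resulting feature dictionaries, one per series, form the rows of the table that a classifier such as logistic regression is then trained on.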
  • Cycle power plant dataset
      • Cycle power plant dataset (arem)
        Approaches: Scatterplots, box plots, Classification using KNN, Learning curve, Euclidean, Minkowski, Manhattan, Chebyshev, Mahalanobis distances, simple linear regression model, association of interactions of predictors with the response using p-values, KNN Regression.

      • Dataset
      • Code
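The distance metrics listed above differ only in how they combine per-coordinate differences, which the following sketch of a K-nearest-neighbours classifier makes explicit. Manhattan and Euclidean distances are the Minkowski distance with p = 1 and p = 2, and Chebyshev is its limit as p grows. The function names and the toy data are hypothetical, and Mahalanobis distance is left out because it needs a covariance estimate.

```python
from collections import Counter


def minkowski(a, b, p):
    """Minkowski distance; p=1 is Manhattan, p=2 is Euclidean."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def chebyshev(a, b):
    """Largest absolute coordinate difference (Minkowski as p -> infinity)."""
    return max(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_x, train_y, query, k, dist):
    """Majority vote among the k training points nearest to query."""
    nearest = sorted(
        range(len(train_x)), key=lambda i: dist(train_x[i], query)
    )[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]


train_x = [(0, 0), (0, 1), (1, 0), (5, 5), (6, 5)]
train_y = ["a", "a", "a", "b", "b"]
euclidean = lambda a, b: minkowski(a, b, 2)
print(knn_predict(train_x, train_y, (0.2, 0.2), k=3, dist=euclidean))
print(knn_predict(train_x, train_y, (5.5, 5.0), k=3, dist=chebyshev))
```

Swapping the `dist` argument is all it takes to compare metrics on the same data.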
  • Classification using vertebral column dataset
      • Vertebral column dataset
        Approaches: Scatterplots, box plots, Classification using KNN, Learning curve, Euclidean, Minkowski, Manhattan, Chebyshev, Mahalanobis distances, simple linear regression model, association of interactions of predictors with the response using p-values, KNN Regression.

      • Dataset
      • Code

© 2019 Sumit Parwal. All rights reserved.