[vc_row][vc_column][vc_column_text]

R Programming with Machine Learning and Data Science Tools

[/vc_column_text][/vc_column][/vc_row][vc_row][vc_column css=”.vc_custom_1456723760841{border-top-width: 2px !important;border-right-width: 2px !important;border-bottom-width: 2px !important;border-left-width: 2px !important;padding-top: 20px !important;padding-right: 20px !important;padding-bottom: 20px !important;padding-left: 20px !important;background-color: #eeeeee !important;border-left-color: #467db1 !important;border-left-style: solid !important;border-right-color: #467db1 !important;border-right-style: solid !important;border-top-color: #467db1 !important;border-top-style: solid !important;border-bottom-color: #467db1 !important;border-bottom-style: solid !important;}” el_class=”project_short_document_div”][vc_empty_space height=”0px”][ult_tab_element tab_style=”Style_2″ tab_background_color=”#467db1″ tab_hover_background_color=”#467db1″ acttab_background=”rgba(70,125,177,0.92)” tab_describe_color=”#262626″ enable_bg_color=”#ededed” container_border_style1=”border-style:solid;|border-width:1px;border-radius:0px;|border-color:#cccccc;”][single_tab title=”Software Requirements” tab_id=”1456231644788-0-6″][vc_column_text]

  • Operating systems: Ubuntu 14.04 LTS / Windows

  • IDE: RStudio

  • Databases: PostgreSQL / MySQL / SQLite

[/vc_column_text][/single_tab][/ult_tab_element][vc_empty_space height=”35px”][vc_column_text]

S. No. | Libraries in R | Type | Description
1 | ggplot2, googleVis, corrplot, lattice, ggfortify, ggrepel, ggalt, ggtree, ggtech, ggplot2 Extensions, rgl, Cairo, extrafont, showtext, animation, gganimate, misc3d, xkcd, imager, hrbrthemes, waffle, dendextend, r2d3, patchwork | Data Visualization | Building graphs from scales and layers, combining multiple plots, and arranging complex plot layouts using mathematical operators
2 | plyr, dplyr, data.table, lubridate, reshape2, readr, haven, tidyr, broom, rlist, jsonlite, ff, stringi, stringr, bigmemory, fuzzyjoin, tidyverse | Data Manipulation | Supporting consistent, fast, and portable text processing and handling complex data formats such as date-times and time spans
3 | missForest | Missing Value Imputation | Imputing mixed-type data (continuous and/or categorical) in a parallel manner
4 | missMDA | Missing Value Imputation | Handling missing values in large and complex datasets with multivariate analysis
5 | outliers | Outlier Detection | Providing a set of tests and functions to detect outliers
6 | Extreme Values in R (evir) | Outlier Detection | Estimating extreme quantiles using several functions such as block maxima, exploratory data analysis, peaks over thresholds, GEV/GPD distributions, and point processes
7 | features | Feature Selection | Extracting features such as the mean value, local maxima and minima, first and second derivatives, and noise from discretely sampled functional data
8 | Regularized Random Forest (RRF) | Feature Selection | Selecting features based on a random forest
9 | FactoMineR | Feature Selection | Providing exploratory data analysis methods, including Principal Component Analysis (PCA), Correspondence Analysis (CA), Multiple Correspondence Analysis (MCA), hierarchical cluster analysis, and multiple factor analysis
10 | Canonical Correlation Analysis (CCA) | Feature Selection | Performing significance tests such as Monte Carlo and asymptotic tests
11 | Companion to Applied Regression (car) | Continuous Regression | Producing type II and type III ANOVA tables using its Anova function
12 | randomForest | Classification, Regression | Creating a large number of decision trees for regression and classification and assessing proximities among data points in unsupervised mode
13 | rminer | Ordinal Regression | Supporting data mining with classification and regression methods
14 | CORElearn | Ordinal Regression | Providing a set of classification, regression, and feature evaluation methods for datasets with ordinal features
15 | Classification And REgression Training (caret) | Classification, Regression | Creating predictive models and streamlining the process through a set of functions
16 | bigrf | Classification, Regression | Handling very large datasets using random forest algorithms; building multiple random forests in parallel to process such datasets effectively
17 | Clustering for Business Analytics (cba) | Clustering | Manipulating data and computing cross distances efficiently with the help of Proximus, ROCK, and utility functions
18 | Rankcluster | Clustering | Clustering multivariate ranking data with a model-based approach
19 | forecast | Time Series | Forecasting from time series or time series models, depending on the class of the first argument
20 | Linear Time Series Analysis (ltsa) | Time Series | Modeling linear time series for simulation, forecasting, and log-likelihood computation
21 | survival | Survival Analysis | Predicting the time at which a particular event occurs by creating a survival object from the variables
22 | BaSTA | Survival Analysis | Estimating survival trends and age-specific mortality through multiple Markov Chain Monte Carlo (MCMC) simulations for a large number of records with unknown birth and death times
23 | Least-Squares Means (lsmeans) | General Model Validation | Computing least-squares means for many linear, generalized linear, and mixed models
24 | Comparison | General Model Validation | Comparing a model object with a comparison object for validation
25 | RegTest | Regression Validation | Conducting regression tests for funnel plot asymmetry on 'rma' objects
26 | ACD | Regression Validation | Analyzing categorical data with missing or complete responses
27 | binomTools | Classification Validation | Performing diagnostics for binomial regression models using a set of diagnostic methods
28 | Daim | Classification Validation | Evaluating classification accuracy through performance measures including sensitivity, specificity, and AUC, estimated by bootstrap and repeated k-fold cross-validation
29 | clusteval | Clustering Validation | Evaluating clusterings, individual clusters, and clustering algorithms
30 | sigclust | Clustering Validation | Assessing the statistical significance of clustering results
31 | pROC | Clustering Validation | Computing confidence intervals for partial Receiver Operating Characteristic (ROC) curves and comparing curves with statistical tests
32 | timeROC | Clustering Validation | Estimating dynamic or cumulative time-dependent ROC curves
33 | plotly, ggvis, DataTables, rCharts, heatmaply, d3heatmap, DiagrammeR, dygraphs, formattable, leaflet, MetricsGraphics, networkD3, scatterD3, rbokeh, threejs, timevis, visNetwork, wordcloud2, highcharter | HTML Widgets | Providing interfaces to visualize data as plots; offering numerous chart types with a simple syntax
34 | knitr, rmarkdown, slidify, tinytex, xtable, rapport, Sweave, texreg, checkpoint, brew, ReporteRs, bookdown, ezknitr, drake | Reproducible Research | Supporting conversion between various formats and reproducible report templates
35 | mlr | Machine Learning | Providing a set of classification and regression techniques; comprising generic resampling, filter and wrapper methods, hyperparameter tuning methods, and so on
36 | eXtreme Gradient Boosting (xgboost) | Learning and Prediction | Supporting regression, classification, and ranking objective functions
37 | gbm | Regression Methods | Supporting generalized boosted regression modeling and estimating the optimal number of iterations through an out-of-bag estimator
38 | prophet | Time Series | Forecasting time series data with non-linear trends and handling outliers, missing data, and shifts in trends
39 | Quality Control Charts (qcc) | Quality Control | Plotting Operating Characteristic (OC) curves, Pareto charts, multivariate charts, cause-and-effect charts, and Shewhart charts for attribute, count, and continuous data
40 | shiny, shinyjs, RCurl, curl, httr, httpuv, XML, rvest, OpenCPU, Rfacebook, RSiteCatalyst, plumber | Web Technologies and Services | Providing client interfaces for easily accessing web pages and services
41 | parallel, Rmpi, future, SparkR, DistributedR, ddR, sparklyr, batchtools | Parallel Computing | Providing parallel and interactive computing environments
42 | Rcpp, Rcpp11, compiler | High Performance | Providing integration between different programming languages
43 | rJava, jvmr, rJython, rPython, runr, RJulia, JuliaCall, RinRuby, R.matlab, RcppOctave, RSPerl, V8, htmlwidgets, rpy2 | Language API | Providing interfaces to other programming languages
44 | RODBC, DBI, elastic, mongolite, odbc, RMariaDB, RMySQL, ROracle, RPostgreSQL, RSQLite, RJDBC, rmongodb, rredis, RCassandra, RHive, RNeo4j, rpostgis | Database Management | Providing interfaces for accessing databases
45 | AnomalyDetection, ahaz, arules, bigrf, bigRR, bmrm, Boruta, BreakoutDetection, bst, CausalImpact, C50, caret, CORElearn, CoxBoost, Cubist, e1071, earth, elasticnet, ElemStatLearn, evtree, forecast, forecastHybrid, prophet, FSelector, frbs, GAMBoost, gamboostLSS, gbm, glmnet, glmpath, GMMBoost, grplasso, grpreg, h2o, hda, ipred, kernlab, klaR, kohonen, lars, lasso2, LiblineaR, lme4, LogicReg, maptree, mboost, mlr, mvpart, MXNet, ncvreg, nnet, oblique.tree, pamr, party, partykit, penalized, penalizedLDA, penalizedSVM, quantregForest, randomForest, randomForestSRC, ranger, rattle, rda, rdetools, REEMtree, relaxo, rgenoud, rgp, Rmalschains, rminer, ROCR, RoughSets, rpart, RPMM, RSNNS, Rsomoclu, RWeka, RXshrink, sda, SDDA, SuperLearner, subsemble, svmpath, tgp, tree, varSelRF, xgboost | ML | Learning from high-dimensional and large-scale data; analyzing, manipulating, and representing patterns and transaction data
46 | text2vec, tm, openNLP, koRpus, zipfR, NLP, LDAvis, topicmodels, syuzhet, SnowballC, quanteda, MonkeyLearn, tidytext, utf8 | Natural Language Processing | Analyzing a set of documents using text mining tools; supporting natural language text processing in different languages
47 | coda, mcmc, MCMCpack, R2WinBUGS, BRugs, rjags, rstan | Bayesian | Providing interfaces for Bayesian analysis
48 | lpSolve, minqa, nloptr, ompr, Rglpk, ROL | Optimization | Solving optimization problems, including integer, linear, mixed-integer, transportation, and assignment problems
49 | quantmod, TTR, PerformanceAnalytics, zoo, xts, tseries, fAssets | Finance | Building technical trading rules; building, trading, and analyzing quantitative financial trading strategies
50 | Bioconductor, genetics, gap, ape, pheatmap | Bioinformatics and Biostatistics | Offering control over appearance and dimensions; analyzing genetic data and evolution
51 | igraph, network, sna, netdiffuseR, networkDynamic, ndtv, statnet, ergm, latentnet, tnet, rgexf, visNetwork | Network Analysis | Visualizing network data and handling large graphs efficiently through statistical analysis
52 | magick, imager | Image Processing | Supporting different image manipulations and a variety of image formats; processing images of up to four dimensions quickly

[/vc_column_text][/vc_column][/vc_row]
