Python-Based Machine Learning and Data Science Tools | S-Logix

Python Development Tools and Platforms for Machine Learning and Data Science Projects

Software Requirements

Operating System: Ubuntu 14.04 LTS / Windows
IDE: Spyder / PyCharm
Databases: PostgreSQL / MySQL / SQLite

S.No.	Python Libraries	Type	Description
1	NumPy	Numerical Operations	Supporting scientific computing with high-level mathematical functions over large, multi-dimensional arrays and matrices
2	Matplotlib, Seaborn, Bokeh, Plotly, NetworkX, Basemap, d3py, ggplot, prettyplotlib	Visualization	Visualizing data from Python quickly; plotting 2D graphs such as bar charts, line plots, histograms, error charts, power spectra, and scatter plots across platforms in a few lines of code
3	scikit-learn, Shogun, Pattern, Pylearn2, PyMC	Machine Learning (ML) Algorithms	Providing machine learning algorithms for classification, clustering, and regression; interoperating with numerical and scientific libraries such as NumPy and SciPy
4	Pandas	Data Analysis	Offering high-performance data structures and operations for manipulating numerical tables and time series
5	NLTK	Natural Language Processing (NLP)	Analyzing and understanding written human language data, primarily English; providing easy-to-use interfaces to over 50 corpora and lexical resources; supporting tokenization, stemming, tagging, parsing, and semantic reasoning
6	Statsmodels	Statistical Analysis	Conducting statistical data exploration and statistical tests; performing statistical computations such as descriptive statistics; providing classes and functions to estimate different statistical models
7	PyBrain	Neural Network	Providing algorithms for reinforcement learning, neural networks, unsupervised learning, and evolution to analyze large-scale data
8	Gensim	Topic Modeling	Supporting natural language processing and unsupervised topic modeling through statistical machine learning; supporting automatic extraction of semantic topics from documents
9	Keras, TensorFlow, Theano	Deep Learning	Providing fast numerical computation for deep neural networks; efficiently evaluating mathematical expressions, especially those involving matrices
10	Scrapy	Web Crawling	Extracting required data from websites in a simple and fast way
11	SciPy, Dask, Numba, HPAT, Cython	Data Science Tools	Performing scientific computing, including special functions, integration, linear algebra, optimization, interpolation, Ordinary Differential Equation (ODE) solvers, Fast Fourier Transform (FFT), and image processing; optimizing machine code at runtime
12	HDF5	Data Manipulation	Storing huge amounts of numerical data and manipulating that data easily from NumPy
13	SymPy	Symbolic Mathematics	Supporting symbolic mathematics, with the aim of serving as a full-featured Computer Algebra System (CAS)
14	csvkit, PyTables, SQLite3	Storage and Data Formatting	Converting other file formats such as JSON and Excel to CSV and working with CSV files; managing hierarchical datasets and accessing large-scale databases
15	Cryptography, pyOpenSSL, passlib, requests-oauthlib, ecdsa, PyCryptodome, service-identity	Security	Providing low-level cryptographic primitives; supporting extensive error handling and providing cryptographic authority; reporting bugs and hashing data
16	NumPy, SciPy, matplotlib, OpenCV, scikit-learn, scikit-image, ilastik	Image Processing	Providing a set of algorithms for image processing; supporting geometric transformations, segmentation, filtering, color space manipulation, morphology, analysis, and feature detection
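
Example: JSON to CSV to SQLite

The storage and data formatting row above (item 14) describes converting JSON to CSV and then querying the result from a database. The following is a minimal sketch of that workflow using only Python's standard library (json, csv, and sqlite3) rather than csvkit itself, which is primarily a set of command-line tools. The sample records, the libraries table, and the function names json_to_csv and load_csv_into_sqlite are hypothetical and chosen only for illustration.

```python
import csv
import io
import json
import sqlite3

# Hypothetical sample records; not taken from any real dataset.
JSON_TEXT = (
    '[{"id": 1, "library": "NumPy", "type": "Numerical Operations"},'
    ' {"id": 2, "library": "Pandas", "type": "Data Analysis"}]'
)


def json_to_csv(json_text):
    """Convert a JSON array of flat objects into CSV text."""
    records = json.loads(json_text)
    buffer = io.StringIO()
    # Use the keys of the first record as the CSV header.
    writer = csv.DictWriter(
        buffer, fieldnames=list(records[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def load_csv_into_sqlite(csv_text, connection):
    """Create the (hypothetical) libraries table and insert every CSV row."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    connection.execute(
        "CREATE TABLE libraries (id INTEGER PRIMARY KEY, library TEXT, type TEXT)"
    )
    # Named placeholders are filled from each row dictionary.
    connection.executemany(
        "INSERT INTO libraries (id, library, type) VALUES (:id, :library, :type)",
        rows,
    )
    connection.commit()
    return len(rows)


csv_text = json_to_csv(JSON_TEXT)
connection = sqlite3.connect(":memory:")  # in-memory database, nothing is written to disk
row_count = load_csv_into_sqlite(csv_text, connection)
library_names = [
    name
    for (name,) in connection.execute("SELECT library FROM libraries ORDER BY id")
]
print(csv_text)
print(row_count, library_names)
```

An in-memory database keeps the sketch free of side effects; passing a file path to sqlite3.connect instead would persist the table. For larger or hierarchical datasets, the PyTables library listed in the same row is likely a better fit than a single flat table.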