Home

Welcome to the Data-Science-on-the-palms wiki!

Until the project is finished, this wiki page will serve as a visual record of its development. Below is the sequence of steps that will be implemented gradually. Each stage will come with a brief description of its purpose and the ideas I would like to implement.

1. Compiling a table of contents for the entire project
2. Editing README.md, adding a verbal description of sections and subsections
3. Filling in content sequentially, following the table of contents

3.1 Tools

3.1.1 Python
   a) Basic knowledge (syntax, variables, loops, I/O…)
   b) Data Structures
   c) OOP (Object Oriented Programming)
   d) Standard libraries
   e) Exceptions*
   f) Tests*
3.1.2 Mathematics knowledge
   a) Probability theory
   b) Statistics
3.1.3 Libraries
   a) Pandas
   b) Matplotlib
   c) NumPy
   d) Scikit-learn
3.1.4 Databases
   a) SQL (language)
   b) MySQL, PostgreSQL

3.2 Data Collection

3.2.1 Think before you start
3.2.2 Work with different data types
3.2.3 Reading from the common data storage systems
3.2.4 Types of secondary data sources

3.3 Data Preprocessing

3.3.1 Data cleaning
3.3.2 Data integration
3.3.3 Data transformation
3.3.4 Data reduction
3.3.5 Data discretization

3.4 Model Training

3.5 Model Evaluation

Tools

Python

Basic knowledge (syntax, variables, loops, I/O…)

Data Structures

OOP (Object Oriented Programming)

Standard libraries

Exceptions*

Tests*

Mathematics knowledge

Probability theory

Statistics

Libraries

Pandas

(1) Add more hot commands to the file ‘pandas hot commands’; (2) Add short descriptions of the commands and links to their documentation for faster access; (3) Add a file ‘Pandas basic opportunities’ with a brief intro to the library and a demonstration on real data.

