Denis Neustroev edited this page Dec 15, 2018 · 2 revisions

Welcome to the Data-Science-on-the-palms wiki!

Until the project is finished, this wiki page will serve as a visual record of its progress. Below is the sequence of steps that will be implemented gradually. Each stage will come with a brief description of its purpose and of the ideas I would like to implement.

  1. Compiling a table of contents for the entire project

  2. Editing README.md, adding a verbal description of sections and subsections

  3. Filling in content sequentially, following the table of contents

    3.1 Tools

    3.1.1 Python
       a) Basic knowledge (syntax, variables, loops, I/O…)
       b) Data Structures
       c) OOP (Object-Oriented Programming)
       d) Standard libraries
       e) Exceptions*
       f) Tests*
    3.1.2 Mathematical knowledge
       a) Probability theory
       b) Statistics
    3.1.3 Libraries
       a) Pandas
       b) Matplotlib
       c) NumPy
       d) Scikit-learn
    3.1.4 Databases
       a) SQL (language)
       b) MySQL, PostgreSQL

    3.2 Data Collection

    3.2.1 Think before you start
    3.2.2 Work with different data types
    3.2.3 Reading from common data storage systems
    3.2.4 Types of secondary data sources

    3.3 Data Preprocessing

    3.3.1 Data cleaning
    3.3.2 Data integration
    3.3.3 Data transformation
    3.3.4 Data reduction
    3.3.5 Data discretization

    3.4 Model Training

    3.5 Model Evaluation

(1) Add more hot commands to the file ‘pandas hot commands’; (2) Add short descriptions and links to the documentation of each command for faster access; (3) Add a file ‘Pandas basic opportunities’ with a brief intro to the library and a demonstration on real data.