Python for Data Analysts & Quants

Use Python and its statistical computing libraries to analyse and visualise your data and to extract actionable insights.

Our Python for Data Analysts and Quants training course covers an introduction to the core concepts of the Python language and data science, ultimately focusing on big data analytics including how best to manipulate and visualise your data with Python's excellent library support.

This intensive course is intended for Data Scientists, Quants, Data Analysts and Business Intelligence experts who want to understand how to use Python in their data-oriented environment.

Practical exercises and interactive walk-throughs are used throughout, so attendees have the opportunity to apply the concepts to real data science applications, from exploratory data analysis to predictive analytics.

The Python Machine Learning syllabus is designed to help analysts, researchers, BI experts and developers become familiar with the implementation of Machine Learning solutions, through the use of tools in the Python programming language ecosystem.

Using a mix of lecture-style presentation and interactive examples, the course provides a comparison between Supervised and Unsupervised Learning, and offers an overview of core algorithms for predictive analytics, tackling tasks such as classification, clustering, regression analysis and dimensionality reduction. Notions of Neural Networks and the latest developments in Deep Learning are also discussed.

What you will learn:

Learn core concepts of the Python environment, language and data science
Use Python to get your data in shape, and take advantage of its features and techniques to gain actionable insights
Explore Python data science tools such as NumPy and Pandas
Apply the proposed concepts to real data science applications
Acquire knowledge on how to access and prepare data
Use data analysis to perform the computation of summary information and basic statistics
Use effective data visualisation techniques to help you with complex data structures
Learn about framing a business application as a Machine Learning task
Understand the role of labelled data, data cleaning and data transformation in Machine Learning systems
Explore feature engineering techniques to extract useful attributes from your data
Implement supervised / unsupervised learning algorithms using Python
Evaluate the quality of your models, using evaluation metrics, model introspection and error analysis

Audience

Quants, Data Scientists, Data Analysts, Financial Analysts and Business Intelligence experts who are new to Python.

Python developers who are new to Data Science or want to know more about the Python tools for Data Analysis.

Course outline:

Installation & Packaging

Installation, packaging and virtualisation of Python using Conda.
We'll set up Python using the Anaconda distribution, a free and enterprise-ready Python distribution that includes hundreds of the most popular Python packages for science, math, engineering and data analysis.
Anaconda comes with Conda, a cross-platform tool for managing packages and virtual environments. We'll also set up Jupyter, a web-based interactive environment where users can organise, write and run their Python code in notebooks.

Python Core Concepts & Best Practices

Introduction to Python basic concepts, data structures and control flow structures.
Overview of how Python is used for Data Science and Data Analytics projects.
Notions of Object-Oriented Programming and Functional Programming, applied to the design of Python applications and analysis pipelines using best practices.
Core data types in Python
Control flow statements
Defining and using custom functions
The Python standard library
Working with data:
Iteration and list comprehensions
Accessing raw data on file (CSV, JSON, ...)
Working with dates and times
Object-Oriented Programming in Python
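
As a flavour of the core-language topics listed above, here is a minimal sketch that uses only the standard library. The CSV content, column names and ticker are made up for illustration; in the course, raw data would normally be read from a file.

```python
import csv
import io
from datetime import datetime

# Hypothetical raw CSV content; in practice this would come from a file on disk.
raw = (
    "date,ticker,close\n"
    "2024-01-02,ABC,101.5\n"
    "2024-01-03,ABC,103.0\n"
    "2024-01-04,ABC,99.75\n"
)

# Parse each row into a dictionary keyed by the header names.
rows = list(csv.DictReader(io.StringIO(raw)))

# List comprehensions convert the text fields into typed Python values.
dates = [datetime.strptime(r["date"], "%Y-%m-%d").date() for r in rows]
closes = [float(r["close"]) for r in rows]

# Day-over-day changes, pairing each close with the one that follows it.
changes = [later - earlier for earlier, later in zip(closes, closes[1:])]

print(dates[0].isoformat(), changes)
```

The same parsing could be done with an explicit `for` loop; the comprehension form is simply the more compact idiom.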

Python Data Science Tools

We'll explore the most important Python tools for Data Science.
NumPy, short for Numerical Python, is one of the main building blocks for scientific computing in Python.
It provides high-speed manipulation of multi-dimensional arrays and is used by higher-level libraries (like pandas) to support sophisticated analytics with fast computation.
Pandas is a highly performant library for data manipulation and data analysis in Python. It's built on top of NumPy and optimised for performance, while offering a high-level interface.
We'll discuss how to create and manipulate Series and DataFrame objects in pandas, accessing data from multiple sources, cleaning and transforming data sets to get them in the right shape for advanced analysis.
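
A minimal sketch of how the two libraries fit together, assuming NumPy and pandas are installed (both ship with Anaconda). The price figures and column names are invented for illustration.

```python
import numpy as np
import pandas as pd

# A 2-D NumPy array: operations apply element-wise without explicit loops.
prices = np.array([[100.0, 102.0, 101.0],
                   [50.0, 49.0, 51.0]])
doubled = prices * 2
row_means = prices.mean(axis=1)  # one mean per row

# A pandas DataFrame adds labelled columns on top of the array data.
# The array is transposed so that each asset becomes a column.
df = pd.DataFrame(prices.T, columns=["asset_a", "asset_b"])

# Each column is a Series; arithmetic between Series aligns on the index.
df["spread"] = df["asset_a"] - df["asset_b"]

print(row_means)
print(df)
```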

Accessing & Preparing Data

Data can come in multiple formats and from multiple sources. We'll examine how to read and write data from local files in different formats, and how to access data from remote sources.
Data cleaning and data preparation are the first steps in a data analysis project, so we'll discuss how to perform data transformation to get ready for further analysis.
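
The following sketch shows one possible load-and-clean sequence with pandas. The data is an in-memory stand-in for a local CSV file, and the column names and cleaning rules (drop duplicates, require a price, default missing volume to zero) are assumptions chosen for the example, not a prescribed recipe.

```python
import io
import pandas as pd

# In-memory stand-in for a local CSV file with typical problems:
# inconsistent casing, a duplicated row and missing values.
raw = io.StringIO(
    "trade_date,ticker,price,volume\n"
    "2024-01-02,abc,101.5,1000\n"
    "2024-01-03,ABC,,1200\n"
    "2024-01-03,ABC,,1200\n"
    "2024-01-04,xyz,55.0,\n"
)

# Read the file, parsing the date column into datetime values.
df = pd.read_csv(raw, parse_dates=["trade_date"])

clean = (
    df.drop_duplicates()                                     # remove exact duplicate rows
      .assign(ticker=lambda d: d["ticker"].str.upper())      # normalise casing
      .dropna(subset=["price"])                              # keep only rows with a price
      .fillna({"volume": 0})                                 # treat missing volume as zero
)

# Write the cleaned data back out as CSV text (a file path could be passed instead).
out = clean.to_csv(index=False)
print(out)
```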

Data Analysis

With our data in the right shape, we're ready to analyse them in order to extract useful insights.
We'll perform the computation of summary information and basic statistics from data sets.
We'll approach split-apply-combine operations with DataFrames, in order to perform advanced transformations and reshape our data with pandas.
We'll query our DataFrames using the powerful group-by method.
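
A small sketch of summary statistics and split-apply-combine with pandas. The `desk` and `notional` columns are hypothetical.

```python
import pandas as pd

# Hypothetical trade data: one row per trade.
trades = pd.DataFrame({
    "desk": ["rates", "rates", "fx", "fx", "fx"],
    "notional": [10.0, 30.0, 5.0, 15.0, 10.0],
})

# Basic summary statistics for a single column.
print(trades["notional"].describe())

# Split by desk, apply several aggregations, combine into one table.
summary = trades.groupby("desk")["notional"].agg(["count", "sum", "mean"])
print(summary)

# transform() returns a result aligned with the original rows,
# which makes per-group ratios straightforward.
desk_totals = trades.groupby("desk")["notional"].transform("sum")
trades["share_of_desk"] = trades["notional"] / desk_totals
```

`agg` collapses each group to one row, while `transform` keeps the original shape; which one fits depends on whether the result is a report or a new column.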

Data Visualisation

Data analysis benefits from the visualisation of data. If a picture is worth a thousand words, complex data structures can be easier to understand and analyse using effective visualisation techniques.
Communicating the results with non-technical users is also a challenge that visualisation techniques help to overcome.

More Detail

Environment Set-up

The Anaconda distribution as Python Data Science platform
Overview of Python virtual environment set-up
Running code in Jupyter notebooks

Python Data Science libraries

NumPy:

Working with NumPy arrays
Essential operations with NumPy arrays
Stats and linear algebra with NumPy
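
To illustrate the NumPy topics above, here is a brief sketch of basic statistics and linear algebra. The return figures and the linear system are made up for the example.

```python
import numpy as np

# Hypothetical daily returns for two assets (rows are days, columns are assets).
returns = np.array([[0.01, 0.02],
                    [-0.02, 0.00],
                    [0.03, 0.01]])

# Column-wise statistics.
mean_returns = returns.mean(axis=0)
cov = np.cov(returns, rowvar=False)  # 2x2 sample covariance matrix

# Linear algebra: solve the system A @ x = b.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])
x = np.linalg.solve(A, b)

print(mean_returns)
print(cov)
print(x)
```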

pandas:

Working with table-like data in pandas
Essential operations with Series and DataFrame objects
Loading data from file into DataFrame objects
Summary statistics over DataFrame objects
Data aggregation queries (groupby() method)
Exploratory analysis of new datasets
Data visualisation over DataFrames
Join/merge operations with DataFrames
Working with text data in DataFrames
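
A short sketch combining two of the pandas topics above, a join/merge and text clean-up. The tables, tickers and company names are hypothetical.

```python
import pandas as pd

# Two hypothetical tables sharing a "ticker" key.
trades = pd.DataFrame({"ticker": ["ABC", "XYZ", "QQQ"], "qty": [100, 50, 10]})
names = pd.DataFrame({
    "ticker": ["ABC", "XYZ"],
    "name": ["  abc holdings ", "Xyz Corp"],
})

# Left join: keep every trade, even when no matching name exists.
merged = trades.merge(names, on="ticker", how="left")

# The .str accessor applies string methods to a whole column;
# missing values are passed through unchanged.
merged["name"] = merged["name"].str.strip().str.title()

# na=False makes rows without a name evaluate to False instead of missing.
merged["is_corp"] = merged["name"].str.contains("Corp", na=False)

print(merged)
```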

Databases:

Working with relational databases in Python
Overview of SQLAlchemy for database interaction
Integration of pandas and SQL
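
As a sketch of the pandas and SQL integration, the example below uses the standard library's `sqlite3` module with an in-memory database rather than SQLAlchemy, so that it runs without extra dependencies. The table and column names are hypothetical. With other database engines, a SQLAlchemy engine would typically be passed in place of the `sqlite3` connection.

```python
import sqlite3
import pandas as pd

# In-memory SQLite database, used here as a stand-in for a real server.
conn = sqlite3.connect(":memory:")

# Write a DataFrame to a SQL table.
positions = pd.DataFrame({
    "ticker": ["ABC", "XYZ", "ABC"],
    "qty": [100, 50, 25],
})
positions.to_sql("positions", conn, index=False)

# Run an aggregation in SQL and read the result back as a DataFrame.
totals = pd.read_sql_query(
    "SELECT ticker, SUM(qty) AS total_qty "
    "FROM positions GROUP BY ticker ORDER BY ticker",
    conn,
)
conn.close()

print(totals)
```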

Miscellanea:

Python packaging: using and creating custom libraries
Unit testing: tools to perform unit testing in Python
Interaction with web services
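
For the unit testing topic, here is a minimal sketch using the standard library's `unittest` module. The function `simple_return` is a hypothetical helper written for this example; other test tools could be used in the same way.

```python
import unittest


def simple_return(start_price, end_price):
    """Return the fractional change from start_price to end_price."""
    if start_price <= 0:
        raise ValueError("start_price must be positive")
    return (end_price - start_price) / start_price


class SimpleReturnTest(unittest.TestCase):
    def test_gain(self):
        # Floating-point results are compared approximately.
        self.assertAlmostEqual(simple_return(100.0, 110.0), 0.10)

    def test_rejects_non_positive_start(self):
        with self.assertRaises(ValueError):
            simple_return(0.0, 10.0)


# Run the tests programmatically, which also works inside a notebook.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(SimpleReturnTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```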

Price:	20,900 NOK
Included items:	Digital course documentation and hands-on labs; lunch and refreshments for in-class events only
Hours:	09.00-16.00
Duration:	3 days
Languages:	English