More resources

There are many useful Python resources on the internet, including documentation for the most common scientific packages and textbooks written for university-level courses that use Python. This page lists some of those resources.

Package Documentation

  • numpy - NumPy (Numerical Python) is the core Python package for multi-dimensional arrays of data. Almost every other data-aware package uses numpy internally.
  • scipy - SciPy (Scientific Python) is the core scientific package for Python. It contains a large number of core functions and low-level computational methods (FFTs, interpolation, integration), but more importantly it provides a standard interface that many other packages use. Some components are available in other packages with more options or newer algorithms, but scipy contains the standard methods.
  • pandas - Pandas is a wrapper around numpy arrays that provides a more human-oriented interface, making it easier to aggregate data, combine arrays, and fill in missing data. Pandas also provides some plotting functions that attempt to label the plot automatically, and input/output routines that format files for readability.
  • seaborn - Seaborn is a wrapper around the matplotlib library, providing a cleaner interface to many plotting routines and better management of the plot style. It is aimed at statistical visualization, and it can plot distributions and error characteristics. Seaborn doesn't contain much functionality for analyzing your data.
  • statsmodels - Statsmodels provides a large number of parametric statistical models that can be fitted to data. If you are familiar with the R language, statsmodels uses a similar interface for its fitting routines.
  • PyPI - The Python Package Index contains a list of most publicly available Python packages. If you need a Python package to perform some analysis, read in a particular file format, or write a web server, check PyPI for an existing package.
  • Anaconda Cloud - An alternative package index. PyPI works for most packages, but some code requires external libraries, and these are often easier to install from Anaconda. The most popular packages on PyPI will also exist on Anaconda.
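
As a rough illustration of how numpy and pandas divide the work described above, the sketch below builds a small numpy array with one missing reading, then uses pandas to fill the gap and aggregate by group. The data, the column names (site, value), and the choice of linear interpolation are made up for this example; they are assumptions, not something taken from the packages' documentation.

```python
import numpy as np
import pandas as pd

# Hypothetical measurements from two sites; one reading is missing (NaN).
values = np.array([1.0, 2.0, np.nan, 4.0, 5.0, 6.0])
sites = ["a", "a", "a", "b", "b", "b"]

# numpy: operate on the whole array at once; nanmean ignores the NaN.
overall_mean = np.nanmean(values)

# pandas: attach labels to the array, fill in the missing value,
# then aggregate per site.
df = pd.DataFrame({"site": sites, "value": values})
df["value"] = df["value"].interpolate()  # linear fill between neighbours
site_means = df.groupby("site")["value"].mean()

print(overall_mean)
print(site_means)
```

Whether interpolation is a sensible way to fill a gap depends on the data; pandas also offers fillna and dropna for other cases.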

Books

  • Python Data Science Handbook - A textbook describing the use of Python for "Data Science", which covers any activity that involves data. The current version of the book is available online for free, and includes chapters on numpy, pandas, matplotlib, and machine learning (regression, model fitting).
  • Computational Physics with Python - This textbook has been used in 4th year courses at the University of Toronto. It contains a brief introduction to Python (in chapters available online) and explores numerical algorithms for integration, linear and non-linear systems, Fourier transforms, PDEs, etc.
  • Think Stats - A textbook covering data exploration and probability using Python. If you have a dataset but you're not sure how to get the most information out of it, this book will help guide your data analysis.
  • Python for Data Analysis - The companion textbook for pandas. Not available online, but useful if you want to learn more about pandas.

Websites

(not specifically related to a book)