# free-data-science-books

**Repository Path**: peterT/free-data-science-books

## Basic Information

- **Project Name**: free-data-science-books
- **Description**: Free resources for learning data science
- **Primary Language**: Unknown
- **License**: Unlicense
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 1
- **Created**: 2025-09-24
- **Last Updated**: 2025-09-24

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

List of Data Science/Big Data Resources
======================

This list contains free learning resources for data science and big data concepts, techniques, and applications. Inspired by [Free Programming Books](https://github.com/vhf/free-programming-books).

Each entry notes the intended audience for the book (beginner, intermediate, or veteran). The rating is subjective, but it gives a rough idea of how difficult the book is.

### How To Contribute

- Fork the repository
- Edit the list and add your recommendations, tagged as beginner, intermediate, or veteran
- Send a pull request

### Index

* [Data Science Introduction](#data-science-introduction)
* [Data Processing](#data-processing)
* [Data Analysis](#data-analysis)
  * [Fundamentals](#fundamentals)
  * [Network Analysis](#network-analysis)
  * [Statistics](#statistics)
  * [Data Mining](#data-mining)
  * [Machine Learning](#machine-learning)
* [Data Science Application](#data-science-application)
  * [Data Visualization](#data-visualization)
* [Uncategorized](#uncategorized)
* [MOOCs about Data Science](#moocs-about-data-science)

### Data Science Introduction

* [Data Science: An Introduction](http://en.wikibooks.org/wiki/Data_Science:_An_Introduction) - Wikibook - `Beginner`
* [Disruptive Possibilities: How Big Data Changes Everything](http://www.amazon.com/Disruptive-Possibilities-Data-Changes-Everything-ebook/dp/B00CLH387W) - Jeffrey Needham - `Beginner`
* [Introduction to Data Science](http://jsresearch.net/) - Jeffrey Stanton - `Beginner`
* [Real-Time Big Data Analytics: Emerging Architecture](http://www.amazon.com/Real-Time-Big-Data-Analytics-Architecture-ebook/dp/B00DO33RSW) - Mike Barlow - `Beginner`
* [The Evolution of Data Products](http://www.amazon.com/The-Evolution-Data-Products-ebook/dp/B005QEKQUY/ref=sr_1_63?s=digital-text&ie=UTF8&qid=1351898530&sr=1-63) - Mike Loukides - `Beginner`
* [The Promise and Peril of Big Data](http://www.aspeninstitute.org/sites/default/files/content/docs/pubs/The_Promise_and_Peril_of_Big_Data.pdf) - David Bollier - `Beginner`

### Data Processing

* [Data-Intensive Text Processing with MapReduce](http://lintool.github.io/MapReduceAlgorithms/MapReduce-book-final.pdf) - Jimmy Lin and Chris Dyer - `Intermediate`

### Data Analysis

#### Fundamentals

* [Fundamental Numerical Methods and Data Analysis](http://ads.harvard.edu/books/1990fnmd.book/) - George W. Collins - `Beginner`
* [Introduction to Metadata](http://www.getty.edu/research/publications/electronic_publications/intrometadata/index.html) - Murtha Baca - `Beginner`
* [Introduction to R - Notes on R: A Programming Environment for Data Analysis and Graphics](http://cran.r-project.org/doc/manuals/R-intro.pdf) - W. N. Venables, D. M. Smith, and the R Core Team - `Beginner`
* [Modeling with Data: Tools and Techniques for Scientific Computing](http://modelingwithdata.org/about_the_book.html) - Ben Klemens - `Beginner`
* [R for Data Science: Import, Tidy, Transform, Visualize, and Model Data](http://r4ds.had.co.nz/) - Hadley Wickham & Garrett Grolemund - `Beginner`
* [Advanced R](http://adv-r.had.co.nz/) - Hadley Wickham - `Intermediate`

#### Network Analysis

* [Introduction to Social Network Methods](http://faculty.ucr.edu/~hanneman/nettext/) - Robert A. Hanneman and Mark Riddle - `Intermediate`
* [Networks, Crowds, and Markets: Reasoning About a Highly Connected World](http://www.cs.cornell.edu/home/kleinber/networks-book/) - David Easley and Jon Kleinberg - `Intermediate`
* [Network Science](http://barabasilab.neu.edu/networksciencebook/downlPDF.html) - Sarah Morrison - `Beginner`
* [The Wealth of Networks](http://www.benkler.org/Benkler_Wealth_Of_Networks.pdf) - Yochai Benkler - `Beginner`

#### Statistics

* [Advanced Data Analysis from an Elementary Point of View](http://www.stat.cmu.edu/~cshalizi/ADAfaEPoV/ADAfaEPoV.pdf) - Cosma Rohilla Shalizi - `Veteran`
* [An Introduction to R](http://cran.r-project.org/doc/manuals/R-intro.pdf) - W. N. Venables, D. M. Smith, and the R Core Team - `Beginner`
* [Analyzing Linguistic Data: a practical introduction to statistics](http://www.ualberta.ca/~baayen/publications/baayenCUPstats.pdf) - R. H. Baayen - `Beginner`
* [Applied Data Science](http://columbia-applied-data-science.github.io/appdatasci.pdf) - Ian Langmore and Daniel Krasner - `Intermediate`
* [Concepts and Applications of Inferential Statistics](http://vassarstats.net/textbook/) - Richard Lowry - `Beginner`
* [Forecasting: Principles and Practice](https://www.otexts.org/fpp/) - Rob J. Hyndman and George Athanasopoulos - `Intermediate`
* [Introduction to Probability](http://www.dartmouth.edu/~chance/teaching_aids/books_articles/probability_book/pdf.html) - Charles M. Grinstead and J. Laurie Snell - `Beginner`
* [Introduction to Statistical Thought](http://www.math.umass.edu/~lavine/Book/book.pdf) - Michael Lavine - `Beginner`
* [OpenIntro Statistics - Second Edition](http://www.openintro.org/stat/textbook.php) - David M. Diez, Christopher D. Barr, and Mine Çetinkaya-Rundel - `Beginner`
* [simpleR - Using R for Introductory Statistics](http://cran.r-project.org/doc/contrib/Verzani-SimpleR.pdf) - John Verzani - `Beginner`
* [Statistics](http://upload.wikimedia.org/wikipedia/commons/8/82/Statistics.pdf) - `Beginner`
* [Think Stats: Probability and Statistics for Programmers v2.0](http://greenteapress.com/thinkstats2/thinkstats2.pdf) - Allen B. Downey - `Beginner`
* [Computer Age Statistical Inference: Algorithms, Evidence and Data Science](https://web.stanford.edu/~hastie/CASI/) - Bradley Efron and Trevor Hastie - `Intermediate`

#### Data Mining

* [Data Mining and Analysis: Fundamental Concepts and Algorithms](https://repo.palkeo.com/algo/information-retrieval/Data%20mining%20and%20analysis.pdf) - Mohammed J. Zaki and Wagner Meira Jr. - `Intermediate`
* [Data Mining and Knowledge Discovery in Real Life Applications](http://www.intechopen.com/books/data_mining_and_knowledge_discovery_in_real_life_applications) - Julio Ponce and Adem Karahoca - `Beginner`
* [Data Mining for Social Network Data](http://link.springer.com/book/10.1007%2F978-1-4419-6287-4) - Springer - `Veteran`
* [Mining of Massive Datasets](http://infolab.stanford.edu/~ullman/mmds/book.pdf) - Anand Rajaraman, Jure Leskovec, and Jeffrey D. Ullman - `Intermediate`
* [Knowledge-Oriented Applications in Data Mining](http://www.intechopen.com/books/knowledge-oriented-applications-in-data-mining) - Kimito Funatsu - `Intermediate`
* [New Fundamental Technologies in Data Mining](http://www.intechopen.com/books/new-fundamental-technologies-in-data-mining) - Kimito Funatsu - `Intermediate`
* [R and Data Mining: Examples and Case Studies](http://cran.r-project.org/doc/contrib/Zhao_R_and_data_mining.pdf) - Yanchang Zhao - `Beginner`
* [The Elements of Statistical Learning](http://statweb.stanford.edu/~tibs/ElemStatLearn/) - Trevor Hastie, Robert Tibshirani, and Jerome Friedman - `Intermediate`
* [Theory and Applications for Advanced Text Mining](http://www.intechopen.com/books/theory-and-applications-for-advanced-text-mining) - Shigeaki Sakurai - `Intermediate`

#### Machine Learning

* [A Course in Machine Learning](http://ciml.info/) - Hal Daumé III - `Beginner`
* [A First Encounter with Machine Learning](https://www.ics.uci.edu/~welling/teaching/273ASpring10/IntroMLBook.pdf) - Max Welling - `Beginner`
* [Bayesian Reasoning and Machine Learning](http://web4.cs.ucl.ac.uk/staff/D.Barber/textbook/031013.pdf) - David Barber - `Veteran`
* [Gaussian Processes for Machine Learning](http://www.gaussianprocess.org/gpml/chapters/) - Carl Edward Rasmussen and Christopher K. I. Williams - `Veteran`
* [Introduction to Machine Learning](http://alex.smola.org/drafts/thebook.pdf) - Alex Smola and S.V.N. Vishwanathan - `Intermediate`
* [Probabilistic Programming & Bayesian Methods for Hackers](http://camdavidsonpilon.github.io/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/) - Cam Davidson-Pilon (main author) - `Intermediate`
* [The LION Way: Machine Learning plus Intelligent Optimization](http://www.lionsolver.com/LIONbook/) - Roberto Battiti and Mauro Brunato - `Intermediate`
* [Think Bayes](http://www.greenteapress.com/thinkbayes/) - Allen B. Downey - `Beginner`
* [Sklearn Basics](http://nbviewer.ipython.org/github/jakevdp/sklearn_scipy2013/tree/master/notebooks/) - `Beginner`
* [Deep Learning](http://www.deeplearningbook.org) - Ian Goodfellow, Yoshua Bengio and Aaron Courville - `Intermediate`

### Data Science Application

#### Information Retrieval

* [Introduction to Information Retrieval](http://nlp.stanford.edu/IR-book/) - Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze - `Intermediate`

#### Data Visualization

* [Interactive Data Visualization for the Web](http://chimera.labs.oreilly.com/books/1230000000345/index.html) - Scott Murray - `Beginner`
* [Plotting and Visualization in Python](http://nbviewer.ipython.org/urls/gist.github.com/fonnesbeck/5850463/raw/a29d9ffb863bfab09ff6c1fc853e1d5bf69fe3e4/3.+Plotting+and+Visualization.ipynb) - `Beginner`
* [ggplot2: Elegant Graphics for Data Analysis](https://github.com/hadley/ggplot2-book) - Hadley Wickham - `Beginner`

### Uncategorized

* [Data Journalism Handbook](http://datajournalismhandbook.org/1.0/en/) - Jonathan Gray, Liliana Bounegru, and Lucy Chambers - `Beginner`
* [Building Data Science Teams](http://assets.en.oreilly.com/1/eventseries/23/Building-Data-Science-Teams.pdf) - DJ Patil - `Beginner`
* [Information Theory, Inference, and Learning Algorithms](http://www.inference.phy.cam.ac.uk/itprnn/book.html) - David MacKay - `Intermediate`
* [Mathematics for Computer Science](http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-042j-mathematics-for-computer-science-fall-2010/readings/MIT6_042JF10_notes.pdf) - Eric Lehman, Thomas Leighton, and Albert R. Meyer - `Beginner`
* [The Field Guide to Data Science](http://www.boozallen.com/media/file/The-Field-Guide-to-Data-Science.pdf) - `Beginner`

### MOOCs about Data Science

* [Data Mining with Weka](http://www.cs.waikato.ac.nz/ml/weka/mooc/dataminingwithweka/) - Ian H. Witten - `Intermediate`
* [Mining Massive Datasets](https://class.coursera.org/mmds-002) - Jeff Ullman, Jure Leskovec, Anand Rajaraman (Coursera) - `Beginner`
* [Introduction to Data Science](https://class.coursera.org/datasci-001/class) - Bill Howe (Coursera) - `Beginner`
* [Introduction to Hadoop and MapReduce](https://www.udacity.com/course/ud617) - Udacity - `Beginner`
* [Machine Learning](https://class.coursera.org/ml-003/class) - Andrew Ng (Coursera) - `Beginner`
* [Machine Learning Video Library](http://work.caltech.edu/library/#!?goback=.gde_35222_member_5810981726511443971) - Yaser Abu-Mostafa - `Intermediate`
* [Natural Language Processing](https://class.coursera.org/nlp/lecture/preview) - Dan Jurafsky and Christopher Manning (Coursera) - `Intermediate`
* [Social and Economic Networks: Models and Analysis](https://class.coursera.org/networksonline-001/class) - Matthew O. Jackson (Coursera) - `Intermediate`
* [Social Network Analysis](https://class.coursera.org/sna-003/class) - Lada Adamic (Coursera) - `Intermediate`
* [Deep Learning](https://www.coursera.org/specializations/deep-learning) - Andrew Ng (Coursera) - `Intermediate`

## License

[![CC0](http://mirrors.creativecommons.org/presskit/buttons/88x31/svg/cc-zero.svg)](https://creativecommons.org/publicdomain/zero/1.0/)