Description
If you know how to program, you have the skills to turn data into knowledge using the tools of probability and statistics. This concise introduction shows you how to perform statistical analysis computationally, rather than mathematically, with programs written in Python. You'll work with a case study throughout the book to help you learn the entire data analysis process, from collecting data and generating statistics to identifying patterns and testing hypotheses. Along the way, you'll become familiar with distributions, the rules of probability, visualization, and many other tools and concepts.
* Develop your understanding of probability and statistics by writing and testing code
* Run experiments to test statistical behavior, such as generating samples from several distributions
* Use simulations to understand concepts that are hard to grasp mathematically
* Learn topics not usually covered in an introductory course, such as Bayesian estimation
* Import data from almost any source using Python, rather than being limited to data that has been cleaned and formatted for statistics tools
* Use statistical inference to answer questions about real-world data
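As a rough illustration of the computational approach described above (running an experiment instead of working through the mathematics), here is a minimal sketch that draws repeated samples from an exponential distribution and looks at how the sample means behave. It uses only the Python standard library and is not taken from the book; the function name `sample_means` and the chosen parameters are assumptions made for this example.

```python
import random
import statistics


def sample_means(draw, sample_size, trials, seed=0):
    """Return the means of `trials` samples, each made of `sample_size` draws.

    `draw` is a function that takes a random.Random instance and returns
    one random value. (Hypothetical helper, named for this example only.)
    """
    rng = random.Random(seed)  # fixed seed so the experiment is repeatable
    return [
        statistics.fmean([draw(rng) for _ in range(sample_size)])
        for _ in range(trials)
    ]


# Exponential distribution with rate 2.0, so the true mean is 1 / 2.0 = 0.5.
means = sample_means(lambda rng: rng.expovariate(2.0), sample_size=100, trials=1000)

# The sample means should cluster around the true mean, with a spread
# close to 0.5 / sqrt(100) = 0.05.
print("mean of sample means:", statistics.fmean(means))
print("spread of sample means:", statistics.stdev(means))
```

Swapping the `draw` function for another generator, such as `rng.gauss(0, 1)` or `rng.paretovariate(3)`, makes it possible to compare how sample means behave across several distributions.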
CONTENTS:
Preface; Why I Wrote This Book; How I Wrote This Book; Contributor List; Conventions Used in This Book; Using Code Examples; Safari(R) Books Online; How to Contact Us
Chapter 1: Statistical Thinking for Programmers; 1.1 Do First Babies Arrive Late?; 1.2 A Statistical Approach; 1.3 The National Survey of Family Growth; 1.4 Tables and Records; 1.5 Significance; 1.6 Glossary
Chapter 2: Descriptive Statistics; 2.1 Means and Averages; 2.2 Variance; 2.3 Distributions; 2.4 Representing Histograms; 2.5 Plotting Histograms; 2.6 Representing PMFs; 2.7 Plotting PMFs; 2.8 Outliers; 2.9 Other Visualizations; 2.10 Relative Risk; 2.11 Conditional Probability; 2.12 Reporting Results; 2.13 Glossary
Chapter 3: Cumulative Distribution Functions; 3.1 The Class Size Paradox; 3.2 The Limits of PMFs; 3.3 Percentiles; 3.4 Cumulative Distribution Functions; 3.5 Representing CDFs; 3.6 Back to the Survey Data; 3.7 Conditional Distributions; 3.8 Random Numbers; 3.9 Summary Statistics Revisited; 3.10 Glossary
Chapter 4: Continuous Distributions; 4.1 The Exponential Distribution; 4.2 The Pareto Distribution; 4.3 The Normal Distribution; 4.4 Normal Probability Plot; 4.5 The Lognormal Distribution; 4.6 Why Model?; 4.7 Generating Random Numbers; 4.8 Glossary
Chapter 5: Probability; 5.1 Rules of Probability; 5.2 Monty Hall; 5.3 Poincaré; 5.4 Another Rule of Probability; 5.5 Binomial Distribution; 5.6 Streaks and Hot Spots; 5.7 Bayes’s Theorem; 5.8 Glossary
Chapter 6: Operations on Distributions; 6.1 Skewness; 6.2 Random Variables; 6.3 PDFs; 6.4 Convolution; 6.5 Why Normal?; 6.6 Central Limit Theorem; 6.7 The Distribution Framework; 6.8 Glossary
Chapter 7: Hypothesis Testing; 7.1 Testing a Difference in Means; 7.2 Choosing a Threshold; 7.3 Defining the Effect; 7.4 Interpreting the Result; 7.5 Cross-Validation; 7.6 Reporting Bayesian Probabilities; 7.7 Chi-Square Test; 7.8 Efficient Resampling; 7.9 Power; 7.10 Glossary
Chapter 8: Estimation; 8.1 The Estimation Game; 8.2 Guess the Variance; 8.3 Understanding Errors; 8.4 Exponential Distributions; 8.5 Confidence Intervals; 8.6 Bayesian Estimation; 8.7 Implementing Bayesian Estimation; 8.8 Censored Data; 8.9 The Locomotive Problem; 8.10 Glossary
Chapter 9: Correlation; 9.1 Standard Scores; 9.2 Covariance; 9.3 Correlation; 9.4 Making Scatterplots in Pyplot; 9.5 Spearman’s Rank Correlation; 9.6 Least Squares Fit; 9.7 Goodness of Fit; 9.8 Correlation and Causation; 9.9 Glossary
Colophon
Published
29 Jul 2011
Publisher
O'REILLY & ASSOCIATES
ISBN
9781449307110
Pages
119



