
# Bayesian Philosophy and Methods in Data Science - 2023 entry

MODULE TITLE: Bayesian Philosophy and Methods in Data Science
CREDIT VALUE: 15
MODULE CODE: MTHM508
MODULE CONVENER: Prof Daniel Williamson (Coordinator)
DURATION: TERM 1 2 3
DURATION: WEEKS 11
NUMBER OF STUDENTS TAKING MODULE (anticipated): 28
DESCRIPTION - summary of the module content

Since the 1980s, computational advances and novel algorithms have seen Bayesian methods explode in popularity. Today they underpin modern techniques in data science and machine learning, with applications across science, social science, the humanities and finance.

This module will introduce Bayesian statistics and reasoning. It will develop the philosophical and mathematical ideas of subjective probability theory for decision-making and explore the place subjectivity has in scientific reasoning. It will develop Bayesian methods for data analysis and introduce modern Bayesian simulation, including Markov Chain Monte Carlo and Hamiltonian Monte Carlo. The course balances philosophy, theory, mathematical calculation and analysis of real data, ensuring the student is equipped to use Bayesian methods in future jobs aligned to data analysis and to undertake Bayesian research projects.

Pre-requisites: A basic introduction to probability and to classical statistics, plus experience of a programming language for data science such as R or Python. A preliminary online refresher course covering some basics in probability, integration and likelihood theory, supported by the module leader, is given alongside the first 2 weeks of the module to ensure students have the required knowledge to complete the course.

AIMS - intentions of the module

This module will cover the Bayesian approach to modelling, data analysis and statistical inference. The module describes the philosophies underpinning the Bayesian approach, looking at subjective probability theory, subjectivity in science, the notion and handling of prior knowledge, and the theory of decision-making under uncertainty. Bayesian modelling and inference are studied in depth, looking at parameter estimation and inference in simple models and then in hierarchical models. We explore simulation-based inference in Bayesian analyses and develop important algorithms for Bayesian simulation by Markov Chain Monte Carlo (MCMC), such as the Gibbs sampler, Metropolis-Hastings and Hamiltonian Monte Carlo. We introduce decision theory with Bayes as a route to personalised decision-making under uncertainty. The module aims to teach methods along with the mathematics to demonstrate why they work and the philosophy behind when, why and how they should be used. Unlike versions of this module with mathematics codes (MTH3041/MTHM047), the focus of the assessment is application, understanding and reasoning appropriate for data science students who have not completed a mathematics degree. It is not available to students on mathematics programmes (who may take the mathematics equivalent).

INTENDED LEARNING OUTCOMES (ILOs) (see assessment section below for how ILOs will be assessed)

Module Specific Skills and Knowledge:

1. Show understanding of the subjective approach to probabilistic reasoning.

2. Demonstrate an awareness of Bayesian approaches to statistical modelling and inference and an ability to apply them in practice.

3. Demonstrate understanding of the value of simulation-based inference and knowledge of techniques such as MCMC and the theories underpinning them.

4. Demonstrate the ability to apply statistical inference in decision-making.

5. Utilise appropriate software and a suitable computer language for Bayesian modelling and inference from data.

Discipline Specific Skills and Knowledge:
6. Demonstrate understanding, appreciation of and aptitude in the quantification of uncertainty using advanced mathematical modelling.

Personal and Key Transferable/Employment Skills and Knowledge:
7. Show Bayesian data analysis skills and be able to communicate associated reasoning and interpretations effectively in writing;
8. Apply relevant computer software competently;
9. Use learning resources appropriately;
10. Exemplify self-management and time-management skills.

SYLLABUS PLAN - summary of the structure and academic content of the module

Introduction: Bayesian vs Classical statistics, Nature of probability and uncertainty, Subjectivism.

Bayesian inference: Conjugate models, Prior and Posterior predictive distributions, Posterior summaries and simulation, Objective and subjective priors, Normal approximation, Bernstein-von Mises results, Bayesian hierarchical models, Bayesian regression and logistic regression.
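As an illustration of the conjugate models listed above, the following minimal Python sketch shows a Beta prior updated with binomial data. The `Beta` class, the `update` function and the prior and data values are hypothetical choices made for this example only; they are not taken from the module materials.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Beta:
    """Beta(a, b) distribution over a success probability (illustrative helper)."""
    a: float
    b: float

    @property
    def mean(self) -> float:
        # Mean of Beta(a, b); also the predictive probability that the
        # next Bernoulli trial is a success under this distribution.
        return self.a / (self.a + self.b)


def update(prior: Beta, successes: int, failures: int) -> Beta:
    """Conjugate update: a Beta prior with binomial data gives a Beta posterior."""
    return Beta(prior.a + successes, prior.b + failures)


# Assumed prior and data, chosen only for illustration.
prior = Beta(2.0, 2.0)
posterior = update(prior, successes=7, failures=3)

print("prior predictive P(success):    ", prior.mean)
print("posterior predictive P(success):", posterior.mean)
```

Because the Beta family is conjugate to the binomial likelihood, the posterior is available in closed form and no simulation is needed for this model.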

Bayesian Computation: Monte Carlo, Inverse CDF, Rejection Sampling, Importance Sampling, Markov Chain Monte Carlo (MCMC), The Gibbs sampler, Metropolis-Hastings, Hamiltonian Monte Carlo.
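To give a flavour of the MCMC methods listed above, here is a minimal sketch of a random-walk Metropolis-Hastings sampler in Python. The target (a standard normal), the step size, the burn-in length and all function names are assumptions made for illustration; the sketch is not the module's own implementation.

```python
import math
import random


def log_target(x: float) -> float:
    """Unnormalised log density of a standard normal (illustrative target)."""
    return -0.5 * x * x


def metropolis_hastings(log_p, x0, n_samples, step, seed=0):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    x, log_px = x0, log_p(x0)
    samples, accepted = [], 0
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        log_pp = log_p(proposal)
        # The proposal is symmetric, so the acceptance probability is
        # min(1, p(proposal) / p(x)), computed on the log scale.
        accept_prob = math.exp(min(0.0, log_pp - log_px))
        if rng.random() < accept_prob:
            x, log_px = proposal, log_pp
            accepted += 1
        samples.append(x)  # a rejected proposal repeats the current state
    return samples, accepted / n_samples


samples, rate = metropolis_hastings(log_target, x0=3.0, n_samples=20000, step=2.0)
kept = samples[2000:]  # discard an assumed burn-in period
mean = sum(kept) / len(kept)
var = sum((s - mean) ** 2 for s in kept) / len(kept)
print("acceptance rate:", rate)
print("sample mean:", mean, "sample variance:", var)
```

The sampler only needs the target density up to a normalising constant, which is why it suits posterior distributions whose normalising constant is intractable.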

Decision Theory: Bayes’ rule, Decision trees, Utility theory.

LEARNING AND TEACHING
LEARNING ACTIVITIES AND TEACHING METHODS (given in hours of study time)

| Scheduled Learning & Teaching Activities | Guided Independent Study |
| --- | --- |
| 33 | 117 |

DETAILS OF LEARNING ACTIVITIES AND TEACHING METHODS

| Category | Hours of study time | Description |
| --- | --- | --- |
| Scheduled learning and teaching activities | 33 | Lectures/practical classes |
| Guided independent study | 33 | Post-lecture study and reading |
| Guided independent study | 40 | Formative and summative coursework preparation, attempting un-assessed problems |
| Guided independent study | 44 | Exam revision/preparation |

ASSESSMENT
FORMATIVE ASSESSMENT - for feedback and development purposes; does not count towards module grade

| Form of Assessment | Size of Assessment (e.g. duration/length) | ILOs Assessed | Feedback Method |
| --- | --- | --- | --- |
| Practical and theoretical exercises | 11 hours (1 hour each week) | All | Verbal, in class and written on script |

SUMMATIVE ASSESSMENT (% of credit)

| Coursework | Written Exams |
| --- | --- |
| 50 | 50 |

DETAILS OF SUMMATIVE ASSESSMENT

| Form of Assessment | % of Credit | Size of Assessment (e.g. duration/length) | ILOs Assessed | Feedback Method |
| --- | --- | --- | --- | --- |
| Written exam – Restricted Note (1 A4 sheet (2 sides) of typed or handwritten notes) | 50 | 2 hours (Summer) | 1-7, 9, 10 | Verbal on specific request |
| Coursework - practical and theoretical exercises I | 25 | 15 hours | All | Written feedback on script and oral feedback in office hour |
| Coursework - practical and theoretical exercises II | 25 | 15 hours | All | Written feedback on script and oral feedback in office hour |

DETAILS OF RE-ASSESSMENT (where required by referral or deferral)

| Original Form of Assessment | Form of Re-assessment | ILOs Re-assessed | Time Scale for Re-assessment |
| --- | --- | --- | --- |
| Written exam* | Written exam (2 hours) | 1-7, 9, 10 | August Ref/Def period |
| Coursework 1* | Coursework 1 | All | August Ref/Def period |
| Coursework 2* | Coursework 2 | All | August Ref/Def period |

*Please refer to the re-assessment notes for details on deferral vs. referral reassessment.

RE-ASSESSMENT NOTES

Deferrals: Reassessment will be by coursework and/or written exam in the deferred element only. For deferred candidates, the module mark will be uncapped.

Referrals: Reassessment will be by a single written exam only, worth 100% of the module. As it is a referral, the mark will be capped at 50%.

RESOURCES
INDICATIVE LEARNING RESOURCES - The following list is offered as an indication of the type & level of information that you are expected to consult. Further guidance will be provided by the Module Convener.