Coding for Machine Learning and Data Science
Module title | Coding for Machine Learning and Data Science |
---|---|
Module code | HPDM139 |
Academic year | 2021/2 |
Credits | 15 |
Module staff | Dr Thomas Monks (Convenor) |
Duration: Term | 1 | 2 | 3 |
---|---|---|---|
Duration: Weeks | 10 |
Number students taking module (anticipated) | 25 |
Module description
Data science and machine learning are exciting, rapidly evolving disciplines that offer huge potential for the future of health care, medicine and wider areas of science. To keep up with the pace of change, a modern data scientist requires fundamental skills in coding. This module will:
• Boost your Python coding skills to a level where you are ready to undertake research and applied projects in data science and machine learning in health, medicine and general industry.
• Introduce you to the complexity of working with real-world data in a health and medicine context.
• Introduce key machine learning concepts in supervised learning including an introduction to deep learning.
• Teach you coding skills that are transferable outside of health and medicine.
Module aims - intentions of the module
This module is suitable for students from a wide range of quantitative backgrounds who have some existing computer coding experience but wish to take these skills to a higher level. It will provide students working in health, medicine and wider scientific fields with the fundamental coding skills to conduct modern data science and machine learning.
The module is organised in two halves. In the first half of the module you will take a hands-on approach to improving your existing Python skills, build a working knowledge of Python’s data science libraries (NumPy, Pandas and Matplotlib), develop skills in data wrangling and gain an appreciation of a reproducible workflow. In the second half of the module, you will develop skills in machine learning used in research and practice. You will focus on working with complex data and be introduced to key machine learning infrastructure in Python.
The module will be suitable for students with varying levels of existing coding skills. The content will boost the skills of those students who have had no formal training in computing (e.g. those who have learnt online in their own time). In addition, the module will reinforce the skills of students who have had formal training (e.g. in a computer science degree) and tailor them towards large, complex health data challenges.
Intended Learning Outcomes (ILOs)
ILO: Module-specific skills
On successfully completing the module you will be able to...
- 1. Demonstrate competence in the fundamentals of coding in the Python programming language and produce code to a standard suitable for cutting-edge research and industry applications.
- 2. Analyse and manipulate complex data sets in health and demonstrate competence in building statistical and computational models to work with them in Python.
ILO: Discipline-specific skills
On successfully completing the module you will be able to...
- 3. Apply a wide range of supervised machine learning algorithms to model outcomes in complex datasets.
- 4. Critically appraise data science problems and evaluate the tools that are needed to solve them.
ILO: Personal and key skills
On successfully completing the module you will be able to...
- 5. Use a wide range of Python tools, including modern data science tools, to conduct quantitative analyses.
- 6. Explain and demonstrate the steps to follow in a reproducible scientific workflow using modern data science tools.
- 7. Explain the importance of coding for high quality data science and machine learning research.
Syllabus plan
Whilst the module’s precise content may vary from year to year, an example of an overall structure is as follows:
- An introduction to Linux and the OpenStack
- The basics and advanced concepts of coding in standard Python
- An introduction to Jupyter notebooks for data science and machine learning
- Reproducible workflows in Python and an introduction to GitHub
- An introduction to NumPy, Pandas and Matplotlib
- Advanced data wrangling in Python, NumPy and Pandas
- An introduction to regression and classification in scikit-learn
- An introduction to deep learning in Python for supervised learning
Learning activities and teaching methods (given in hours of study time)
Scheduled Learning and Teaching Activities | Guided independent study | Placement / study abroad |
---|---|---|
35 | 115 |
Details of learning activities and teaching methods
Category | Hours of study time | Description |
---|---|---|
Scheduled Learning and Teaching | 10 | Lectures (10 x 1 hour) |
Scheduled Learning and Teaching | 20 | Workshops / tutorials (10 x 2 hours) |
Scheduled Learning and Teaching | 5 | Pre-recorded lectures on reproducible workflow (5 x 1 hour) |
Guided Independent Study | 115 | Background reading and preparation for module assessments |
Formative assessment
Form of assessment | Size of the assessment (e.g. length / duration) | ILOs assessed | Feedback method |
---|---|---|---|
Computer lab exercises | 20 hours | 1-7 | Written answers to exercises. Verbal |
Seminar discussion | 2 hours | 1-7 | Verbal |
Summative assessment (% of credit)
Coursework | Written exams | Practical exams |
---|---|---|
100 | 0 | 0 |
Details of summative assessment
Form of assessment | % of credit | Size of the assessment (e.g. length / duration) | ILOs assessed | Feedback method |
---|---|---|---|---|
Coding assignment 1 | 50 | 1000 words | 1, 4-7 | Written |
Coding assignment 2 | 50 | 1000 words | 2, 3-7 | Written |
Details of re-assessment (where required by referral or deferral)
Original form of assessment | Form of re-assessment | ILOs re-assessed | Timescale for re-assessment |
---|---|---|---|
Coding assignment 1 (50%), 1000 words | Coding assignment 1 | 1, 4-7 | Typically within six weeks of the assignment. |
Coding assignment 2 (50%), 1000 words | Coding assignment 2 | 2, 3-7 | Typically within six weeks of the assignment. |
Re-assessment notes
Please refer to the TQA section on Referral/Deferral: http://as.exeter.ac.uk/academic-policy-standards/tqa-manual/aph/consequenceoffailure/
Indicative learning resources - Basic reading
Basic reading:
• Lutz (2013). Learning Python. 5th Edition. O’Reilly.
• McKinney (2017). Python for data analysis. 2nd Edition. O’Reilly.
Advanced reading:
• James, Witten, Hastie, Tibshirani (2017). An introduction to statistical learning. 7th Edition. Springer.
• Géron (2020). Hands-on machine learning with Scikit-Learn, Keras and TensorFlow. 2nd Edition (updated for TensorFlow 2.0).
Indicative learning resources - Web based and electronic resources
• ELE – College to provide hyperlink to appropriate pages
Credit value | 15 |
---|---|
Module ECTS | 7.5 |
Module pre-requisites | None |
Module co-requisites | None |
NQF level (module) | 7 |
Available as distance learning? | No |
Origin date | 12/01/2021 |
Last revision date | 12/01/2021 |