Introduction to Declarative Computer Programming
Module title | Introduction to Declarative Computer Programming |
---|---|
Module code | BIO2104 |
Academic year | 2024/5 |
Credits | 15 |
Module staff | Professor Robert Beardmore (Lecturer) |
Duration: Term | 1 | 2 | 3 |
---|---|---|---|
Duration: Weeks | 11 |
Number students taking module (anticipated) | 30 |
---|
Module description
The analysis of data is an increasingly important part of work in the biological sciences, particularly in the many industries that exploit Artificial Intelligence. Spreadsheet software usually does not provide a viable solution for real-world datasets because the problem to be solved is either too large or too complex. In those situations, we need to use computer languages to generate, curate, clean, analyse and plot our data.
These steps form a data lifecycle for which many languages can be used, including R, Python, Matlab, Octave, SPSS, C++, Fortran, Pascal, Visual Basic, Java, and more besides.
Different people prefer different languages and the purpose of this module is to provide a starting point so you can find the one of most value to you. Indeed, different tasks are often best accomplished in different languages because each language has its own particular niche or characteristic. This course introduces Matlab and Python because (i) Matlab has extensive, professionally-supported ‘help’ facilities which are great for a new ‘coder’, and (ii) Python is rapidly becoming a global standard for handling data with extensive libraries that have applications in imaging, microscopy, genome sequencing and much more besides.
No coding prerequisites are assumed for this module, but the ability to think clearly and logically will be necessary, and an appreciation of mathematical notation will also be useful. Some simple illustrative and real biological problems will be presented and solved using algorithms coded in Matlab, with the emphasis placed on the simplest problems that provide a foundation for future work. This is not a course in imaging or bioinformatics, but a general introduction to common constructs used when coding with scientific data.
Module aims - intentions of the module
This module will first cover the very basics of coding in a declarative programming language, focusing on Matlab and explaining how to translate some of Matlab’s concepts into Python using Spyder. This will be done by explaining how to write concepts as ‘pseudocode’ and then how to translate that pseudocode into common programming constructs. The bulk of the module will focus on Matlab because of its extensive help and online community. Python will be presented only briefly; it is included because of its relevance to the growing AI industry and its similarity to Matlab. You will then be well-placed to learn Python through independent study online or as part of more advanced courses.
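As an illustration of the pseudocode-to-code step described above, the following is a minimal sketch in Python. The function name `mean_of` and the readings are invented for this example and are not taken from the module materials; the pseudocode appears as comments above the code that implements it.

```python
# Pseudocode (language-free):
#   set total to zero
#   for each value in the data: add the value to total
#   the mean is total divided by the number of values

def mean_of(values):
    """Translate the pseudocode above into a loop, line by line."""
    total = 0.0
    for value in values:
        total += value
    return total / len(values)

# Hypothetical optical density readings, made up for illustration only.
od_readings = [0.12, 0.15, 0.11, 0.14]
mean_reading = mean_of(od_readings)
print(f"mean reading = {mean_reading:.3f}")
```

The same three pseudocode steps could be translated into Matlab equally directly, which is the point of writing the algorithm down before choosing a language.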
Deeper concepts and algorithmic frameworks that build on coding fundamentals will be presented, with illustrative real-world applications as small case studies taken from:
1) Linear and nonlinear (e.g. quadratic) regression (applied to, for example, drug efflux data);
2) Iterative methods (like 1-d Newton’s Method for equation solving) and Picard iterations for 2-d (nonlinear simultaneous) equation solving, illustrating algorithm convergence and non-convergence, with an application to fractal art;
3) The Hungarian algorithm for optimal point matching with applications to protein structure calculations, some involving indel mutants;
4) Gaussian Mixture Modelling for the analysis of open clinical antibiotic data (based on EUCAST’s MIC data);
5) Detecting features of interest in an imported JPEG image, applied to genetically-labelled images;
6) Clustering in Matlab: screening a 4000-strain Escherichia coli gene knockout library to find strains with different antibiotic resistance characteristics;
7) A brief introduction to neural networks and how to train one using a real-world clinical dataset.
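To give a flavour of case study 2, here is a minimal sketch of 1-d Newton's Method in Python, showing one run that converges and one that does not. The function name `newton`, the tolerance, the iteration cap and the two test equations are assumptions chosen for illustration, not the module's own code.

```python
def newton(f, dfdx, x0, tol=1e-10, max_iter=50):
    """Iterate x <- x - f(x)/f'(x) from x0.

    Returns (x, converged, iterations). The iteration is declared
    converged when the size of the last step falls below tol.
    """
    x = x0
    for k in range(1, max_iter + 1):
        slope = dfdx(x)
        if slope == 0:
            # The tangent is horizontal, so the Newton step is undefined.
            return x, False, k
        step = f(x) / slope
        x = x - step
        if abs(step) < tol:
            return x, True, k
    return x, False, max_iter

# x^2 - 2 = 0 has a real root, so the iteration settles down.
root, ok, n_iter = newton(lambda x: x**2 - 2, lambda x: 2 * x, x0=1.0)

# x^2 + 1 = 0 has no real root, so the iterates wander until the cap is hit.
_, ok_no_root, _ = newton(lambda x: x**2 + 1, lambda x: 2 * x, x0=0.5)

print(root, ok, n_iter)
print(ok_no_root)
```

Capping the number of iterations is a design choice: without it, the second call would loop forever, which is the 'non-convergence' behaviour the case study refers to.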
Intended Learning Outcomes (ILOs)
ILO: Module-specific skills
On successfully completing the module you will be able to...
- 1. Develop pseudocode as a way of representing an algorithm independently of a computer language
- 2. Write and execute code in Matlab that uses variables, scripts and functions.
- 3. Debug code, trace variables, correct identified bugs and improve inefficient code in some cases
- 4. Import and export data (e.g. as CSV), plot data and save them as PDFs within scripts or functions
- 5. Recognise simple elements of Python and understand how to get online help to learn new Python functions
- 6. Document code and data
ILO: Discipline-specific skills
On successfully completing the module you will be able to...
- 7. Recognise some common data analysis problems and apply computational methods to solve them.
- 8. Code simple tasks in Python using Spyder, using the internet to seek answers to coding questions
ILO: Personal and key skills
On successfully completing the module you will be able to...
- 9. Understand the difference between good and bad practice when presenting scientific (or other) data
- 10. Understand why good documentation is needed when handling data
- 11. Create and follow logical pathways that solve a specified problem
- 12. Keep and document proper records of work
- 13. Produce publication-quality plots
- 14. Apply statistics associated with solving problems in biology
Syllabus plan
Whilst the module’s precise content may vary from year to year, an example of an overall structure is as follows:
1) Interpreted languages (Python & Matlab) versus C (or Lisp, Prolog, ML, Fortran, Assembly, etc) and their philosophical differences. Pseudocode as an informal way of specifying language-free algorithms;
2) Matlab variables: automatically detecting or specifying variable types (Booleans, single/double precision, integers, chars, strings), global variables and variable scope;
3) Pre-allocating variables for speed, and why this is necessary;
4) Controlling program flow with if ... then ... else statements (and, or, =, ==) and case or switch commands;
5) Looping over data with for and while loops until a criterion is satisfied, at which point an algorithm can be said to have ‘converged’ to a solution, or until it is otherwise terminated;
6) The difference between scripts, functions and nested functions;
7) Vectorisation: why vectorisation can outperform for loops in terms of execution speed;
8) Using a debugger and variable tracing to find errors;
9) Importing and exporting data in common formats such as XLS or CSV;
10) Plotting data, choosing colour schemes and styles to maximise the meaning of those data, exporting plots as PDF;
11) Measuring algorithm and code (in)efficiencies using tic and toc, and using the M-code profiler to reduce them;
12) Matlab script-to-HTML conversion;
13) Using new modules written by others that are not part of standard Matlab or the MathWorks toolboxes;
14) Comparing some simple Matlab code examples with their Python equivalents.
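As a sketch of the kind of comparison meant in item 14, the Python fragment below exercises the constructs from items 3 to 5 (pre-allocation, if/else branching, for and while loops), with the rough Matlab equivalent noted in a comment beside each line. The function name `classify_growth`, the thresholds and the readings are hypothetical and chosen only for illustration.

```python
def classify_growth(od):
    """Label a reading using hypothetical thresholds."""
    if od < 0.1:        # Matlab: if od < 0.1
        return "none"
    elif od < 0.5:      # Matlab: elseif od < 0.5
        return "weak"
    else:               # Matlab: else ... end
        return "strong"

readings = [0.05, 0.3, 0.9]            # made-up data

# Pre-allocate the output before filling it in the loop.
labels = [None] * len(readings)        # Matlab: labels = cell(1, numel(readings));
for i in range(len(readings)):         # Matlab: for i = 1:numel(readings)
    # Python indexes from 0 with [], Matlab indexes from 1 with () or {}.
    labels[i] = classify_growth(readings[i])

# A while loop repeats until its condition fails.
x = 1.0
halvings = 0
while x >= 0.1:                        # Matlab: while x >= 0.1 ... end
    x = x / 2
    halvings += 1

print(labels, halvings)
```

Note that Python marks blocks by indentation where Matlab closes each block with `end`, and that `=` assigns while `==` compares in both languages.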
Learning activities and teaching methods (given in hours of study time)
Scheduled Learning and Teaching Activities | Guided independent study | Placement / study abroad |
---|---|---|
30 | 120 | 0 |
Details of learning activities and teaching methods
Category | Hours of study time | Description |
---|---|---|
Scheduled learning & teaching activities | 20 | 20x 1h lectures and workshops |
Scheduled learning & teaching activities | 10 | 10x 1h computer-based practicals |
Guided independent study | 60 | Lecture/workshop consolidation and associated reading |
Guided independent study | 20 | Consolidation of feedback from computer-based practicals |
Guided independent study | 40 | Completion of coursework and preparation for practical exam |
Formative assessment
Form of assessment | Size of the assessment (eg length / duration) | ILOs assessed | Feedback method |
---|---|---|---|
Lecturer and/or GTA feedback during workshops and computer-based practicals | Ad hoc | 1-14 | Oral |
Summative assessment (% of credit)
Coursework | Written exams | Practical exams |
---|---|---|
50 | 0 | 50 |
Details of summative assessment
Form of assessment | % of credit | Size of the assessment (eg length / duration) | ILOs assessed | Feedback method |
---|---|---|---|---|
Documented data analysis pipeline | 50 | Approx. 7 pages | 1-14 | Written |
Practical computer-based exam | 50 | 1 hour | 1-14 | Written |
Details of re-assessment (where required by referral or deferral)
Original form of assessment | Form of re-assessment | ILOs re-assessed | Timescale for re-assessment |
---|---|---|---|
Documented data analysis pipeline (50%), Approx. 7 pages | Documented data analysis pipeline (50%) | 1-14 | August Ref/Def |
Practical computer-based exam (50%), 1 hour | Practical computer-based exam (50%) | 1-14 | August Ref/Def |
Re-assessment notes
Deferral – if you miss an assessment for certificated reasons that are approved by the Mitigation Committee, you will normally either be deferred in the assessment or be granted an extension. If deferred, the format and timing of the re-assessment for each of the summative assessments are detailed in the table above ('Details of re-assessment'). The mark given for a deferred assessment will not be capped and will be treated as it would be if it were your first attempt at the assessment.
Referral - if you have failed the module (i.e. a final overall module mark of less than 40%) and the module cannot be condoned, you will be required to complete a re-assessment for each of the failed components on the module. The format and timing of the re-assessment for each of the summative assessments is detailed in the table above ('Details of re-assessment'). If you pass the module following re-assessment, your module mark will be capped at 40%.
Indicative learning resources - Basic reading
• Books are not needed: all help-based resources are available within Matlab itself. There are also many online teaching resources accessible from the websites below.
Many other resources will be provided during the course, each appropriate to lecture content.
Indicative learning resources - Web based and electronic resources
- https://www.mathworks.com
- https://www.python.org
- https://www.mathworks.com/help/matlab/getting-started-with-matlab.html
Credit value | 15 |
---|---|
Module ECTS | 7.5 |
Module pre-requisites | BIO1333 Fundamental Principles for Bioscientists |
Module co-requisites | None |
NQF level (module) | 6 |
Available as distance learning? | No |
Origin date | 22/02/2023 |
Last revision date | 30/10/2023 |