Study information

Introduction to Declarative Computer Programming

Module title: Introduction to Declarative Computer Programming
Module code: BIO2104
Academic year: 2023/4
Credits: 15
Module staff

Professor Robert Beardmore (Lecturer)

Duration: Term 1, 2, 3
Duration: Weeks

11

Number of students taking module (anticipated)

30

Module description

The analysis of data is an increasingly important part of working in the biological sciences, particularly in the many industries that exploit Artificial Intelligence. Spreadsheet software usually does not provide a viable solution for real-world datasets because the problem to be solved is either too large or too complex. In those situations, we need to use computer languages to generate, curate, clean, analyse and plot our data.

These steps form a data lifecycle for which many languages can be used, including R, Python, Matlab, Octave, SPSS, C++, Fortran, Pascal, Visual Basic, Java, and more besides.

Different people prefer different languages and the purpose of this module is to provide a starting point so you can find the one of most value to you. Indeed, different tasks are often best accomplished in different languages because each language has its own particular niche or characteristic. This course introduces Matlab and Python because (i) Matlab has extensive, professionally-supported ‘help’ facilities which are great for a new ‘coder’, and (ii) Python is rapidly becoming a global standard for handling data with extensive libraries that have applications in imaging, microscopy, genome sequencing and much more besides.

No coding prerequisites are assumed for this module, but the ability to think clearly and logically will be necessary, and an appreciation of mathematical notation will also be useful. Some simple illustrative and real biological problems will be presented and solved using algorithms coded in Matlab, with the emphasis placed on the simplest problems that provide a foundation for future work. This is not a course in imaging or bioinformatics, but a general introduction to common constructs used when coding with scientific data.

Module aims - intentions of the module

This module will first cover the very basics of coding in a declarative programming language, focusing on Matlab and explaining how to translate some of Matlab’s concepts into Python using Spyder. This will be done by explaining how to write concepts as ‘pseudocode’ and then how to translate that pseudocode into common programming constructs. The bulk of the module will focus on Matlab because of its extensive help and online community. Python will be presented only briefly, and is included because of its relevance to the growing AI industry and its similarity to Matlab. You will then be well-placed to learn Python through independent study online or as part of more advanced courses.

Deeper concepts and algorithmic frameworks that build on coding fundamentals will be presented, with illustrative real-world applications as small case studies taken from:

1)    Linear and nonlinear (e.g. quadratic) regression (applied to, for example, drug efflux data);

2)    Iterative methods (such as the 1-d Newton’s Method for equation solving, and Picard iterations for 2-d nonlinear simultaneous equation solving), illustrating algorithm convergence and non-convergence, and fractal art;

3)    The Hungarian algorithm for optimal point matching with applications to protein structure calculations, some involving indel mutants;

4)    Gaussian mixture modelling for the analysis of open clinical antibiotic data (based on EUCAST’s MIC data);

5)    Detecting features of interest in an imported JPEG image, applied to genetically-labelled images;

6)    Clustering in Matlab: screening a 4000-strain Escherichia coli gene knockout library to find strains with different antibiotic resistance characteristics;

7)    A brief introduction to neural networks and how to train one using a real-world clinical dataset.
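
As an illustration of the iterative methods in case study 2, here is a minimal sketch in Python rather than Matlab. It is not taken from the module's own materials. It assumes the standard 1-d Newton update, x_new = x - f(x)/f'(x). The function name `newton`, its tolerance and iteration limit, and the two test functions are all illustrative choices.

```python
def newton(f, dfdx, x0, tol=1e-10, max_iter=50):
    """Hypothetical helper: 1-d Newton iteration.

    Returns (estimate, converged, iterations_used).
    """
    x = x0
    for k in range(1, max_iter + 1):
        slope = dfdx(x)
        if slope == 0:
            # The update is undefined when the derivative vanishes.
            return x, False, k
        x_new = x - f(x) / slope
        if abs(x_new - x) < tol:
            # Successive estimates agree to within tol: treat as converged.
            return x_new, True, k
        x = x_new
    # Iteration limit reached without meeting the stopping criterion.
    return x, False, max_iter


# Convergence: solve x^2 - 2 = 0 starting from x0 = 1.
root, converged, n_iter = newton(lambda x: x**2 - 2, lambda x: 2 * x, x0=1.0)

# Non-convergence: for x^3 - 2x + 2 started at x0 = 0 the iterates
# alternate between 0 and 1 and never settle.
cyc_x, cyc_converged, cyc_iter = newton(
    lambda x: x**3 - 2 * x + 2, lambda x: 3 * x**2 - 2, x0=0.0
)

print(root, converged, n_iter)
print(cyc_x, cyc_converged, cyc_iter)
```

The first call should settle close to the square root of 2 within a handful of iterations, whereas the second alternates between 0 and 1 and so reaches the iteration limit without converging.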

Intended Learning Outcomes (ILOs)

ILO: Module-specific skills

On successfully completing the module you will be able to...

  • 1. Develop pseudocode as a way of representing an algorithm independently of a computer language
  • 2. Write and execute code in Matlab that uses variables, scripts and functions.
  • 3. Debug code, trace variables, correct identified bugs and improve inefficient code in some cases
  • 4. Import and export data (e.g. as CSV), plot data and save them as PDFs within scripts or functions
  • 5. Recognise simple elements of Python and understand how to get online help to learn new Python functions
  • 6. Document code and data

ILO: Discipline-specific skills

On successfully completing the module you will be able to...

  • 7. Recognise some common data analysis problems and apply computational methods to solve them.
  • 8. Code simple tasks in Python using Spyder, using the internet to seek answers to coding questions

ILO: Personal and key skills

On successfully completing the module you will be able to...

  • 9. Understand the difference between good and bad practice when presenting scientific (or other) data
  • 10. Understand why good documentation is needed when handling data
  • 11. Create and follow logical pathways that solve a specified problem
  • 12. Keep and document proper records of work
  • 13. Produce publication-quality plots
  • 14. Use statistics associated with solving problems in biology

Syllabus plan

Whilst the module’s precise content may vary from year to year, an example of an overall structure is as follows:

1)    Interpreted languages (Python & Matlab) versus C (or Lisp, Prolog, ML, Fortran, Assembly, etc) and their philosophical differences. Pseudocode as an informal way of specifying language-free algorithms;

2)    Matlab variables: automatically detecting or specifying variable types (Booleans, single/double precision, integers, chars, strings, globals and variable scope);

3)    Pre-allocating variables for speed, and why this is necessary;

4)    Controlling program flow with if .. then .. else … statements (and, or, =, ==) and case or switch commands;

5)    Looping over data with for and while loops until a criterion is satisfied, so that an algorithm can be said to have ‘converged’ to a solution, or until it is otherwise terminated;

6)    The difference between scripts, functions and nested functions;

7)    Vectorisation: why vectorisation can outperform for loops in terms of execution speed;

8)    Using a debugger and variable tracing to find errors;

9)    Importing and exporting data in common formats such as XLS or CSV;

10)  Plotting data, choosing colour schemes and styles to maximise the meaning of those data, exporting plots as PDF;

11)  Measuring algorithm and code (in)efficiencies using tic and toc, and using the M-code profiler to improve them;

12)  Matlab script-to-HTML conversion;

13)  Using third-party modules, written by others, that are not part of standard Matlab or the MathWorks toolboxes;

14)  Comparing some simple Matlab code examples with their Python equivalents.
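
As an illustration of how items 3, 5, 11 and 14 fit together, here is a minimal Python sketch that pre-allocates a list, fills it in a for loop and times the work. It is not taken from the module's own materials. The variable names and the problem size are illustrative assumptions, and the comments give approximate Matlab equivalents only.

```python
import time

n = 100_000  # illustrative problem size

# Matlab: tic
start = time.perf_counter()

# Matlab: squares = zeros(1, n);   (pre-allocate before the loop)
squares = [0] * n

# Matlab: for k = 1:n, squares(k) = k^2; end
# Python indices start at 0, whereas Matlab indices start at 1.
for k in range(n):
    squares[k] = (k + 1) ** 2

# Matlab: elapsed = toc
elapsed = time.perf_counter() - start

# Matlab (vectorised, no explicit loop): total = sum((1:n).^2)
total = sum(squares)

print(f"sum of squares = {total}, loop took {elapsed:.4f} s")
```

The total can be checked against the closed form n(n+1)(2n+1)/6, which is a quick way to confirm that the loop bounds and the index shift are right.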

Learning activities and teaching methods (given in hours of study time)

Scheduled Learning and Teaching Activities | Guided independent study | Placement / study abroad
30 | 120 | 0

Details of learning activities and teaching methods

Category | Hours of study time | Description
Scheduled learning & teaching activities | 20 | 20x 1h lectures and workshops
Scheduled learning & teaching activities | 10 | 10x 1h computer-based practicals
Guided independent study | 60 | Lecture/workshop consolidation and associated reading
Guided independent study | 20 | Consolidation of feedback from computer-based practicals
Guided independent study | 40 | Completion of coursework and preparation for practical exam

Formative assessment

Form of assessment | Size of the assessment (eg length / duration) | ILOs assessed | Feedback method
Lecturer and/or GTA feedback during workshops and computer-based practicals | Ad hoc | 1-14 | Oral

Summative assessment (% of credit)

Coursework | Written exams | Practical exams
50 | 0 | 50

Details of summative assessment

Form of assessment | % of credit | Size of the assessment (eg length / duration) | ILOs assessed | Feedback method
Documented data analysis pipeline | 50 | Approx. 7 pages | 1-14 | Written
Practical computer-based exam | 50 | 1 hour | 1-14 | Written

Details of re-assessment (where required by referral or deferral)

Original form of assessment | Form of re-assessment | ILOs re-assessed | Timescale for re-assessment
Documented data analysis pipeline (50%), Approx. 7 pages | Documented data analysis pipeline (50%) | 1-14 | August Ref/Def
Practical computer-based exam (50%), 1 hour | Practical computer-based exam (50%) | 1-14 | August Ref/Def

Re-assessment notes

Deferral – if you miss an assessment for certificated reasons that are approved by the Mitigation Committee, you will normally either be deferred in the assessment or be granted an extension. If deferred, the format and timing of the re-assessment for each of the summative assessments is detailed in the table above ('Details of re-assessment'). The mark given for a deferred assessment will not be capped and will be treated as it would be if it were your first attempt at the assessment.

Referral – if you have failed the module (i.e. a final overall module mark of less than 40%) and the module cannot be condoned, you will be required to complete a re-assessment for each of the failed components of the module. The format and timing of the re-assessment for each of the summative assessments is detailed in the table above ('Details of re-assessment'). If you pass the module following re-assessment, your module mark will be capped at 40%.

Indicative learning resources - Basic reading

• Books are not needed: all help-based resources are available within Matlab itself. There are also many online teaching resources accessible from the websites below.

Many other resources will be provided during the course, each appropriate to lecture content.

Indicative learning resources - Web based and electronic resources

Key words search

Programming, Matlab, Python, Algorithms, Artificial Intelligence, Machine Learning

Credit value: 15
Module ECTS

7.5

Module pre-requisites

BIO1333 Fundamental Principles for Bioscientists

Module co-requisites

None

NQF level (module)

6

Available as distance learning?

No

Origin date

22/02/2023

Last revision date

30/10/2023