Text Mining and Natural Language Processing - 2025 entry

MODULE TITLE: Text Mining and Natural Language Processing
CREDIT VALUE: 15
MODULE CODE: COMM040
MODULE CONVENER: Unknown
DURATION: TERM 1 2 3
DURATION: WEEKS 10
Number of Students Taking Module (anticipated): 40
DESCRIPTION - summary of the module content

Text mining is the process of extracting insight from large collections of written documents. Recent progress in how computers process human language means that reviews, tweets, archives of legal documents, recipes and many other kinds of text can now be analysed effectively. This module teaches you how to search, group, summarise and understand large corpora of documents. It covers methods such as topic modelling, sentiment analysis, machine translation and the use of large language models to solve real-world problems. You should have taken, or be taking, a module on the basics of machine learning.

AIMS - intentions of the module

Students will understand and apply modern NLP methods to real-world textual datasets. The focus is on methods for generating insight from large collections of text, from practical first steps such as data cleaning and validation through to topic modelling and text summarisation using current techniques.

INTENDED LEARNING OUTCOMES (ILOs) (see assessment section below for how ILOs will be assessed)

On successful completion of this module you should be able to:

Module Specific Skills and Knowledge:

  1. Apply NLP methods to realistic data sets, demonstrating the ability to clean, transform and analyse the data and to interpret the output of NLP algorithms.

  2. Understand the strengths, weaknesses and biases of different NLP methods.

Discipline Specific Skills and Knowledge:

  1. Clean, parse, transform and process large collections of text documents.

  2. Know about recent advances in NLP methods, their impact and their real-world applications.

Personal and Key Transferable/ Employment Skills and Knowledge:

  1. Use modern NLP techniques to derive insight from unstructured data.

  2. Communicate the pros, cons and trade-offs of different approaches to text analysis.

SYLLABUS PLAN - summary of the structure and academic content of the module
  • Regular expressions, tokenisation, n-gram models

  • Embeddings

  • Text classification and topic modelling

  • Sentiment analysis

  • Using BERT, GPT and large language models

  • Chatbots and RAG systems

  • Machine translation

  • Advanced topics and applications

LEARNING AND TEACHING
LEARNING ACTIVITIES AND TEACHING METHODS (given in hours of study time)
Scheduled Learning & Teaching Activities: 36
Guided Independent Study: 114
Placement / Study Abroad: 0
DETAILS OF LEARNING ACTIVITIES AND TEACHING METHODS
Category | Hours of study time | Description
Scheduled Learning and Teaching | 24 | Lectures
Scheduled Learning and Teaching | 12 | Workshops
Guided Independent Study | 114 | Reading, workshop preparation

 

ASSESSMENT
FORMATIVE ASSESSMENT - for feedback and development purposes; does not count towards module grade
Form of Assessment | Size of the assessment (e.g. duration/length) | ILOs assessed | Feedback method
Workshops | 12 | All | In person, verbal discussion with TA/Module lead

 

SUMMATIVE ASSESSMENT (% of credit)
Coursework: 100
Written Exams: 0
Practical Exams: 0
DETAILS OF SUMMATIVE ASSESSMENT
Form of Assessment | % of credit | Size of the assessment (e.g. duration/length) | ILOs assessed | Feedback method
Coursework - presentation | 30 | 15-minute presentation | All | Written
Coursework - mini project | 70 | Approx. 3000-word report | All | Written

 

DETAILS OF RE-ASSESSMENT (where required by referral or deferral)
Original form of assessment | Form of re-assessment | ILOs re-assessed | Time scale for re-assessment
Mini project | Mini project + approx. 3000-word report | All | Referral/deferral period
Presentation | Mini project + approx. 3000-word report | All | Referral/deferral period

 

RE-ASSESSMENT NOTES
Deferral – if you miss an assessment for certificated reasons judged acceptable by the Mitigation Committee, you will normally be either deferred in the assessment or an extension may be granted. The mark given for a re-assessment taken as a result of deferral will not be capped and will be treated as it would be if it were your first attempt at the assessment.
 
Referral – if you have failed the module overall (i.e. a final overall module mark of less than 50%) you will be required to submit a further assessment as necessary. If you are successful on referral, your overall module mark will be capped at 50%.
RESOURCES
INDICATIVE LEARNING RESOURCES - The following list is offered as an indication of the type and level of information that you are expected to consult. Further guidance will be provided by the Module Convener.

Basic reading:

  • Jurafsky and Martin, 2024, Speech and Language Processing, 3rd ed.
  • Alammar and Grootendorst, 2024, Hands-On Large Language Models, O’Reilly
  • Singh, 2023, Natural Language Processing in the Real World, Routledge

Web-based and electronic resources:

  • ELE

CREDIT VALUE: 15
ECTS VALUE: 7.5
PRE-REQUISITE MODULES: None
CO-REQUISITE MODULES: None
NQF LEVEL (FHEQ): 7
AVAILABLE AS DISTANCE LEARNING: No
ORIGIN DATE: Thursday 28th November 2024
LAST REVISION DATE: Wednesday 21st May 2025
KEY WORDS SEARCH: Text Mining; Natural Language Processing

Please note that all modules are subject to change. Please get in touch if you have any questions about this module.