Introduction to data science
Module title | Introduction to data science |
---|---|
Module code | GEO1419 |
Academic year | 2021/2 |
Credits | 15 |
Module staff | Dr Jo Browse (Convenor) |
Duration: Term | 1 | 2 | 3 |
---|---|---|---|
Duration: Weeks | 11 |
Number students taking module (anticipated) | 100 |
---|
Module description
This module will give you practical insights into how scientists address fundamental questions and hypotheses using data. We start with simple toolkits to describe data and move on to more advanced ways of comparing data and describing data trends. Once you finish the module you will be competent in managing and handling data using the statistical programming language R, and you will know how to critique different methods commonly used in scientific data analysis.
This module uses a combination of lectures, group discussions, supervised practical classes, online (ELE) teaching resources, and help sessions to provide you with the support necessary for achieving the learning objectives. Weekly lectures provide a synoptic overview of the techniques covered in each week’s practical class, while group discussions will evaluate and critique the use of these techniques in published scientific studies. Each practical class is led by lecturing staff and support staff. Staff are also accessible through an online (ELE) discussion forum to answer your queries. The emphasis is placed upon learning how to apply statistical techniques to answer research questions in geography, environmental science and marine science using numerical data of various forms. As such, data from a range of environmental applications are provided in the practical classes for analysis. Weekly summative assessment, completed online, provides you with the opportunity to evaluate your progress, since these tests cover the same techniques (with different data) as those learned in the lectures and practical sessions. Practical classes also focus on developing essential IT skills, with R used for data manipulation and analysis, allowing you to learn key transferable skills in data science.
Module aims - intentions of the module
This module aims to introduce you to the use of data-centred quantitative analysis techniques in research. The module will establish the purpose and scope of statistical analysis methods, focusing on analytical tests and their execution. We follow the ‘scientific method’ through from first principles (hypothesis development, distribution testing) to hypothesis testing. We ask you to think about the underlying principles of data collection, sampling and hypothesis-driven research. We use computers to assist us in the aggregation, analysis and presentation of data.
Through lectures, assisted practical classes, and group discussion you will be encouraged to evaluate and critique statistical methods as one of a suite of analytical techniques available to researchers. Assisted practical classes complement the lecture series and will provide you with key transferable skills in data handling which will increase your future employability. You will undertake an independent research project during which you will quantitatively explore an unseen dataset using the skills acquired during the module. These skills are relevant for a range of different careers, from environmental management and assessment through to energy policy.
Intended Learning Outcomes (ILOs)
ILO: Module-specific skills
On successfully completing the module you will be able to...
- 1. Describe and critique a range of approaches to collecting data
- 2. Critique poor statistical or data collection techniques across geographical and environmental fields
- 3. Calculate and understand the use of basic descriptive statistics including the mean, median, mode, standard deviation and coefficient of variation
- 4. Discuss the limitations associated with different descriptive statistics in your own and others' work
- 5. Apply appropriate techniques to determine whether data are normally distributed and explain the role of Gaussian distributions in statistical approaches
- 6. Explain the difference between parametric and non-parametric tests
- 7. Choose the correct statistical test for different data distributions
- 8. Use the statistical programming language R to apply appropriate statistical tests in order to answer research questions
- 9. Understand statistical significance and interpret p-values
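As an illustration of outcomes 3, 5, 6, 7 and 9, the following is a minimal R sketch and not part of the module materials. The data values, the variable names and the helper `stat_mode` are invented for illustration, and using a normality pre-test to pick between a t-test and a Mann-Whitney test is a simplified assumption rather than the module's prescribed workflow.

```r
# Hypothetical example data: two small samples of river pH readings
# (values invented for illustration; not module data).
site_a <- c(6.8, 7.1, 7.0, 6.9, 7.3, 7.2, 6.7, 7.0)
site_b <- c(7.4, 7.6, 7.2, 7.5, 7.8, 7.3, 7.7, 7.5)

# Descriptive statistics (outcome 3).
mean_a   <- mean(site_a)
median_a <- median(site_a)
sd_a     <- sd(site_a)            # sample standard deviation (n - 1 denominator)
cv_a     <- sd_a / mean_a * 100   # coefficient of variation, as a percentage

# Base R has no built-in function for the statistical mode, so this
# hypothetical helper tabulates the values and returns the most frequent
# one (the smallest such value if several are tied).
stat_mode <- function(x) {
  counts <- table(x)
  as.numeric(names(counts)[which.max(counts)])
}
mode_a <- stat_mode(site_a)

# Normality check (outcome 5): Shapiro-Wilk test. A small p-value suggests
# the sample departs from a normal distribution.
p_norm_a <- shapiro.test(site_a)$p.value
p_norm_b <- shapiro.test(site_b)$p.value

# Choose a parametric or non-parametric test (outcomes 6 and 7), then
# read off the p-value (outcome 9).
alpha <- 0.05
if (p_norm_a > alpha && p_norm_b > alpha) {
  result <- t.test(site_a, site_b)                      # Welch two-sample t-test
} else {
  result <- wilcox.test(site_a, site_b, exact = FALSE)  # Mann-Whitney U test
}
print(result$p.value)
```

A p-value below the chosen significance level (here 0.05) would be read as evidence against the null hypothesis that the two sites have the same typical pH.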
ILO: Discipline-specific skills
On successfully completing the module you will be able to...
- 10. Describe essential facts and theory across data management and analysis in geography and the natural sciences
- 11. Identify critical questions from the literature and synthesise research-informed examples into written work
- 12. Identify and implement, with some guidance, appropriate methodologies and theories for addressing a specific research problem in geography and the natural sciences
- 13. With guidance, deploy established techniques of data science including collection, analysis and management within geography and the natural sciences
- 14. Describe and begin to evaluate approaches to the development of research questions in geography and the natural sciences with reference to primary literature, reviews and research articles
ILO: Personal and key skills
On successfully completing the module you will be able to...
- 15. Develop, with guidance, a logical and reasoned argument with sound conclusions
- 16. Communicate ideas, principles and theories using a variety of formats in a manner appropriate to the intended audience
- 17. Collect and interpret appropriate data and undertake straightforward research tasks with guidance
- 18. Evaluate own strengths and weaknesses in relation to professional and practical skills identified by others
- 19. Reflect on learning experiences and summarise personal achievements
Syllabus plan
There will be several key themes covered in this module as follows:
- Lecture: Introduction to module and overview of subject
- Lecture: Introduction to research design; descriptive statistics; central tendency and dispersion
- Practical: Calculate and display descriptive statistics showing key data attributes (e.g. in Excel)
- Lecture: Theoretical frequency distributions
- Practical: Exploring frequency distributions using computer software
- Lecture: Parametric inferential statistics
- Practical: Parametric hypothesis testing
- Lecture: Counts and frequencies, non-parametric techniques
- Practical: Non-parametric hypothesis testing
- Lecture: Correlation analysis
- Practical: Exploring correlation
- Lecture: Linear regression
- Practical: Modelling data trends
- Lecture: Transforming data and alternative distributions
Learning activities and teaching methods (given in hours of study time)
Scheduled Learning and Teaching Activities | Guided independent study | Placement / study abroad |
---|---|---|
30 | 120 | 0 |
Details of learning activities and teaching methods
Category | Hours of study time | Description |
---|---|---|
Scheduled Learning and Teaching | 10 | Group discussion |
Scheduled Learning and Teaching | 20 | Practicals |
Guided Independent Study | 120 | Additional research, reading and preparation for module assessments and group discussions |
Formative assessment
Form of assessment | Size of the assessment (eg length / duration) | ILOs assessed | Feedback method |
---|---|---|---|
Short answer questions during lectures and practical sessions | Ongoing throughout the module | 1-16, 18-19 | Oral |
Summative assessment (% of credit)
Coursework | Written exams | Practical exams |
---|---|---|
70 | 0 | 30 |
Details of summative assessment
Form of assessment | % of credit | Size of the assessment (eg length / duration) | ILOs assessed | Feedback method |
---|---|---|---|---|
Weekly tests | 30 | Not applicable | 1-17 | Model answers |
Statistics project | 70 | 1000 words | 1-17 | Written |
Details of re-assessment (where required by referral or deferral)
Original form of assessment | Form of re-assessment | ILOs re-assessed | Timescale for re-assessment |
---|---|---|---|
Weekly tests | Not applicable | Not applicable | Not applicable |
Statistics project | Statistics project | 1-17 | August Assessment Period |
Re-assessment notes
Deferral – if you miss an assessment for certificated reasons judged acceptable by the Mitigation Committee, you will normally either be deferred in the assessment or be granted an extension. The weekly tests are not deferrable because of their cumulative and practical nature. The mark given for a re-assessment taken as a result of deferral will not be capped and will be treated as it would be if it were your first attempt at the assessment.
Referral – if you have failed the module overall (i.e. a final overall module mark of less than 40%) you will be required to complete a further statistics project. The mark given for a re-assessment taken as a result of referral will count for 100% of the final mark and will be capped at 40%.
Indicative learning resources - Basic reading
- Grolemund, Garrett, Hands-on programming with R, First edition, Sebastopol, Calif.: O'Reilly, 2014.
- Matloff, Norman S., The art of R programming: a tour of statistical software design, San Francisco: No Starch Press, 2011.
- Rogerson, Peter A., Statistical methods for geography, London: SAGE, 2001.
Credit value | 15 |
---|---|
Module ECTS | 7.5 |
Module pre-requisites | None |
Module co-requisites | None |
NQF level (module) | 4 |
Available as distance learning? | No |
Origin date | 03/02/2021 |
Last revision date | 03/02/2021 |