
CM50267: Software technologies for data science

[Page last updated: 10 August 2020]

Academic Year: 2020/1
Owning Department/School: Department of Computer Science
Credits: 12 [equivalent to 24 CATS credits]
Notional Study Hours: 240
Level: Masters UG & PG (FHEQ level 7)
Period: Semester 1
Assessment Summary: CW100
Supplementary Assessment: Like-for-like reassessment (where allowed by programme regulations)
Requisites:
Description:
Aims:
To teach numeric programming in a relevant language (for example, Python), how to undertake lower-level data science using that language (and its associated libraries), and how to scale up and apply higher-level software to "Big Data".

Learning Outcomes:
After completion of the unit, students should be able to:
1. critically evaluate the features of various programming languages and software packages for data science,
2. explain, relate and accommodate factors affecting complexity, performance, numerics, scalability and deliverability of solutions,
3. implement low-level data science functionality using a relevant programming language (e.g. Python),
4. apply a range of complex analytic methodologies, notably machine learning techniques, using relevant software libraries,
5. assess the applicability and relevance of key "Big Data" software technologies in varied scenarios.

Intellectual skills:
* Algorithmic thinking for data modelling (T,F,A)

Practical skills:
* Programming in a relevant language (e.g. Python) and use of associated numeric/scientific libraries (T,F,A)
* Evaluation of scalable software for data applications (T,F,A)

Transferable skills:
* Numerical programming (T,F,A)

The first part will introduce a relevant programming language for data science (e.g. Python): general computing, use of essential libraries for data science (e.g. NumPy, SciPy, Matplotlib and scikit-learn in the context of Python), and the underlying numerical and performance factors.
The second part will cover the use of data structures, database systems, and software technologies for scalability, from the viewpoint of both storage and computation.
Programme availability:

CM50267 is Compulsory on the following programmes:

Department of Computer Science