2 Analytical Platform
2.1 Introduction
For an overview of the Analytical Platform (AP), watch this 2-3 minute introductory video. Please be aware that it is a few years old and much has changed since then. For more depth, please see:
- The Introduction to the Analytical Platform (link to recent recording), given as part of the Analysis Directorate Induction Academy.
- The Introduction to using R on the Analytical Platform (see the internal R training section for more details).
- The Analytical Platform User Guidance.
2.2 Summary of key terms
It will help you to be familiar with the following key terms:
- Analytical Platform (AP): A data analysis environment providing modern tools and key datasets for MoJ analysts. AP contains training documents, resources, and access to various analytical software such as RStudio and Jupyter.
- Control Panel: The place from which you navigate to RStudio, Jupyter, S3 buckets, etc.
- RStudio: Development environment for writing R code and R Shiny apps
- JupyterLab: Development environment for writing Python code including Python notebooks
- Git: Version control software that enables multiple people to make separate changes at the same time.
- GitHub: A web-based interface that uses Git and on which you publish and share your version-controlled code. You use Git locally (e.g. using RStudio) to track versions of your code, and then submit those changes to GitHub.
- GitHub Repositories (Repos): Broadly similar to setting up a project folder on a DOM1 shared drive to save work and share with others. Files in a GitHub repo represent the definitive version of the project. Everyone who works on the project makes contributions from their own personal copy.
- Amazon S3: A web-based cloud storage platform for storing data. Access to Amazon S3 buckets can be managed.
- Slack: Collaboration tool where you can get technical support for Analytical Platform tools such as R, Python and Git. You can share knowledge, submit admin requests and communicate quickly with other AP users.
2.3 Getting set up
Follow the steps in the Analytical Platform User Guidance Quickstart Guide.
2.4 Managing data
Once you are set up on the Analytical Platform, read about the following data management and handling topics in the Analytical Platform User Guidance:
- How data are held on the Analytical Platform and how to find the data you need, including the three data storage options (Amazon S3, curated databases and home directories).
- Working with Amazon S3, data FAQ, the Data Uploader tool and interacting with Amazon S3 via the Analytical Platform.
- Information governance procedures to be followed.
- Data retention policies, including the circumstances in which deleting data means they are permanently deleted.
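As a small illustration of how data in Amazon S3 are addressed: each object lives in a bucket and is identified by a key, and the two are often written together as an `s3://bucket/key` path. The sketch below splits such a path using only the Python standard library. The helper name `split_s3_uri` and the bucket and file names are hypothetical, chosen for illustration; the User Guidance remains the reference for the tools actually used to read and write data on the platform.

```python
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> tuple[str, str]:
    """Split an s3:// path into its (bucket, key) parts."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not a valid S3 path: {uri!r}")
    # The bucket is the host part; the key is the path without its leading slash.
    return parsed.netloc, parsed.path.lstrip("/")


# Hypothetical bucket and file names, for illustration only.
bucket, key = split_s3_uri("s3://alpha-example-project/raw/cases.csv")
print(bucket)
print(key)
```

Keeping the bucket and key separate can be useful because access is managed at bucket level, while the key identifies an individual file within that bucket.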