quay.io/microbiome-informatics/emg-notebooks.dev:v1.2.28888lab$__history_id__$__galaxy_url__8080$__galaxy_url__`_'s datasets using Python or with R using the `MGnifyR `_ package.
Why such notebooks?
-------------------
The quantity and richness of `metagenomics-derived data `_ in MGnify grow every day. The `MGnify website `_ is the best place to start exploring and searching the MGnify database, and it lets users download modest query results as CSV tables.
For larger queries, or for more complex requirements such as fetching metadata from samples across multiple studies, programmatic access is a far better fit.
Programmatic access means fetching data from MGnify with a terminal command or a script, using the `MGnify API `_ (`Application Programming Interface `_). The API is what lies behind the MGnify website, and it provides access to every data type in MGnify: `Studies `_, `Samples `_, `Analyses `_, `Annotations `_, `MAGs `_, and so on. Using the API lets you fetch more data than is possible via the website, and helps you write reproducible analysis scripts.
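As a rough illustration of what programmatic access can look like, the sketch below walks through a multi-page API listing using only the Python standard library. It rests on several assumptions not stated above: the base URL in ``API_BASE``, the presence of a ``data`` list in each response, and a JSON:API-style ``links.next`` field pointing at the following page. The helper names ``fetch_json`` and ``iter_records`` are hypothetical; check the API Browser for the actual response layout.

```python
import json
from typing import Callable, Iterator
from urllib.request import urlopen

# Assumed base URL of the MGnify API; confirm it in the API Browser.
API_BASE = "https://www.ebi.ac.uk/metagenomics/api/v1"


def fetch_json(url: str) -> dict:
    """Download one page of API results and parse it as JSON."""
    with urlopen(url) as response:
        return json.load(response)


def iter_records(url: str, fetch: Callable[[str], dict] = fetch_json) -> Iterator[dict]:
    """Yield every record from a listing, following 'next' links page by page."""
    while url:
        page = fetch(url)
        # Assumption: records sit in a top-level "data" list.
        yield from page.get("data", [])
        # Assumption: "links.next" holds the next page URL, or None on the last page.
        url = (page.get("links") or {}).get("next")
```

A call such as ``iter_records(f"{API_BASE}/studies")`` would then yield one record at a time; because a full listing can span many pages, it is sensible to stop early (for example with ``itertools.islice``) while experimenting. The ``fetch`` parameter exists so the paging logic can be tried without network access.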
The API can be explored interactively online using the `API Browser `_. Using it in practice, however, raises two hurdles. First, it requires knowledge of, and often installation of, tools on your computer. This might range from a command-line tool like `cURL `_, to learning R and setting up the `RStudio `_ application, to setting up a `Python `_ environment and installing a suite of `packages used for data analysis `_. Second, the API returns most data in `JSON format `_: this is standard on the web, but less familiar to bioinformaticians used to TSVs and dataframes.
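To make the gap between JSON and tables concrete, here is a minimal sketch that flattens a nested JSON response into rows and writes them out as TSV. The payload is made up for illustration: it assumes a JSON:API-style layout (``data`` entries with ``id``, ``type`` and ``attributes``), and the identifiers and attribute names in it are placeholders rather than real MGnify records. The functions ``flatten_records`` and ``to_tsv`` are hypothetical helpers, not part of any MGnify package.

```python
import csv
import io


def flatten_records(document: dict) -> list[dict]:
    """Turn each nested record into one flat row (id, type, then its attributes)."""
    rows = []
    for record in document.get("data", []):
        row = {"id": record.get("id"), "type": record.get("type")}
        row.update(record.get("attributes", {}))
        rows.append(row)
    return rows


def to_tsv(rows: list[dict]) -> str:
    """Serialise rows as tab-separated text, leaving missing values blank."""
    # Collect column names in first-seen order so the header is stable.
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, delimiter="\t", restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up payload in the assumed layout; not real MGnify data.
example = {
    "data": [
        {"id": "EXAMPLE1", "type": "samples", "attributes": {"biome": "soil"}},
        {"id": "EXAMPLE2", "type": "samples", "attributes": {"biome": "marine", "depth": 10}},
    ]
}

rows = flatten_records(example)
tsv_text = to_tsv(rows)
```

The same list of rows could also be handed to ``pandas.DataFrame(rows)`` if pandas is available, which gives the dataframe view many bioinformaticians are used to.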
The `MGnify Notebook Server `_ and `MGnifyR package `_ are designed to bridge these gaps. Users can launch an online R and Python coding environment in their browser, without installing anything. The environment is hosted by EMBL's `Cell Biology and Biophysics Computational Support team `_, who support computational projects across EMBL. It already includes the main libraries needed for communicating with the MGnify API, analysing data, and making plots. It uses the popular `Jupyter Lab `_ software, which means you can code inside `Notebooks `_: interactive code documents.
There are example Notebooks written in both R and Python, so users can pick whichever they're more familiar with.
]]>