4-3. BioIndex

To use a BioIndex database as a data source, select BioIndex in the Data Type drop-down menu.

The BioIndex is a tool that indexes genomic data stored in AWS S3 "tables" (typically generated by Spark) so that they can be rapidly queried and loaded. It uses a MySQL database to store the indexes and to look up where in S3 each "record" is located.

It is the tool that all of the Knowledge Portals developed by the HuGeAMP project use to load data to their pages. BioIndex is an open-source project, therefore free to anyone to try. (You can learn more about it here: https://github.com/broadinstitute/dig-bioindex)

If a registered user has large-scale data such as GWAS summary statistics, the research portal development team can help to set up a BioIndex for the data so they can be quickly loaded to the research portal. 

Here is an example of a configuration for BioIndex as data point.