FAST5 files encapsulate raw signal data generated during nanopore sequencing, enabling researchers to perform basecalling, detect methylation patterns, and develop custom bioinformatics workflows.
FAST5 Files
Before diving into where to find FAST5 files, it’s important to understand what they are. FAST5 is an HDF5-based file format developed by ONT. Each file stores signal-level data produced as DNA or RNA molecules pass through nanopores during sequencing. Unlike FASTQ or BAM files, which store processed reads and alignments, FAST5 files contain raw signal traces and metadata, providing access to the most fundamental level of sequencing data.

There are two primary formats: single-FAST5 and multi-FAST5. In single-FAST5 format, each read is stored in a separate file. In multi-FAST5 format, multiple reads are bundled into a single file, improving efficiency in storage and data handling.
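As a rough illustration of the difference between the two layouts, the sketch below guesses the format from the names of the top-level HDF5 groups. It assumes the commonly seen conventions (multi-read files hold top-level groups named "read_<read id>", while single-read files hold a "Raw" group), and the function name and example file name are made up for this post.

```python
def classify_fast5_layout(top_level_names):
    """Guess the FAST5 layout from the names of the top-level HDF5 groups."""
    names = list(top_level_names)
    # Multi-read files usually keep each read in a top-level group "read_<read id>".
    if any(name.startswith("read_") for name in names):
        return "multi-FAST5"
    # Single-read files usually keep the signal under "Raw/Reads/Read_<number>".
    if "Raw" in names:
        return "single-FAST5"
    return "unknown"


# With the third-party h5py package installed, the names could be read like this
# (the file name is a placeholder):
#
#     import h5py
#     with h5py.File("example.fast5", "r") as handle:
#         print(classify_fast5_layout(handle.keys()))

print(classify_fast5_layout(["read_0a1b2c", "read_3d4e5f"]))
print(classify_fast5_layout(["Raw", "UniqueGlobalKey"]))
```

Keeping the check on plain group names means the logic can be tried without any FAST5 file at hand; h5py is only needed to read the names from a real file.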
1. Oxford Nanopore Technologies (ONT) Official Channels
If you’re looking for authentic and comprehensive FAST5 datasets, ONT is the primary and most reliable source. They offer access through the following means:
a. MinION and Other ONT Devices
The most direct way to obtain FAST5 files is by running a sequencing experiment using ONT platforms such as MinION, GridION, or PromethION. These devices have historically written FAST5 files by default; recent MinKNOW releases default to the newer POD5 format, so check the run's output settings if you need FAST5 specifically. The ONT MinKNOW software handles data acquisition, and the sequencing summary and raw data files are automatically stored in the specified output directories.
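Once a run has finished, a quick way to see what landed in the output directory is to count the .fast5 files and add up their sizes. The sketch below does this with the standard library only. The function name is invented for this post, and the demo builds a throwaway folder (the "fast5_pass" name is just an illustrative subfolder) rather than pointing at a real run.

```python
import tempfile
from pathlib import Path


def summarize_fast5(run_dir):
    """Return (file count, total bytes) for .fast5 files found under run_dir."""
    files = [p for p in Path(run_dir).rglob("*.fast5") if p.is_file()]
    return len(files), sum(p.stat().st_size for p in files)


# Demo on a temporary directory holding an empty placeholder file.
with tempfile.TemporaryDirectory() as tmp:
    batch = Path(tmp) / "fast5_pass"
    batch.mkdir()
    (batch / "batch_0.fast5").write_bytes(b"")
    (batch / "notes.txt").write_text("not a FAST5 file")
    count, total_bytes = summarize_fast5(tmp)
    print(f"{count} FAST5 file(s), {total_bytes} bytes")
```

To use it on real data, pass the output directory you configured in MinKNOW instead of the temporary folder.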
b. EPI2ME Labs and ONT Developer Resources
ONT offers training datasets and interactive tutorials via EPI2ME Labs, which are ideal for beginners. These datasets include FAST5 files along with other formats and come with guides for analysis. While not as extensive as a full experimental run, these datasets are curated to introduce users to typical challenges and analysis workflows in nanopore sequencing.
c. ONT Community Forums
The ONT Community (https://community.nanoporetech.com/) often hosts shared datasets by users or developers participating in beta tests or educational initiatives. While access might be limited to members, it’s worth joining if you’re serious about nanopore sequencing.
2. Public Repositories and Data Archives
For users who do not have access to an ONT device, public repositories are a goldmine. These repositories offer real-world sequencing datasets that include FAST5 files.
a. European Nucleotide Archive (ENA)
ENA is one of the largest genomic data archives and includes datasets from ONT sequencers. It supports downloads of associated FAST5 files if the data submitter has made them available. Users can search by accession numbers, organism, or sequencing platform.
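For scripted lookups, one option is ENA's Portal API file report, which lists the files attached to an accession. The sketch below only builds the request URL. It assumes the "filereport" endpoint with the "read_run" result type and the "submitted_ftp" field (where submitter-provided files such as FAST5 archives would be listed, if present); the accession is a placeholder, not a real dataset.

```python
from urllib.parse import urlencode

# Assumed endpoint of the ENA Portal API file report.
ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"


def ena_filereport_url(accession):
    """Build a file-report URL listing submitted files for an accession."""
    params = {
        "accession": accession,
        "result": "read_run",
        "fields": "run_accession,submitted_ftp",
        "format": "tsv",
    }
    return f"{ENA_FILEREPORT}?{urlencode(params)}"


# "PRJEB00000" is a placeholder accession used only for illustration.
print(ena_filereport_url("PRJEB00000"))
```

Opening the resulting URL in a browser, or fetching it with any HTTP client, should return a tab-separated table; whether it contains FAST5 links depends on what the submitter uploaded.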
Example:
The “human-genome” dataset ONT released includes thousands of FAST5 files hosted on AWS
Use AWS CLI or S3 browsers to download the datasets directly
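As a minimal sketch of the AWS CLI route, the snippet below assembles an "aws s3 sync" command that copies only .fast5 files from a public bucket. The bucket path and destination folder are placeholders (substitute the location given in the dataset's documentation), and the helper name is made up for this post.

```python
import shlex


def build_s3_sync_command(s3_prefix, dest_dir):
    """Return the argument list for an anonymous, FAST5-only S3 sync."""
    return [
        "aws", "s3", "sync", s3_prefix, dest_dir,
        "--no-sign-request",   # public buckets can be read without credentials
        "--exclude", "*",      # skip everything by default...
        "--include", "*.fast5",  # ...then re-include only FAST5 files
    ]


# Placeholder bucket path and local folder, for illustration only.
command = build_s3_sync_command("s3://example-bucket/example-dataset/", "fast5_data")
print(shlex.join(command))
```

The printed line can be pasted into a terminal, or the list can be handed to subprocess.run. Adding "--dryrun" to the arguments first is a cheap way to preview what would be downloaded, which matters because FAST5 datasets can be very large.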
3. GitHub Repositories and Bioinformatics Projects
Although not as common as official archives, GitHub is another useful source for example FAST5 files, especially from bioinformatics researchers developing open-source tools. These datasets are usually small (for testing purposes) but can be extremely useful when validating software.
Some projects that have included FAST5 datasets:
nanopolish
bonito
megalodon
deepnano-blitz
Be cautious: these are not suitable for full-scale analyses but can help you get a feel for FAST5 file structures.
4. Academic Institutions and Research Labs
Many universities that use nanopore technology for research make datasets available via institutional repositories or data-sharing portals. These datasets are often linked in the “Data Availability” section of research articles.
If a paper mentions FAST5 usage:
Check supplementary materials or contact the authors
Visit associated lab websites (e.g., genomics labs at MIT, Oxford, UCSC)
Explore institutional data repositories such as Harvard Dataverse or Stanford Digital Repository
Some universities also host courses in genomics that offer hands-on datasets, including FAST5 files, for registered students.
5. Cloud-Based Analysis Platforms
Platforms such as DNAnexus, Terra.bio, and Galaxy occasionally host nanopore datasets as part of training modules or shared user workspaces. While you may need to create an account or request access, these platforms often provide access to sample FAST5 data preloaded in cloud environments.
Benefits of cloud-hosted FAST5 data:
No local storage needed
Integrated analysis tools (e.g., Guppy, Minimap2, Nanopolish)
Ideal for testing workflows or teaching bioinformatics
6. Workshops, Webinars, and Conferences
FAST5 datasets are frequently shared during nanopore-focused workshops and webinars. Oxford Nanopore itself hosts events like Nanopore Community Meetings, where participants receive curated datasets for exercises. These can include full sequencing runs or subsets designed for teaching specific concepts such as basecalling or methylation calling.
Search platforms:
ONT events page
Academic conference websites
YouTube channels from university seminars
Often, datasets linked to these events are only temporarily available, so it’s best to download them promptly if offered.
7. Educational Courses and MOOCs
Massive Open Online Courses (MOOCs) focused on genomics, offered by platforms like Coursera, edX, or FutureLearn, may include access to nanopore datasets. Courses affiliated with ONT or large research projects may provide access to sample FAST5 files for lab sessions.
Benefits include:
Structured learning around the data
Access to tutors or community forums
Exposure to real-world sequencing challenges
8. Third-Party Blogs and Community Portals
Experienced users and independent researchers sometimes share test FAST5 files on blogs or discussion platforms such as Reddit, Bioinformatics Stack Exchange, or Biostars.
Examples:
A blog post comparing Guppy vs. Bonito might include download links
GitHub Gists with small FAST5 bundles for testing purposes
Bioinformatics Q&A threads where users request and share test files
While convenient, it’s important to verify that these files are ethically sourced and comply with any licensing or privacy restrictions.
9. Simulated FAST5 Files
In cases where you can’t obtain real FAST5 data, simulated datasets can be generated with signal-level simulators such as DeepSimulator. NanoSim is often mentioned alongside it, but it simulates base-level reads rather than raw signal. Synthetic signal files mimic the structure and signal patterns of actual nanopore reads, which makes them useful for testing algorithms or training AI models.
Advantages:
Customizable parameters (error rate, genome, signal noise)
No ethical concerns about patient data
Great for reproducibility in scientific software development
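To give a feel for what "customizable parameters" means in practice, here is a deliberately simplified toy generator. It is not how DeepSimulator works: the per-base current levels are invented, whereas real pore models depend on k-mers, and the output is a plain list of numbers rather than a FAST5 file. It does show how noise, dwell time, and a fixed seed shape a reproducible synthetic trace.

```python
import random

# Hypothetical mean current level (pA) per base; real pore models are k-mer based.
TOY_LEVELS = {"A": 80.0, "C": 95.0, "G": 110.0, "T": 125.0}


def simulate_toy_signal(sequence, samples_per_base=8, noise_sd=2.0, seed=0):
    """Return a noisy step-like trace with a fixed number of samples per base."""
    rng = random.Random(seed)  # fixed seed keeps the output reproducible
    signal = []
    for base in sequence.upper():
        level = TOY_LEVELS[base]
        signal.extend(rng.gauss(level, noise_sd) for _ in range(samples_per_base))
    return signal


trace = simulate_toy_signal("ACGT", samples_per_base=4, noise_sd=1.5, seed=42)
print(len(trace), "samples; first value:", round(trace[0], 2))
```

Raising noise_sd makes the steps harder to tell apart, which is the kind of knob a real simulator exposes for stress-testing a basecaller.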
10. Direct Collaboration with Labs or Institutions
If your work requires access to specific types of FAST5 files (e.g., from a rare organism or particular tissue), reaching out directly to research labs or data owners is often fruitful. Researchers are usually open to collaboration, especially if you’re contributing to open science or academic research.
Tips for success:
Be clear about your purpose and intended use
Offer to cite or acknowledge the data provider
Use institutional email or official channels
Many datasets are not publicly shared simply due to size constraints or lack of hosting options—not because of unwillingness.
Conclusion
FAST5 files are essential for anyone working with Oxford Nanopore sequencing data, whether your goal is to explore signal-level analysis, optimize basecalling, or develop machine learning models in genomics. Fortunately, there is a wide array of sources—from official ONT platforms to public archives and academic repositories—that offer access to real or simulated FAST5 files.