Parfu is an MPI application that works much like Unix tar. It packs directories of files into a single archive file and can later extract those directories and files again. It can be run with any number of ranks, and it runs faster as you add more ranks.
Parfu works like tar (although parfu archive files are not yet tar-compatible), but it runs much faster than tar or any of its quasi-parallel brethren (pigz, ptar, etc.). We created it to archive directory trees containing extremely large numbers (millions) of files, which tend to wreak havoc with parallel tape storage systems. (The millions of files are stored very efficiently, but they get fragmented across many tapes, so retrieval is very slow or outright impossible.) Parfu runs fast enough that when a workflow produces output like this, the resulting directory structure can be archived, the archive file can be moved to tape and later retrieved quickly and efficiently, and parfu can then extract the archive to restore the original files. Bioinformatics workflows often produce output directories like these.
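To give a rough sense of why archiving many files can speed up as ranks are added, here is a minimal sketch. It is an illustration only and is not taken from parfu's source code: the function name `plan_archive`, the round-robin assignment, and the header-free archive layout are all assumptions made for this example. The idea is that once every file's byte offset in the archive is known in advance, each rank can write its own files without waiting for the others.

```python
def plan_archive(file_sizes, num_ranks):
    """Plan a hypothetical parallel archive layout (not parfu's actual format).

    Returns (offsets, assignments, total_size), where offsets[i] is the byte
    position of file i in the archive and assignments[r] lists the indices of
    the files handled by rank r.
    """
    if num_ranks < 1:
        raise ValueError("num_ranks must be at least 1")

    # Running sum of sizes: each file starts where the previous one ended.
    offsets = []
    position = 0
    for size in file_sizes:
        offsets.append(position)
        position += size

    # Assumed policy: hand files out to ranks round-robin by index.
    assignments = [[] for _ in range(num_ranks)]
    for index in range(len(file_sizes)):
        assignments[index % num_ranks].append(index)

    return offsets, assignments, position


# Four files shared among three ranks.
offsets, assignments, total_size = plan_archive([100, 250, 50, 400], 3)
print(offsets)      # [0, 100, 350, 400]
print(assignments)  # [[0, 3], [1], [2]]
print(total_size)   # 800
```

Because the offsets do not overlap, the ranks in this sketch never write to the same region of the archive. A real tool would also have to store file names and metadata and balance the load by file size, neither of which is shown here.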
Parfu has been released under the University of Illinois Open-Source License. Source code is available on the NCSA GitHub page.
For questions about Parfu, please contact me at firstname.lastname@example.org.
All rights reserved. ©2016 Board of Trustees of the University of Illinois.