Stemformatics is built as two separate applications: an API server, which hosts all the data, and a UI server, which hosts the website. Some details of the system design are given below.
MongoDB mainly holds dataset metadata and sample metadata, in two separate collections. Expression data are stored primarily as text files, and are duplicated in HDF5 files for performance reasons. Most text files below a certain size can be read very quickly by pandas (see this useful article), but HDF5 files are useful for analyses that need to loop through multiple expression matrices quickly.
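As a rough illustration of how sample metadata and a text expression file can be combined, here is a minimal sketch using pandas. The field names (sample_id, dataset_id, cell_type), the gene identifiers, the tab-separated layout and both helper functions are assumptions made for this example; they are not taken from the Stemformatics code or schema, and the metadata is an in-memory list standing in for documents fetched from MongoDB.

```python
import io

import pandas as pd

# Hypothetical sample metadata, shaped like documents from a MongoDB collection.
sample_metadata = [
    {"sample_id": "s1", "dataset_id": 1000, "cell_type": "iPSC"},
    {"sample_id": "s2", "dataset_id": 1000, "cell_type": "fibroblast"},
    {"sample_id": "s3", "dataset_id": 1000, "cell_type": "iPSC"},
]

# Hypothetical tab-separated expression file: genes as rows, samples as columns.
expression_text = (
    "gene_id\ts1\ts2\ts3\n"
    "GENE_A\t5.1\t0.2\t4.8\n"
    "GENE_B\t1.0\t7.5\t1.2\n"
)


def load_expression(handle):
    """Read a tab-separated expression matrix, indexed by the first column."""
    return pd.read_csv(handle, sep="\t", index_col=0)


def samples_matching(metadata, key, value):
    """Return the sample ids whose metadata field `key` equals `value`."""
    return [doc["sample_id"] for doc in metadata if doc.get(key) == value]


# In practice `handle` would be a path to a text file on disk.
expression = load_expression(io.StringIO(expression_text))
ipsc_ids = samples_matching(sample_metadata, "cell_type", "iPSC")

# Keep only the columns (samples) selected through the metadata.
ipsc_expression = expression[ipsc_ids]
```

Keeping metadata and expression values in separate stores means a query like this touches the metadata first and only then reads the matrix columns it needs.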
The API server is built on Flask-RESTful and uses pandas extensively to manipulate data frames. Models contain the data models, which interface with the underlying data and fetch it in suitable formats, and resources contain the "controllers", which sit between the data models and the API endpoints. See our API page for more details.
The code for both the API server and the UI server can be found on GitHub: