API access

All Stemformatics data can be accessed via the API. Use your favourite tool, such as Python or R, to search for relevant datasets and download them for your own analyses. Alternatively, go to the API server directly and view the data in raw form.


# cURL example
curl https://api.stemformatics.org/datasets/2000/metadata

# Python examples
import pandas, requests
r = requests.get('https://api.stemformatics.org/datasets/2000/samples')
df = pandas.DataFrame(r.json())
print(df.head())
        sample_id    cell_type parental_cell_type  ... developmental_stage treatment external_source_id
0  2000_1787466030_H  neurosphere         epithelium  ...
1  2000_1787466065_A  neurosphere         epithelium  ...
2  2000_1787466030_E  neurosphere         epithelium  ...
3  2000_1787466065_D  neurosphere         epithelium  ...
4  2000_1699538158_H  neurosphere         epithelium  ...

# Note that you can safely use spaces inside the query_string value; requests will URL-encode them for you
r = requests.get('https://api.stemformatics.org/search/samples?query_string=%s&field=tissue_of_origin,dataset_id' % 'dendritic cell')
print(r.json()[:2])
[{'sample_id': '7277_GSM2067549', 'dataset_id': 7277, 'tissue_of_origin': 'umbilical cord blood'}, 
    {'sample_id': '7277_GSM2067548', 'dataset_id': 7277, 'tissue_of_origin': 'umbilical cord blood'}]

# To get the expression matrix as a file but read it directly into pandas
import io
r = requests.get('https://api.stemformatics.org/datasets/6756/expression?as_file=true')
df = pandas.read_csv(io.StringIO(r.text), sep='\t', index_col=0)
print(df.head())
            GSM741192.CEL  GSM741193.CEL  GSM741194.CEL  GSM741195.CEL  \
1415670_at         8.209027       8.262415       8.557468       9.205204   
1415671_at        10.852328      11.100999      10.912304      10.836298   
1415672_at        10.431524      10.364212      10.517259      11.122440   

# R example
library(httr)
library(jsonlite)
response = GET("https://api.stemformatics.org/datasets/2000/metadata")
print(content(response))
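
The Python search example above passes 'dendritic cell' with a literal space and relies on the HTTP client to encode it. As a minimal sketch of what that encoding looks like, the snippet below uses only the Python standard library (not requests) to build the same search URL in two common ways and confirms that both decode back to the original text. No request is sent, and the variable names are illustrative.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

SEARCH_URL = "https://api.stemformatics.org/search/samples"
query = "dendritic cell"

# Percent-encoding: the space becomes %20.
percent_style = SEARCH_URL + "?query_string=" + quote(query)

# Form-style encoding: the space becomes +.
plus_style = SEARCH_URL + "?" + urlencode({"query_string": query})

print(percent_style)
print(plus_style)

# Both forms decode to the same query_string value.
decoded_percent = parse_qs(urlsplit(percent_style).query)["query_string"][0]
decoded_plus = parse_qs(urlsplit(plus_style).query)["query_string"][0]
print(decoded_percent == decoded_plus == query)
```

Either form should reach the server as the same search term, so building the URL by hand or letting the client library do it is largely a matter of preference.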
    
Full list of APIs

Where parameters are shown, their default values are given, and these parameters can be left out. For example, /datasets/2000/samples works the same as /datasets/2000/samples?orient=records&as_file=false. If no default value is given, the parameter is required, and a placeholder in braces describes the expected value.
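
The rule about defaults can be illustrated with a small sketch. The helper below, build_url, is hypothetical and not part of the API or any client library: it drops every parameter whose value equals the documented default, so the shortest equivalent URL is produced. The defaults are copied from the /datasets/{dataset_id}/samples entry in the list below, and no request is sent.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.stemformatics.org"

# Defaults for /datasets/{dataset_id}/samples, as given in the endpoint list.
SAMPLES_DEFAULTS = {"orient": "records", "as_file": "false"}


def build_url(path, params=None, defaults=None):
    """Return a URL carrying only the parameters that differ from their defaults."""
    params = params or {}
    defaults = defaults or {}
    kept = {k: v for k, v in params.items() if defaults.get(k) != v}
    query = urlencode(kept)
    return BASE_URL + path + ("?" + query if query else "")


# Every value matches its default, so the query string disappears entirely.
short_url = build_url(
    "/datasets/2000/samples",
    {"orient": "records", "as_file": "false"},
    SAMPLES_DEFAULTS,
)

# Only as_file differs from its default, so only as_file is kept.
file_url = build_url(
    "/datasets/2000/samples",
    {"orient": "records", "as_file": "true"},
    SAMPLES_DEFAULTS,
)

print(short_url)
print(file_url)
```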

The 'orient' parameter accepts the same values as the DataFrame.to_dict() function in the Python pandas package.
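
To show what different 'orient' values mean, here is a small sketch that calls to_dict() on a toy pandas DataFrame. The table contents and the row labels sample_a and sample_b are made up for illustration. The snippet only demonstrates the pandas shapes and assumes, without confirmation from the text above, that the API's JSON mirrors them.

```python
import pandas as pd

# Toy table standing in for an API response; values are illustrative only.
df = pd.DataFrame(
    {"cell_type": ["neurosphere", "neurosphere"], "dataset_id": [2000, 2000]},
    index=["sample_a", "sample_b"],
)

records = df.to_dict(orient="records")  # list with one dict per row
by_index = df.to_dict(orient="index")   # dict keyed by row label
by_column = df.to_dict(orient="list")   # dict of column name -> list of values

print(records)
print(by_index)
print(by_column)
```

The 'records' shape drops the row labels, which is why it feeds directly into pandas.DataFrame(r.json()) as in the samples example above.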

/datasets/{dataset_id}/metadata
/datasets/{dataset_id}/samples?orient=records&as_file=false
/datasets/{dataset_id}/expression?gene_id={Ensembl_gene_id}&key=cpm&log2=false&orient=records&as_file=false
/datasets/{dataset_id}/pca?orient=records&dims=20
/datasets/{dataset_id}/correlated-genes?gene_id={Ensembl_gene_id}&cutoff=30
/datasets/{dataset_id}/ttest?gene_id={Ensembl_gene_id}&sample_group={sample_group}&sample_group_item1={item1}&sample_group_item2={item2}
/search/datasets
/search/samples?limit=50&orient=records
/values/datasets/{key}?include_count=false
/values/samples/{key}?include_count=false
/download?dataset_id={comma separated dataset ids}
/genes/sample-group-to-genes?sample_group={sample_group}&sample_group_item={sample_group_item}&cutoff=10
/genes/gene-to-sample-groups?gene_id={Ensembl_gene_id}&sample_group=cell_type
/atlas-types
/atlases/{atlas_type}/{item}?version=''&orient=records&filtered=false&query_string=''&gene_id=''&as_file=false
/atlas-projection/{atlas_type}/{data_source}
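
As an example of filling in a required placeholder, the /download endpoint above takes a comma-separated list of dataset ids. The sketch below builds such a URL with the standard library. The helper name download_url is hypothetical, the ids are the ones used in the examples on this page, and it is an assumption that the server expects literal commas rather than percent-encoded ones. The format of the downloaded content is not described here, so the snippet stops at constructing the URL.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.stemformatics.org"


def download_url(dataset_ids):
    """Build a /download URL from an iterable of integer dataset ids."""
    ids = ",".join(str(int(i)) for i in dataset_ids)
    if not ids:
        raise ValueError("at least one dataset id is required")
    # safe="," keeps the commas literal instead of encoding them as %2C.
    return BASE_URL + "/download?" + urlencode({"dataset_id": ids}, safe=",")


url = download_url([2000, 6756, 7277])
print(url)
```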
© 2021 Stemformatics
Hosted at Nectar