Data Studio

We foster dialogue between data scientists and researchers in clinics and laboratories in order to drive excellence in health care research at Stanford.

About the Data Studio

The Data Studio is a collaboration between Spectrum (The Stanford Center for Clinical and Translational Research and Education) and the Department of Biomedical Data Science. The Data Studio is open to the Stanford community engaged in biomedical research. We expect it to have educational value for students and postdocs interested in biomedical data science. The Data Studio features DBDS faculty and staff who offer the following services: workshops, office hours, and one-to-one consultations. When you complete the Data Studio request form, our coordinator and consultants will work with you to choose the right service for your research project. Appointments may be requested by completing the required form.

Workshops are extensive, in-depth consultations for a Medical School researcher based on research questions, data, statistical models, and other material prepared by the researcher with the aid of our facilitator. During the Data Studio Workshop, the researcher explains the project, goals, and needs. Experts in the related topic from across campus are invited to contribute to the brainstorming. After the meeting, the facilitator follows up with help on immediate action items and a summary of the discussion. Ultimately, we strive to pair each PI with a data scientist for long-term collaboration.

Office Hours are brief consultations for Medical School researchers during the last session of each month. DBDS faculty are available to advise about your research questions. Consult the schedule below to complete the Office Hour registration form. Once you have registered, you will receive a calendar invitation with the date, time, and location of the session. Bring any data, prior analyses, or other materials that you have. Our consultants may even recommend your project for a Workshop if it is appropriate.

One-to-one consultations for Medical School researchers are available year-round. Our facilitator assigns each request to a data scientist with the relevant expertise.

Partners

General questions about statistical issues may be brought to the STAT390 Consulting Workshop. This is a class offered by the Department of Statistics during each academic quarter that is staffed by graduate students and directed by a faculty instructor. The service typically consists of a single meeting with the researcher to address a specific concern, such as planning of experiments and data analysis. For more information, consult the STAT390 Consulting Workshop web page.

Researchers who are members of the Stanford Cancer Institute (SCI) conducting research projects related to cancer may request assistance from the SCI Biostatistics Shared Resource.

The Genetics Bioinformatics Service Center (GBSC) offers an end-to-end bioinformatics consulting service (BaaS) that provides high-performance computational infrastructure and cutting-edge bioinformatics services for the Stanford community. The team consults on genomics, transcriptomics, proteomics, epigenetics, and metabolomics projects, and also develops custom workflows. For consulting and hands-on bioinformatics help with your projects, please reach out to gbsc-baas-team@lists.stanford.edu to set up an initial meeting.

Schedule

The Data Studio is held each Wednesday from 3:00 until 4:30 pm during the fall, winter, and spring quarters of the academic year. Consult the schedule below for the location of each session. Students may participate by enrolling in BMDS 291 for an introduction to the art of statistical consultation and practicum working on projects with a biomedical researcher. All are welcome to attend. Click here to sign up for our mailing list.

The currently scheduled topic is listed below.


TITLE: Precision Identification of Abnormal Tissue Architectures Found in Gulf War Illness Veterans with Gastrointestinal Symptoms

INVESTIGATORS:

  • Elizabeth A. Holman, Department of Microbiology and Immunology
  • Derek R. Holman, Department of Medicine
  • Garry P. Nolan, Department of Pathology

DATE: Wednesday, 20 May 2026

TIME: 3:00–4:30 PM

LOCATION: Room R358, Edwards Building, 300 Pasteur Drive, Stanford, CA

WEBPAGE: https://dbds.stanford.edu/data-studio/

ABSTRACT

The Data Studio Workshop brings together a biomedical investigator with a group of experts for an in-depth session to solicit advice about statistical and study design issues that arise while planning or conducting a research project. This week, the investigator(s) will discuss the following project with the group.

INTRODUCTION

Gulf War Illness (GWI) is a chronic, multi-symptom condition that affects veterans of the Gulf War and subsequent Iraq and Afghanistan conflicts with few available therapeutic options and no validated biomarkers for diagnostics. Extensive literature implicates a complex etiology, with numerous potentially related wartime exposures and risk factors. While much attention has been focused on neurological impacts, chronic gastrointestinal issues are reported in 14–25% of cases, including persistent inflammation.

In our recently completed DoD-funded pilot study on GWI veterans exhibiting gastrointestinal (GI) symptoms, single-cell mass cytometry (CyTOF) showed expansions of Th1, Th17, and Tc1 cells, as well as double-negative B cells (IgD⁻CD27⁻), in circulating peripheral blood mononuclear cells (PBMCs). As shown in Figure 1a and 1b, spatial proteomic imaging (CODEX) revealed ectopic B cell follicles and unique NK1R⁺ “starburst” structures in the colonic mucosa. These structures do not appear to be linked to colorectal cancer (CRC), as the affected veterans have neither CRC-associated polyps nor CRC-associated colonic MUC5AC+ epithelium; the NK1R+ starburst architectures also are not observed in healthy controls or inflammatory bowel disease (IBD) patients, as depicted in Figure 1c. These findings suggest a previously unrecognized, non-IBD immune pathology that may underlie GWI-associated GI symptoms, though the spatial consistency and prevalence of starburst structures across patients are unknown, as is their mechanistic relationship to systemic immune changes.

Our study investigates NK1R⁺ starburst tissue by assessing its prevalence in GWI veterans using spatial proteomic MACSima imaging cyclic staining (MICS) and identifying systemic GWI blood-based correlates via GWI veteran PBMCs using single-cell RNA sequencing and CyTOF (Figure 2). A subsequent grant would analyze associated clinical metadata to identify conserved risk factors including deployment-specific exposures in those with NK1R+ starburst phenotype.

HYPOTHESIS & OBJECTIVE

Our goal is to validate and characterize the immune structures associated with GWI’s GI symptoms that were identified during our DoD-funded pilot study. We hypothesize that GI symptoms in GWI are driven by distinct immune niches that correlate with systemic immune activation as detected by CyTOF. Specifically, we aim to assess NK1R⁺ starburst formation prevalence, define systemic correlates and translational potential, and identify NK1R⁺ starburst-associated risk factors. We will use spatial proteomics, single-cell RNA sequencing (scRNA-seq), and high-throughput immune profiling in veterans recruited at our local Veterans Affairs (VA) in collaboration with our local VA’s War-Related Illness and Injury Study Center (WRIISC) program.

DATASET

We will recruit and phenotype a pilot cohort of GWI (n=10) and non-GWI (n=10) male veterans to determine whether the NK1R+ starburst phenotype is GWI-specific rather than veteran-specific while optimizing our sample collection protocols. We will obtain six colonic biopsies and one 15–20 mL blood biopsy per patient. The gastroenterologist will sample according to the following criteria to increase the likelihood of successfully obtaining at least one colonic proliferating B follicle per patient: (i) biopsies will be acquired from the descending colon, the region closest to the ileum, and (ii) biopsies will be obtained from regions where gut-associated lymphoid tissue is visible by white-light endoscopy.

To phenotype the pilot cohort, we will perform (i) Miltenyi’s MICS spatial proteomic profiling of starburst structures from GWI patient tissue biopsies and (ii) scRNA-seq on PBMCs to identify candidate blood-based cellular biomarkers for intestinal tissue-based GWI NK1R+ starburst architectures. We further propose to identify candidate circulating biomarkers for starburst presence using scRNA-seq, which are then validated using CyTOF. A circulating biomarker would bypass the sampling-associated variability we observe in tissue biopsies, as well as allow for a far less invasive screen. scRNA-seq datasets consist of tables in which each row corresponds to one cell, and each column is one quantified transcript, with ~2k–4k quantified transcripts per cell. CyTOF datasets consist of tables in which each row corresponds to one cell, and each column is one quantified protein marker, with ~20–25 markers per cell.
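The table layout described above (one row per cell, one column per quantified marker) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `patient_id` column, the marker names, and every value are invented here and do not come from the study. Collapsing cells to one summary value per patient is shown as one common choice when patients are to be compared, not as the investigators' planned analysis.

```python
from collections import defaultdict
from statistics import median

# Hypothetical CyTOF-style table: one row per cell, one column per marker.
# The patient_id column, marker names, and values are invented for illustration.
cells = [
    {"patient_id": "P01", "CD3": 5.1, "CD19": 0.2},
    {"patient_id": "P01", "CD3": 4.7, "CD19": 0.4},
    {"patient_id": "P01", "CD3": 0.3, "CD19": 6.0},
    {"patient_id": "P02", "CD3": 5.5, "CD19": 0.1},
    {"patient_id": "P02", "CD3": 0.2, "CD19": 5.8},
]


def per_patient_medians(rows, id_column="patient_id"):
    """Collapse a cell-level table to one median per marker per patient."""
    grouped = defaultdict(lambda: defaultdict(list))
    for row in rows:
        for marker, value in row.items():
            if marker != id_column:
                grouped[row[id_column]][marker].append(value)
    return {
        patient: {marker: median(values) for marker, values in markers.items()}
        for patient, markers in grouped.items()
    }


summary = per_patient_medians(cells)
for patient, markers in sorted(summary.items()):
    print(patient, markers)
```

The same function would apply unchanged to an scRNA-seq table, where the columns would be transcripts rather than protein markers.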

STATISTICAL MODELS

We are not certain if we are approaching the power analyses with the best methodologies, but we believe that we at least need (i) regression for identifying candidate circulating biomarkers, and (ii) hypothesis tests for validation cohorts.
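As a starting point for discussion, the sketch below shows one simple way a power calculation for a two-group comparison of means could be set up. It is not the investigators' method. It assumes a two-sided test, equal group sizes, a common standard deviation, and a normal approximation, which tends to be optimistic at small sample sizes (an exact calculation would use the noncentral t distribution). The effect size of 1.0 is a placeholder, and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def approx_power_two_sample(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample comparison of means.

    effect_size is Cohen's d: the difference in group means divided by
    the (assumed common) standard deviation. Normal approximation only.
    """
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = effect_size * sqrt(n_per_group / 2)
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


def smallest_n_per_group(effect_size, target_power=0.80, alpha=0.05, n_max=10_000):
    """Smallest per-group n whose approximate power reaches target_power."""
    for n in range(2, n_max + 1):
        if approx_power_two_sample(effect_size, n, alpha) >= target_power:
            return n
    return None


# Placeholder effect size; n_per_group=10 mirrors the pilot cohort size.
power_at_pilot_size = approx_power_two_sample(effect_size=1.0, n_per_group=10)
n_needed = smallest_n_per_group(effect_size=1.0)
print(f"approximate power with n=10 per group: {power_at_pilot_size:.2f}")
print(f"approximate n per group for 80% power: {n_needed}")
```

Because Cohen's d divides by the standard deviation, rerunning the calculation with a smaller effect size is a quick way to see how a more variable control group would erode power.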

STATISTICAL QUESTIONS

  1. We expect military controls to be more variable than civilian controls, since they are exposed to many environments that civilians never encounter.
    1. How do we address power analysis when we only have data from GWI veterans and civilian controls with no data on military controls?
  2. We were asked to provide more information regarding outlier exclusion for our systemic biomarker assessment but have concerns that our patient cohort will have a much higher variability than the control group due to additional health complications.
    1. Is it appropriate to assess outlier status based on a limited subset of associated variables with known links to a phenotype of interest?
    2. Should outlier status be determined based on all collected variables?
  3. We are screening for potential diagnostic criteria using X of thousands of biomarkers from scRNA-seq data. Based on this data, we then select 5–10 markers for validation using CyTOF.
    1. Which multiple comparisons corrections are used to mitigate the risk of false positives when performing numerous statistical tests simultaneously?
    2. To what extent are they necessary if the statistical test is used for hypothesis generation followed by hypothesis testing, rather than for drawing direct conclusions?
    3. Do the corrections change if one is combining multiple variables into a single aggregate variable?
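For reference during the discussion of question 3, the sketch below implements two widely used multiple-comparison adjustments: Bonferroni, which controls the family-wise error rate, and Benjamini-Hochberg, which controls the false discovery rate. It is a generic illustration, not a recommendation for this study. The p-values are invented, and which correction suits a screening-then-validation design is exactly the open question above.

```python
def bonferroni(pvalues):
    """Bonferroni-adjusted p-values (controls the family-wise error rate)."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (controls the false discovery rate)."""
    m = len(pvalues)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Invented p-values standing in for per-marker tests; not study data.
example_pvalues = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205]

bonf = bonferroni(example_pvalues)
bh = benjamini_hochberg(example_pvalues)

n_bonf = sum(p <= 0.05 for p in bonf)
n_bh = sum(p <= 0.05 for p in bh)
print(f"significant at 0.05 after Bonferroni: {n_bonf}")
print(f"significant at 0.05 after Benjamini-Hochberg: {n_bh}")
```

On these invented values, Benjamini-Hochberg retains more markers than Bonferroni, which reflects the usual trade-off: tolerating a controlled fraction of false discoveries in exchange for more candidates to carry forward to validation.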

ZOOM MEETING INFORMATION

Join from PC, Mac, Linux, iOS or Android: https://stanford.zoom.us/j/92414292941?pwd=9sQzfFbJpC71PyS5kRofKsT86nEWD9.1

    Password: 124320

Or iPhone one-tap (US Toll): +18333021536,,92414292941# or +16507249799,,92414292941#

Or Telephone:

    Dial: +1 650 724 9799 (US, Canada, Caribbean Toll) or +1 833 302 1536 (US, Canada, Caribbean Toll Free)

    Meeting ID: 924 1429 2941

    Password: 124320

    International numbers available: https://stanford.zoom.us/u/aMbftTO9

    SIP: 92414292941@zoomcrc.com

    Password: 124320