Week 2 Tasks - Single-Cell RNA-seq Project

This week's focus is on data normalization and feature selection. These steps prepare the dataset for downstream clustering and annotation by reducing technical variation and focusing on informative genes.

Objectives

- Normalize raw gene expression counts

- Log-transform the data

- Identify and visualize highly variable genes (HVGs)

- Explore effects of normalization through visualizations (subsample if necessary)

- Document insights and prepare a summary report

Tasks

1. Normalize and Log-Transform the Data

- Use scanpy's normalization function (`sc.pp.normalize_total`).

- Log-transform the data (`sc.pp.log1p`).

- Store the raw counts using `adata.raw = adata`.
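To make the arithmetic behind this step concrete, here is a minimal NumPy sketch of per-cell depth normalization followed by a log1p transform. It is only an illustration of the idea, not scanpy's implementation; the helper name `normalize_log1p` and the target of 10,000 counts per cell are assumptions, since the task does not specify a target sum.

```python
import numpy as np

def normalize_log1p(counts, target_sum=1e4):
    """Scale each cell (row) to `target_sum` total counts, then apply log(1 + x).

    `target_sum=1e4` is an assumed, commonly used value, not one given by the task.
    """
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1.0  # keep empty cells as all zeros instead of dividing by zero
    scaled = counts / totals * target_sum
    return np.log1p(scaled)

# Two toy cells with the same composition but a 10x difference in sequencing depth.
raw = np.array([[1, 3, 0],
                [10, 30, 0]])
lognorm = normalize_log1p(raw)
```

After normalization the two toy cells have identical profiles, which is the point of the step: differences in sequencing depth no longer dominate the comparison between cells.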

2. Identify Highly Variable Genes (HVGs)

- Use scanpy's highly variable genes function: `sc.pp.highly_variable_genes(adata, n_top_genes=2000)`.

- Filter to retain only HVGs.

- Plot HVG selection (scanpy has a function for this).

3. Subsample Cells for Visualization (only if computational constraints require it; otherwise use the entire dataset)

- Randomly select ~10,000 cells using numpy.

- Create a new AnnData object for plotting: `adata_sub = adata[subset_idx].copy()`.
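One way to pick the cell indices with numpy is sketched below. The helper name `subsample_indices`, the fixed seed, and the example dataset size of 50,000 cells are assumptions for illustration; the function simply returns every index when the dataset is already small enough.

```python
import numpy as np

def subsample_indices(n_cells, n_keep=10_000, seed=0):
    """Return sorted random cell indices drawn without replacement.

    If the dataset has at most `n_keep` cells, all indices are returned.
    """
    if n_cells <= n_keep:
        return np.arange(n_cells)
    rng = np.random.default_rng(seed)  # fixed seed so the subsample is reproducible
    return np.sort(rng.choice(n_cells, size=n_keep, replace=False))

# Hypothetical dataset size; with real data this would be adata.n_obs.
subset_idx = subsample_indices(n_cells=50_000)

# With an AnnData object loaded, the subset for plotting would then be:
# adata_sub = adata[subset_idx].copy()
```

Sampling without replacement avoids duplicated cells, and sorting the indices keeps the cells in their original order in the subset.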

4. Plot Normalization Effects

- Violin plots of `total_counts` and `n_genes_by_counts`.

- Scatter plot

- Plot top expressed genes.

Submit Work

- Clean Jupyter Notebook (`.ipynb`) with markdown annotations.

- 1-page summary report explaining what was done and observed.

- Include visuals and key takeaways about normalization and HVG selection.

Time Estimate

- Total: ~15 hours

- Normalization & Log1p: 3-4 hrs

- HVG analysis: 3-4 hrs

- Visualization with subsampling: 3-4 hrs

- Documentation and reflection: 2-3 hrs

