EEL6935: Big Data Ecosystems, Spring 2018
Assignment 3
Transcription Factor Binding Prediction

• Individual Submission
• Submit via Kaggle and eLearning

1. Description:
Transcription is the process in which a gene's DNA sequence is copied (transcribed) into an RNA molecule.
Transcription is a key step in using information from a gene to make a protein.
When a gene is to be transcribed, the enzyme RNA polymerase, which makes a new RNA molecule from a
DNA template, must attach to the DNA of the gene. It attaches at a spot called the promoter. In humans,
RNA polymerase can attach to the promoter only with the help of proteins called basal transcription factors.
They are part of the cell's core transcription toolkit, needed for the transcription of any gene. Some
transcription factors activate transcription, but others can repress transcription.
The binding sites for transcription factors are often close to a gene's promoter. However, they can also be
found in other parts of the DNA, sometimes very far away from the promoter, and still affect transcription
of the gene. Binding of transcription factors to transcription factor binding sites (TFBSs) is key to the
mediation of transcriptional regulation. Information on experimentally validated functional TFBSs is
limited, and consequently there is a need for accurate prediction of TFBSs for gene annotation and in
applications such as evaluating the effects of single nucleotide variations in causing disease.
In this programming assignment, students are required to predict TFBSs using a deep learning approach.
The dataset includes SP1 transcription factor binding and non-binding sites on human chromosome 1. There
are 1000 sequences for binding sites and 1000 sequences for non-binding sites. Students need to classify
each sequence as 1 for a TFBS or 0 for a non-TFBS. Each sequence is 14 nucleobases long. More details about
the assignment are posted on the competition website.
All questions should be posted on Asana at https://app.asana.com/0/537909537550082/556731234230161 .
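The assignment does not prescribe how the 14-nucleobase sequences should be fed to a deep learning model. One common choice is one-hot encoding, sketched below in plain Python. The names `NUCLEOBASES`, `SEQ_LEN`, and `one_hot_encode` are hypothetical, the A/C/G/T alphabet is an assumption about the data, and the example sequence is made up for illustration rather than taken from the dataset.

```python
# Minimal sketch of one-hot encoding for fixed-length DNA sequences.
# Assumption: sequences use only the four bases A, C, G, T.
NUCLEOBASES = "ACGT"
SEQ_LEN = 14  # each sequence is 14 nucleobases long, per the assignment


def one_hot_encode(sequence):
    """Return a SEQ_LEN x 4 matrix (list of lists) for a DNA string."""
    sequence = sequence.upper()
    if len(sequence) != SEQ_LEN:
        raise ValueError(f"expected {SEQ_LEN} bases, got {len(sequence)}")
    encoded = []
    for base in sequence:
        if base not in NUCLEOBASES:
            raise ValueError(f"unexpected base: {base!r}")
        row = [0] * len(NUCLEOBASES)
        row[NUCLEOBASES.index(base)] = 1  # set the column for this base
        encoded.append(row)
    return encoded


# Made-up 14-base sequence, used only to show the output shape.
example = one_hot_encode("GGGGCGGGGCCAGT")
```

Each sequence becomes a 14 x 4 matrix with exactly one 1 per row, which is a typical input shape for a small convolutional or fully connected network.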
2. Setup
The following URL gives you access to the Kaggle InClass Prediction Competition.
https://www.kaggle.com/t/8dd9d683ccfd4a2f93053209406505da
Participation in this assignment is restricted to those with access to the preceding link.
All information related to this assignment is posted on the competition site. If you are not familiar with
the Kaggle platform, please refer to the webpage https://www.kaggle.com/wiki/Home.
Please follow the instructions to submit your prediction results and keep improving your model iteratively.
Please note: as a standard practice, to avoid participants gaming the system, the public leaderboard is
calculated on only a portion of the test samples. The final results will be based on the remaining samples, so
the final standings may be different.
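To illustrate why the public and final standings can differ, here is a minimal sketch that scores the same predictions on two disjoint subsets of a test set. The metric (accuracy), the 50% public fraction, the function names, and the toy labels are all assumptions for illustration. The actual metric and split are defined by the competition site.

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def split_leaderboard_scores(y_true, y_pred, public_fraction=0.5, seed=0):
    """Score predictions separately on a public subset and the remainder.

    public_fraction and seed are hypothetical parameters; the real split
    is chosen by the competition host and is hidden from participants.
    """
    indices = list(range(len(y_true)))
    random.Random(seed).shuffle(indices)
    cut = int(len(indices) * public_fraction)
    public, private = indices[:cut], indices[cut:]
    if not public or not private:
        raise ValueError("both subsets must be non-empty")

    def score(subset):
        return accuracy([y_true[i] for i in subset],
                        [y_pred[i] for i in subset])

    return score(public), score(private)


# Toy labels, made up for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
public_score, private_score = split_leaderboard_scores(y_true, y_pred)
```

Because the two scores come from different samples, a model tuned heavily against the public score may rank differently on the remaining samples.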
3. Submission Requirements
• Students should submit their prediction results to the Kaggle competition online and get at least
one valid score on the public leaderboard.
• In addition, students should submit a project report and a zip file of all their code on
eLearning. In the report, you should describe your data processing, prediction model, result
analysis, and any interesting findings from your experiments.
Please also include your Kaggle team name and the GitHub link to your scripts in the project report.
