Final Project

CBIO (CSCI) 4835/6835: Introduction to Computational Biology

This is an opportunity for you to combine a problem in biology you find interesting with some of the concepts you've learned in CBIO 4835/6835.

Put another way, it's an opportunity for you to inflict pain and suffering on your instructor!

Ideally you will come up with something related to your own research that includes real data, but if you need project ideas I can provide them. You may use any Python packages as long as they can be installed with a package manager (pip or conda).

You are required to provide:

  • A write-up, in a common document format (Word, HTML, LaTeX, Jupyter notebook), with sufficient detail for a student to understand and implement the assignment.
  • The Python code for the solution. The solution should require 20-100 lines of code.
  • At least three example inputs, including the command line needed to produce the output for each. Make sure the data may be publicly released, and include any appropriate acknowledgements/citations.
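
To give a rough sense of the expected scale, here is a minimal sketch of a command-line solution script. Everything in it is a hypothetical illustration rather than a required template: the task (GC content of FASTA records), the file name gc_content.py, and the function names read_fasta, gc_content, report, and main are all assumptions made for this example.

```python
"""Hypothetical project script: report the GC content of each FASTA record."""
import argparse
import sys


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def gc_content(sequence):
    """Return the fraction of G and C bases in a sequence (0.0 if it is empty)."""
    if not sequence:
        return 0.0
    sequence = sequence.upper()
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


def report(lines):
    """Print one tab-separated line per record: header, then GC fraction."""
    for header, sequence in read_fasta(lines):
        print(f"{header}\t{gc_content(sequence):.3f}")


def main(argv=None):
    parser = argparse.ArgumentParser(description="Report GC content per FASTA record.")
    # The input path is optional; without it the script reads standard input.
    parser.add_argument("fasta", nargs="?", help="path to an input FASTA file")
    args = parser.parse_args(argv)
    if args.fasta:
        with open(args.fasta) as handle:
            report(handle)
    else:
        report(sys.stdin)


if __name__ == "__main__":
    main()
```

Under these assumptions, each example input would be listed with its command line, for instance `python gc_content.py example1.fasta`, where example1.fasta stands in for one of your own data files.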

Some General Guidelines

You are allowed to work either individually or in pairs. Be aware that if you work with a partner, you will both receive the same grade and your team will be expected to accomplish more than if you were working alone.

The amount of work should be equivalent to about 1 to 1.5 homework assignments; for a team of 2, expect roughly double that. That said, don't overextend yourselves!

You may not 'recycle' papers or reports that you wrote for some other class or as part of a research project.

A project is required for students in 6835, but optional for students in 4835. Anyone who does the project is exempt from Assignment 6, and students in 4835 can do the project for extra credit!

There are three components to the final project; they are as follows.

1: Proposal

Due Friday, April 7 at 11:59pm

Proposals should be no more than 1 page (not including references) and should contain the following:

  • Background. Introduction that describes the biological context of your question and any previous work.
  • Goal. A succinct statement of the question that you will address and how you will use computational tools to address it.
  • Approach. Description of what work you will perform and what tools you will use. Be as specific as possible here. Think in terms of the figures that you might obtain as results.
  • Possible pitfalls. Describe problems that you anticipate and possible strategies for overcoming them. For example, what if the simulation tool you are using doesn't provide facility for an analysis you need to perform? How will you determine if your results are "correct"?
  • References. Provide references to all papers that supply the background, models, or data that you are using for your project.

2: Presentation

Due Tuesday, April 25 (last day of lecture)

The presentation should be about 15 minutes (followed by 3-5 minutes of questions) per project.

It should accurately summarize the problem and its background as put forth in your proposal, as well as demonstrate the progress you have made.

Your project doesn't have to be 100% complete. But DO NOT misconstrue this--it should be AT LEAST 80% finished. Maybe you're waiting on one more experiment, but by far the bulk of the work should be finished at this point.

3: Write-up

Due Tuesday, May 2 at 11:59pm

This is the write-up alluded to at the beginning. It can take any common format you'd like (e.g. Word, Jupyter notebook) but should be understandable to fellow students.

You don't have to follow this structure exactly, but a fail-safe organizational strategy would involve:

  • Introduction: Here is where you will establish what questions you are asking and why they are important. You will state concretely the objectives of your work. You will also provide sufficient background that a general scientific reader can understand what you are doing and why.
  • Related Work: You will describe briefly and give relevant citations to other work that addresses the specific systems and mechanisms you are investigating. If you are building on an existing model, you should cite that model and give a brief description here, or possibly in the Introduction.
  • Methods: You will describe the methods, code, and software that you used to develop the model and simulation results you are presenting.
  • Results: Here you will describe the results of your simulations and other analyses. If you developed a new model you can also describe the content of that model in the first section of Results (though you would still describe the code and software in Methods). For example, if you developed an ODE model using SciPy, you would briefly mention and cite SciPy and ODEs in the Methods section, but present the actual model--species and reactions--in the Results section.
  • Discussion: Discuss and interpret your findings, paying particular attention to any results that you think may be incorrect with an explanation of what you think might have gone wrong and how you would fix it going forward.
  • Conclusions and Future Work: Summarize and state whether you did or did not achieve the initial goals of the project.
  • References: Be sure to include the complete list of authors and the title of each paper.

You should also include any code or data; if you want to store your code on GitHub (where all the materials for this course are hosted), please let me know and I will create a repository for you.

Some [potential] Project Titles

  • Agent-based Modeling of the Zombie Apocalypse
  • Multiple sequence alignment for predicting structure from sequence
  • A Network Approach to Global Scale Epidemic Spreading - Ebola Case Study
  • Analysis of the kinetics of Influenza A infections in humans via mathematical modeling
  • Automated cell counting and tracking
  • Simulating evolution of yeast pheromone signaling pathway
  • Modeling of rapid activation of an apoptosis pathway in response to DNA damage
  • Pharmacological analysis of a dynamic apoptosis-autophagy model

Some Starting Points

  • Nature, Science, PLOS Biology
  • Cell
  • Molecular Systems Biology, Science Signaling, Nature Systems Biology
  • PLOS Computational Biology
  • Biophysical Journal
  • Interface
  • Molecular Biosystems
  • BMC Systems Biology
  • PLOS One
  • arxiv.org/q-bio (A lot of interesting work appears here in preprint form)

Questions?

Post in the Slack #questions channel!