Syllabus

Course Description

Teaches important concepts and skills for statistical software development using case studies. After this course, students will have an understanding of the process of statistical software development, knowledge of existing resources for software development, and the ability to produce reliable and efficient statistical software

Prerequisites - strictly enforced

Instructor

Dr. Naim Rashid
Associate Professor
Department of Biostatistics


Office: 20-020 Lineberger
email: naim[at]unc.edu
phone: 919-966-8150
web: https://naimurashid.github.io/

Graders

Yiran Song (yrsong [at] email.unc.edu)
Xinyu Zhang (xinyuz16 [at] unc.edu)

Office Hours

Dr. Rashid’s office hours: (just see me after class)
Grader office hours: TBD

Course Website and Schedule

https://biodatascience.github.io/statcomp/

Class Days, Times, Location

1:25-3:10 MW, MCG 1305

Course Overview

This class teaches important concepts and skills for statistical computing, numerical optimization, and machine learning using case studies. After this course, students will have a good understanding of the process of producing high-quality and sharable statistical programs (module 1, 3 weeks), algorithms for optimization and numerical integration (module 2, 6 weeks), and will be able to implement and apply some common and powerful machine learning methods (module 3, 3 weeks). Modules 1 and 3 were originally developed by Dr. Michael Love.

Course Format

The course format will consist of bi-weekly lectures with optional readings (specified in the the lecture notes). Lectures are supplemented with in-class exercises, group activities, weekly homework, and projects.

Required Readings

There are no required texts for this course. Any texts referenced in class lecture are either published online as open-access material or are available as free E-books through UNC libraries.

Course-at-a-glance

The instructor reserves the right to make changes to the syllabus, including topics, readings, assignments, and due dates. In particular, availability of guest speakers may alter the course schedule. Any changes will be announced as early as possible. For the most recent session-by- session schedule and assignments, please see the course schedule at the following link: https://biodatascience.github.io/statcomp/

A general overview of topics is provided below.

Module 1 – Statistical Computing for Scientific Research (3 weeks)

  • Writing readable and efficient R code
  • Building, documenting, testing, and submitting an R package
  • Rcpp for writing C++ code called from R
  • Techniques for handling large datasets

Module 2 – Optimization and Numerical Integration (6 weeks)

  • General Optimization (1.5 weeks)
    • Newton Raphson, fisher scoring, IRLS
    • BFGS, Nelder-Mead, gradient descent
  • EM Algorithm (1 week)
    • EM, ECM
    • Other variants: SEM, MCEM, Rejection Controlled EM
  • General Numerical integration (1 week)
    • Gaussian Quadrature
    • Monte Carlo Integration
    • Importance Sampling
  • General Markov Chain Monte Carlo methods (1.5 week)
    • Direct and rejection sampling
    • Gibbs sampling
    • Metropolis Hastings
  • Advanced MCMC (1 week)
    • Metropolis within Gibbs + adaptive variants
    • Hamiltonian MCMC
    • Scaling up: Variational Inference

Module 3 - Machine Learning Topics (3 weeks)

  • Machine learning essentials
    • training and test sets
    • feature selection
    • parameter tuning
    • cross-validation
  • Introductions to:
    • Support Vector Machines (SVM)
    • Random Forests (RF)
    • Neural Networks (NN)

Course Assignments and Assessments

The class will be taught through three modules. One final project (30%), an initial proposal for the final project (10%), a final presentation (20%), class participation (10%), and weekly homework assignments (30%) will make up the final grade for the course. The initial proposal (due March 10th) can be resubmitted for a regrade as many times as desired up until March 31st.

At the beginning of the course, students will learn to create an R package which they will update as the class progresses, implementing the methods they learn in each module. This R package will be applied to each project and homework assignment. Each student’s R package will be hosted and iteratively updated on GitHub. Homework and projects will be similarly submitted to course instructors through GitHub Classrooms.

Each student will also be assigned a Virtual Machine (VM) that contains all necessary software to run the course notes, examples, homework assignments, and class projects. Students are also encouraged to install required course software locally on their own computers, where the VM provides an alternative if installation issues arise. Instructions for how to log into the VM will be sent out the first week of classes. The VM instances are hosted on bios department servers, and any issues should be forwarded to bios IT.

Course Grading Scale:

  • H: Clear excellence
  • P: Entirely satisfactory
  • L: Low passing
  • F: Fail

The School of Public Health grading system is designed so that the mode of the grading distribution is P. The last graded assignment will be due on the last week of regular classes.

Homework policy - students are allowed and encouraged to talk about ideas and approaches to homework problems in groups, though students should write up the code for the assignments independently. Copying someone else’s work is always an honor code violation. Expulsion from the university is possible if the honor code is violated, and receiving 0% on the assignment in question is a certainty. Late work will be penalized at 10% per week late.

Learning objectives

  • How to write efficient and sharable statistical programs for scientific research
  • Various optimization and numerical integration algorithms
  • How to implement and use common and powerful machine learning methods

Expectations, Policies, and Procedures

Accessibility at UNC Chapel Hill

UNC-CH facilitates the implementation of reasonable accommodations, including resources and services, for students with disabilities, chronic medical conditions, a temporary disability, or pregnancy complications resulting in barriers to fully accessing University courses, programs and activities. Accommodations are determined through the UNC Office of Accessibility Resources & Services (ARS), https://ars.unc.edu/; phone 919-962-8300; email . Students must document/register their need for accommodations with ARS before accommodations can be implemented.

Attendance and Participation

Your attendance and active participation are an integral part of your learning experience in this course. If you are unavoidably absent, please notify the course instructor (and Teaching Assistant if one is assigned).

Community Standards in Our Course and Mask Use.

UNC-Chapel Hill is committed to the well-being of our community – not just physically, but emotionally. The indoor mask requirement was lifted for most of campus on March 7, 2022. If you feel more comfortable wearing a mask, you are free to do so. There are many reasons why a person may decide to continue to wear a mask, and we respect that choice. For additional information, see Carolina Together.

Counseling and Psychological Services at UNC Chapel Hill

CAPS is strongly committed to addressing the mental health needs of a diverse student body through timely access to consultation and connection to clinically appropriate services, whether for short or long-term needs. Go to their website: https://caps.unc.edu, call them at 919-966-3658, or visit their facilities on the third floor of the Campus Health Services building for a walk-in evaluation to learn more.

Use of Generative AI

Instructions: A statement regarding the use of AI is required on all syllabi. Below the required language, you may select from the following examples (provided for your convenience) or create your own. Use of AI is at the discretion of the instructor and may even vary across assignments. Instructors have the responsibility to clearly communicate expectations to students. You can refer to the UNC Generative AI Syllabus Guidelines to help you customize a statement.

Generative artificial intelligence (AI) tools (e.g., ChatGPT) that generate text, images, and media, could aid brainstorming, research, and content creation, and may be useful in public health practice. However, these tools must be used ethically, transparently, and with the understanding of their limitations including circumstances when AI use hinders rather promotes learning.

In this course, Gen AI cannot be:

  • Used as a replacement for doing the assigned course readings
  • Used solely for the output for completing mathematical computations
  • Used solely for the output for submitting written work
  • Used for cheating or to gain unfair advantages

If you have any questions, please contact me. I reserve the right to submit written assignments to AI detection programs (e.g., iThenticate). Suspected violations will be reported to the University Honor Court.

Unless I provide other guidelines for an assignment or exam, you should follow these guidelines: Students are expected to follow the policy provided by the UNC Generative AI committee. Briefly, this policy allows the use of AI for a variety of tasks (including topic selection, brainstorming, research, source valuation, outlining, drafting, media creation, revising, and polishing). Importantly, when you use Gen AI in your work, you must document it as specified at the website above. This policy is at your instructor’s discretion and may be modified with written notice for specific tests and assignments.

Course Communication Expectations

Students must maintain course communications (e.g., email, course announcements, course discussions, etc.) with their peers and instructor(s) to be successful in this course. You are expected to check, read, and respond when necessary to your course communications regularly (i.e., at least two times during the business week). Not reading email is an unacceptable excuse for missing course communications.

Student well-being is my/our primary concern. If I/we send you a communication that warrants a response and do not hear back from a you after following up twice, I/we will submit a Gillings School Graduate Student Early Alert Referral to Academic Coordinator Form. To ensure you have the support needed to be successful in this program, your academic coordinator, faculty mentor, assistant dean of master’s degree programs, associate dean for student affairs, and/or dean of students may get involved if non-responsiveness becomes a significant concern.

All UNC affiliates (including students, faculty, and staff) must use their University email account to conduct UNC business. Use of personal email addresses, including auto-forwarding to external/personal accounts, is not allowed for conducting University business. For more information, see the Individual Email Address Policy.

Honor Code

As a student at UNC Chapel Hill, you are bound by the university’s Honor Code, through which UNC maintains standards of academic excellence and community values. It is your responsibility to learn about and abide by the code. All written assignments or presentations (including team projects) should be completed in a manner that demonstrates academic integrity and excellence. Work should be completed in your own words, but your ideas should be supported with well-cited evidence and theory. If you have any questions about your rights and responsibilities, please consult the Office of Student Conduct (https://studentconduct.unc.edu/) or review the following resources: * Honor System https://studentconduct.unc.edu/honor-system * Honor system module https://studentconduct.unc.edu/students/honor-system-module * UNC Library’s plagiarism tutorial https://guides.lib.unc.edu/plagiarism * UNC Writing Center’s handout on plagiarism https://writingcenter.unc.edu/tips-and-tools/plagiarism/

Inclusive Excellence

We are committed to expanding diversity and inclusiveness across the School — among faculty, staff, students, on advisory groups, and in our curricula, leadership, policies and practices. We measure diversity and inclusion not only in numbers, but also by the extent to which students, alumni, faculty, and staff members perceive the School’s environment as welcoming, valuing all individuals, and supporting their development.

For more information about how we are practicing inclusive excellence at the Gillings School, visit our Diversity and Inclusion webpages: * Diversity and Inclusion: https://sph.unc.edu/resource-pages/diversity/ * Minority Health Conference: http://minorityhealth.web.unc.edu/ * National Health Equity Research Webcast: https://sph.unc.edu/mhp/nat-health-equity-research-webcast/

Land Acknowledgement

Please read The Gillings School’s Land Acknowledgement.

Student Feedback and Equity Concerns

The Gillings School has in place a mechanism for students to provide feedback, including specifically equity concerns and bias-related issues. You can use this form to describe feedback, both positive and negative, about anything including issues related to your experience as a student at Gillings, administrative processes, and classroom activities. This form will also allow you to specifically describe incidents in which racial or other equity-related bias, or microaggressions, occurred. You may submit this form anonymously. However, for us to follow up and provide the necessary support, we encourage you to include your contact information. For further information, please visit the Student Feedback and Equity Concerns FAQ. Please note that this form does not take the place of any University process or policy. If you would like to report an incident under the University’s policy on Prohibited Discrimination, Harassment, and Related Misconduct including Sexual and Gender Based Harassment, Sexual Violence, Interpersonal Violence, and Stalking, please visit Safe At UNC or the Equal Opportunity and Compliance Office (EOC) for additional information, including resources, contact, and reporting options.

Technical support

The best way to help prevent technical issues from causing problems for assignments is to submit them at least 24-36 hours before the due date and time. Your instructor cannot resolve technical issues, but it’s important to notify them if you are experiencing issues. If you have problems submitting an assignment or taking a quiz in Sakai, immediately do the following: 1. Contact the UNC Information Technology Services (ITS) department or Bios IT with the time you attempted to do your course action and what the course action was. 2. Email your instructor with the information you sent to ITS/Bios IT and what time you sent the information.

The ITS department provides technical support 24-hours per day, seven days per week. If you need computer help, please contact the ITS Help Desk by phone at +1-919-962-HELP (4357), or by online help request, or by UNC Live Chat.

Title IX at UNC Chapel Hill

Any student who is impacted by discrimination, harassment, interpersonal (relationship) violence, sexual violence, sexual exploitations, or stalking is encouraged to seek resources on campus or in the community. Please contact the Director of Title IX Compliance / Title IX Coordinator (Adrienne Allison), Report and Response Coordinators in the Equal Opportunity and Compliance Office, Counseling and Psychological Services (confidential), or the Gender Violence Services Coordinators (confidential) to discuss your specific needs. Additional resources are available at the “Safe at UNC” website.


This page was last updated on 01/10/2024.