Introduction to Data Science
This is a backup copy of the 2024 class - it is only for reference for DATA 201 students
Office Location:
Duke Hall #209
Phone: 909-748-8630
E-Mail: joanna_bieri@redlands.edu
(Email or Teams are my preferred contact methods)
Data Science Lab:
Wednesday 6-7:30pm Duke 200
Office Hours:
Click here for my schedule.
You can also email me for an appointment!
-
Link to our Canvas - for submitting work and checking grades:
-
Important Course Documents
Course Syllabus
Schedule of Topics - Updated 10/9/24
NOTE: as the semester progresses we may change up the schedule a bit to suit our class pace and interests. The most recent schedule will be posted here.
Final Projects
Cheat Sheet of Commands 1
Cheat Sheet of Commands 2
-
Daily Assignments - Reading - Handouts
-
Day 1 - Tuesday - 9/3 - Click Here
The overall goal of this class is to engage in curious exploration. Please be patient with yourself and others! Along the way you will learn how to make pretty graphs, some basics of programming in Python, and how to think like a Data Scientist.
PRE-CLASS: Each class day will have content that you should explore before class. I build my lectures, labs, and discussions assuming that you have done the work before class. Each day you will submit what you have completed before class. It does not have to be perfect, but it must show your attempt to engage with the material.
CLASS TIME:
** If you are one to leave tabs open, remember to occasionally refresh the page!
Each class will have daily in-class content.
ANNOUNCEMENTS:
Notes - What is Data Science?
Slides - What is Data Science?
Notes (how to) - Set up your computer (important that this is done this week)
GitHub - Fork the Day1 files to your GitHub and then clone them to your local machine.
Start Prepping for Day 2 - click on the Day 2 link and complete the PRE-CLASS materials.
-
Day 2 - Thursday - 9/5 - Click Here
PRE-CLASS: Video: How to be Successful - This video walks you through being a successful student in this class.
CLASS TIME:
Notes - Hello World!
Video: Hello World!
GitHub Link - Day2 files
Take the quiz:
Section 1
Section 2
** If you are one to leave tabs open, remember to occasionally refresh the page!
Slides - Hello World - Our first data science exploration!
ANNOUNCEMENTS:
Time to work on homework
Get help with technology
Start Prepping for Day 3 - click on the Day 3 link and complete the PRE-CLASS materials.
-
Day 3 - Tuesday - 9/10 - Click Here
PRE-CLASS: Notes - Data and Visualization
CLASS TIME:
Video - Data and Visualization
GitHub Link - Day3 files
Take the quiz:
Section 1
Section 2
Slides - Data and Visualization
ANNOUNCEMENTS:
Homework questions
Extra in class Plots!
Get help with technology
Start Prepping for Day 4 - click on the Day 4 link and complete the PRE-CLASS materials.
-
Day 4 - Thursday - 9/12
PRE-CLASS: Notes - Numerical and Categorical Data
CLASS TIME:
Video - Numerical and Categorical Data
GitHub Link - Day4 files
Take the quiz:
Section 1
Section 2
Slides - Numerical and Categorical Data
ANNOUNCEMENTS:
Homework questions
Extra in class Plots!
No Class next week
Catch up on Class Work and Independent exploration time:
Once you are 100% caught up in the class, you should choose a book (or a series of articles) to read about issues, ethics, or impacts of data science (see my examples in class or search for your own sources online). Later in the class we will have discussions about ethics in data science, so you will have a head start!
-
Day 5 - Tuesday - 9/24
PRE-CLASS: Notes - Data Wrangling
CLASS TIME:
Video - Data Wrangling
GitHub Link - Day5 files
Take the quiz:
Section 1
Section 2
Slides - Data Wrangling
ANNOUNCEMENTS:
Homework questions
In class - Data Wrangling Code Session.
Start Prepping for Day 6 - click on the Day 6 link and complete the PRE-CLASS materials.
-
Day 6 - Thursday - 9/26
PRE-CLASS: Notes - Data Wrangling Continued
CLASS TIME:
Video - Data Wrangling Continued
GitHub Link - Day6 files
Take the quiz:
Section 1 (1:15pm)
Section 2 (2:40pm)
Slides - Data Wrangling Continued
ANNOUNCEMENTS:
Homework questions
In class - Work on the Exploration Together.
Start Prepping for Day 7 - click on the Day 7 link and complete the PRE-CLASS materials.
-
Day 7 - Tuesday - 10/1
PRE-CLASS: Notes - Data Wrangling Joins and Merges
CLASS TIME:
Video - Data Wrangling Joins and Merges
GitHub Link - Day7 files
Take the quiz:
Section 1 (1:15pm)
Section 2 (2:40pm)
Slides - Joins and Merges
ANNOUNCEMENTS:
Homework questions
Start working on the Practice Exam.
Start Prepping for Day 8 - click on the Day 8 link and complete the PRE-CLASS materials.
-
Day 8 - Thursday - 10/3
PRE-CLASS: Notes - Reading in Data and Data Types - Fixing Data Errors
CLASS TIME:
Video - Reading in Data and Data Types - Fixing Data Errors
GitHub Link - Day8 files
Take the quiz:
Section 1 (1:15pm)
Section 2 (2:40pm)
Slides - Reading in Data and Data Types
ANNOUNCEMENTS:
Homework questions
Talk about the Final Projects
Talk about Exam 1 - Cheat Sheet of Commands
Fall Break - No Class Tuesday 10/8
Exam 1 - you must work on the exam and submit it before class on Thursday 10/10 - click on the Exam 1 link for more information.
-
Exam 1 - Thursday - 10/10
PRE-CLASS: Work on Exam 1 after it becomes available on Canvas
CLASS TIME:
- This exam is open notes, open computer, etc. --- but you cannot use any type of human or AI intervention. This includes asking how to do something online, asking ChatGPT, or working with friends.
- The pre-class work on the exam must be COMPLETELY YOUR OWN!
Submit the Exam - First Draft
Exam 1 - Section 1 (1:15pm)
Exam 1 - Section 2 (2:40pm)
After you submit the exam, you will be able to work in groups on the exam during our class time. You can then go home and work individually on finishing up your best possible work on the exam. Submit your final version of the exam as an additional attempt.
Work on Exam 1 in Groups
ANNOUNCEMENTS: Start Prepping for Day 9 - click on the Day 9 link and complete the PRE-CLASS materials.
-
Day 9 - Tuesday - 10/15
PRE-CLASS: Notes - Importing, Re-coding, and Visualizing Data
CLASS TIME:
Video - Importing, Re-coding, and Visualizing Data
GitHub Link - Day9 files
Take the quiz:
Section 1 (1:15pm)
Section 2 (2:40pm)
Slides - Importing, Re-coding, and Visualizing Data
ANNOUNCEMENTS:
Homework questions
Start Prepping for Day 10 - click on the Day 10 link and complete the PRE-CLASS materials.
Get your Data Science Ethics Reading Materials!
-
Day 10 - Thursday - 10/17
PRE-CLASS: Notes - Effective Visualization and Data Storytelling
CLASS TIME:
Video - Effective Visualization and Data Storytelling
GitHub Link - Day10 files
Take the quiz:
Section 1 (1:15pm)
Section 2 (2:40pm)
Slides - Effective Visualization and Data Storytelling
ANNOUNCEMENTS:
Homework questions
Start Prepping for Day 11 - click on the Day 11 link and complete the PRE-CLASS materials.
Get your Data Science Ethics Reading Materials!
-
Day 11 - Tuesday - 10/22
PRE-CLASS: Notes - Getting Data and Simpson's Paradox
CLASS TIME:
Video - Getting Data and Simpson's Paradox
GitHub Link - Day11 files
Take the quiz:
Section 1 (1:15pm)
Section 2 (2:40pm)
Slides - Getting Data and Simpson's Paradox
ANNOUNCEMENTS:
Homework questions
Start Prepping for Day 12 - click on the Day 12 link and complete the PRE-CLASS materials.
You should be reading your Data Ethics Book!
-
Day 12 - Thursday - 10/24
PRE-CLASS: Helpful lecture and video for preparing your final project proposal (due 10/31)
CLASS TIME:
Notes - Doing Data Science
Video - Doing Data Science
Day 12 - Web Scraping!
Notes - Web Scraping
Video - Web Scraping
GitHub Link - Day12 files
Take the quiz:
Section 1 (1:15pm)
Section 2 (2:40pm)
Slides - Web Scraping
ANNOUNCEMENTS:
Homework questions
Start Prepping for Day 13 - click on the Day 13 link and complete the PRE-CLASS materials.
Come to class next week prepared to discuss your reading with your classmates and professor!
LOTS OF POSSIBLE ARTICLES - DATA ETHICS
The Ethics of Managing People's Data
Big data ethics and 10 controversial experiments
Introduction to Data Ethics
The Importance of Ethical Data Collection
Ethics of Artificial Intelligence
Is Ethical AI even possible
Dealing with bias in AI
Who is making sure the AI machines aren't racist?
Algorithmic Bias in Healthcare
Health Care Bias Is Dangerous. But So Are ‘Fairness’ Algorithms
Shedding light on AI bias with real world examples
Ethics In The Age Of Data: Navigating The Crossroads Of Privacy And Progress
The Ethics of Data Ownership: Who Owns Your Digital Identity?
Conducting Research with Tribal Communities: Sovereignty, Ethics, and Data-Sharing Issues
Ethical Research with Indigenous Peoples: Doing Right by Respecting Native Rights
Emancipatory Data Science: A Liberatory Framework for Mitigating Data Harms and Fostering Social Transformation
-
Day 13 - Tuesday - 10/29
PRE-CLASS: Notes - Data Ethics - Misrepresentation and Data Privacy
CLASS TIME:
Video - Data Ethics - Misrepresentation and Data Privacy
GitHub Link - Day13 files
Take the quiz:
Section 1 (1:15pm)
Section 2 (2:40pm)
Slides - Data Ethics - Misrepresentation and Data Privacy
ANNOUNCEMENTS:
In Class Discussion
Start Prepping for Day 14 - click on the Day 14 link and complete the PRE-CLASS materials.
You should be reading/finishing your Data Ethics Book!
Exam 2 will cover:
- Loading data and doing basic analysis - shape, describe(), columns, looking for weird data.
- The data in this exam will include NaNs.
- Masking data to find specific information.
- Creating new columns in the data frame - doing a calculation using other columns.
- Using groupby(), value_counts(), and sort_values().
- Merging two data frames into one data frame.
- Visualizing data, both recreating a given plot and coming up with a plot of your own.
- Communicating in written form what the results, data frames (tables), and visualizations (plots) mean in terms of the data.
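As a quick refresher on the commands in this list, here is a minimal sketch. It assumes pandas, and the tiny pets data set, its column names, and the sounds table are all made up for illustration (they are not the exam data). It skips the plotting and writing items.

```python
import pandas as pd

# Hypothetical toy data (not from the course): one weight is missing on purpose.
pets = pd.DataFrame({
    "name": ["Ada", "Bo", "Cy", "Di", "Ed"],
    "species": ["cat", "dog", "dog", "cat", "dog"],
    "weight_kg": [4.0, 20.0, None, 5.0, 30.0],  # None is stored as NaN
    "age_years": [2, 5, 3, 10, 6],
})

# Basic analysis: size, summary statistics, column names, and missing values.
n_rows, n_cols = pets.shape
summary = pets.describe()
column_names = list(pets.columns)
n_missing = pets["weight_kg"].isna().sum()

# Masking: keep only the rows where a condition is True.
dogs = pets[pets["species"] == "dog"]

# New column: a calculation that uses other columns (NaN stays NaN).
pets["kg_per_year"] = pets["weight_kg"] / pets["age_years"]

# groupby(), value_counts(), and sort_values().
mean_weight = pets.groupby("species")["weight_kg"].mean()  # NaNs are skipped
species_counts = pets["species"].value_counts()
oldest_first = pets.sort_values("age_years", ascending=False)

# Merging two data frames on a shared column.
sounds = pd.DataFrame({"species": ["cat", "dog"], "sound": ["meow", "woof"]})
merged = pets.merge(sounds, on="species", how="left")

print(summary)
print(mean_weight)
print(merged)
```

The left merge keeps every row of pets and attaches the matching sound, which is usually what you want when one table is the main data and the other is a lookup table.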
-
Day 14 - Thursday - 10/31
PRE-CLASS: Notes - Data Ethics - Algorithmic Bias
CLASS TIME:
Video - Data Ethics - Algorithmic Bias
GitHub Link - Day14 files
Take the quiz:
Section 1 (1:15pm)
Section 2 (2:40pm)
Slides - Data Ethics - Algorithmic Bias
ANNOUNCEMENTS:
In class Discussion
Work on Exam 2 - You should submit your work before you come to class, where we will work on the exam in groups.
My solutions to Exam 1 are now posted on Canvas - you can see what kind of writing and explanation I expect
Start writing your Final Project Proposal - DUE THURSDAY 11/7/24 (updated to Tuesday 11/12/24)
-
Exam 2 - Thursday - 11/07
PRE-CLASS: Work on Exam 2 after it becomes available on Canvas
CLASS TIME:
- This exam is open notes, open computer, etc --- but you cannot use any type of human or AI intervention. This includes asking how to do something online, asking ChatGPT, or working with friends.
- The pre-class work on the exam must be COMPLETELY YOUR OWN!
Submit the Exam - First Draft
Exam 2 - Section 1 (1:15pm)
Exam 2 - Section 2 (2:40pm)
After you submit the exam, you will be able to work in groups on the exam during our class time. You can then go home and work individually on finishing up your best possible work on the exam. Submit your final version of the exam as an additional attempt.
Cheat Sheet of Commands 1
Cheat Sheet of Commands 2
Video - Common Errors in Python
Video - Homework Day 11 - Solutions - live programming
Work on Exam 2 in Groups
ANNOUNCEMENTS: Start Prepping for Day 15 - click on the Day 15 link and complete the PRE-CLASS materials.
Finish writing your Final Project Proposal - DUE TUESDAY 11/12/24 - Here is info about the proposals: Final Projects
-
Day 15 - Thursday - 11/14
PRE-CLASS: Notes - Introduction to Modeling and Algorithms
CLASS TIME:
Video - Modeling with Plotly Trendlines
Video - Modeling with Linear Regression
GitHub Link - Day15 files
Take the quiz:
Section 1 (1:15pm)
Section 2 (2:40pm)
Slides - Introduction to Modeling and Algorithms
ANNOUNCEMENTS:
Work on the assignment in class
Start Prepping for Day 16 - click on the Day 16 link and complete the PRE-CLASS materials.
-
Day 16 - Tuesday - 11/19
PRE-CLASS: Notes - Modeling Nonlinear Relationships
CLASS TIME:
Video - Modeling Nonlinear Relationships
GitHub Link - Day16 files
Take the quiz:
Section 1 (1:15pm)
Section 2 (2:40pm)
Slides - Modeling Nonlinear Relationships
ANNOUNCEMENTS:
Work on the assignment in class
Start Prepping for Day 17 - click on the Day 17 link and complete the PRE-CLASS materials.
Linear Regression Kaggle Search - probably good data for linear regressions
-
Day 17 - Thursday - 11/21
PRE-CLASS: Notes - Modeling with Multiple Predictors
CLASS TIME:
Video - Modeling with Multiple Predictors
GitHub Link - Day17 files
Take the quiz:
Section 1 (1:15pm)
Section 2 (2:40pm)
Slides - Modeling with Multiple Predictors
ANNOUNCEMENTS:
Work on the assignment in class
Bike Rental Image to Reproduce
Start Prepping for Day 18 - click on the Day 18 link and complete the PRE-CLASS materials.
-
Day 18 - Tuesday - 11/26 - no in-person class meeting
PRE-CLASS: Notes - Classification and Categorical Data
CLASS TIME:
Video - Classification and Categorical Data
GitHub Link - Day18 files
No Quiz
No Class - please watch the video and try the homework. I will answer questions about the homework during our next in-person class.
ANNOUNCEMENTS: You should be working on your Group Final Project. Come to class ready to have really productive group work time!
Each day I will post the lecture videos, homework, reading, and other information. Make sure to check here for each day of class.
Tuesday 9/17 and Thursday 9/19 - no class.
-
Homework Solutions - Exam Review
All Practice Problems and Programming Assignment solutions are available on Canvas
Section 1 - 1:10-2:30pm
Section 2 - 2:40-3:55pm
Link to our GitHub - for getting assignments and version control
Check out Data Science in a Box - Intro to Data Science - using R.
Our course follows much of this content and thus is licensed under Creative Commons Attribution-ShareAlike 4.0 International.