Cyclistic Bike Share

🔗 GitHub

Project Overview

Created a MySQL script to upload and pre-process the data, as well as build several pivot tables to help leadership better understand customer segments.
Created an R programming language script to upload and pre-process the data, build several pivot tables, and create visualizations to better present data to leadership.
Created a Tableau Storyboard to present findings to leadership, including several recommendations.
Wrote a series of Medium articles to futher explain each stage of the data analysis process.

Code and Resources Used

R Version: 4.2.2

MySQL Version: 8.0.31

Data Cleaning

After uploading the data, I performed several pre-processing tasks to prepare it for analysis. I performed the following changes to the data:

Added

ride_length column by subtracting starting time from ending time
day_of_week column by WEEKDAY() function

Changed

Renamed all csv files to follow “Bike__Data_YYYYMM” nomenclature
Renamed “rideable_type” column to “bike_type”
Renamed “start_station_name” column to “start_sta_name”
Renamed “start_station_id” column to “start_sta_id”
Renamed “end_station_name” column to “end_sta_name”
Renamed “end_station_id” column to “end_sta_id”
Renamed “member_casual” column to “user_type” column
Renamed “rideable_type” column to “bike_type” column
“started_at” column’s data type changed to date and time
“ended_at” column’s data type changed to date and time
Replaced “Clybourne Ave” to “Clybourn Ave” in all columns
Applied the CamelCase naming convention to the data set

Removed

All null values in the data set
All trailing, leading, and excess spaces in the data set
All records with a ride length of less than one minute

Exploratory Data Analysis

I created several visualizations to better understand the data’s distribution, value differences, and categorical variables. Below are a few highlights from this phase:

sql_table_12 r_table_4 r_table_7 r_table_5

🔗 GitHub#

Project Overview#

Code and Resources Used#

Data Cleaning#

Added#

Changed#

Removed#

Exploratory Data Analysis#