Name:

Description:

A simulated data set containing information on ten thousand customers. The aim here is to predict which customers will default on their credit card debt.

Variables:

A data frame with 10000 observations on the following 12 variables.

ID

Identification

Income

Income in $10,000's

Limit

Credit limit

Rating

Credit rating

Cards

Number of credit cards

Age

Age in years

Education

Number of years of education

Gender

A factor with levels Male and Female

Student

A factor with levels No and Yes indicating whether the individual was a student

Married

A factor with levels No and Yes indicating whether the individual was married

Ethnicity

A factor with levels African American, Asian, and Caucasian indicating the individual's ethnicity

Balance

Average credit card balance in $.
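The four categorical variables above are described as factors, but a CSV import may deliver them as plain character columns. The following is a minimal sketch of how they could be converted, assuming the column names match the variable names listed above. The helper name as_credit_factors and the two toy rows are hypothetical and are not taken from the data set.

```r
# Documented levels for the four categorical variables (from the list above).
credit_levels <- list(
  Gender    = c("Male", "Female"),
  Student   = c("No", "Yes"),
  Married   = c("No", "Yes"),
  Ethnicity = c("African American", "Asian", "Caucasian")
)

# Convert the categorical columns of a data frame `d` to factors.
# Assumes the column names match the documented variable names.
as_credit_factors <- function(d) {
  for (col in names(credit_levels)) {
    if (col %in% names(d)) {
      # trimws() guards against stray whitespace around values; any value
      # that still does not match a documented level becomes NA.
      d[[col]] <- factor(trimws(as.character(d[[col]])),
                         levels = credit_levels[[col]])
    }
  }
  d
}

# Two made-up rows, for illustration only (not real observations).
toy <- data.frame(
  Gender    = c(" Male", "Female"),
  Student   = c("No", "Yes"),
  Married   = c("Yes", "No"),
  Ethnicity = c("Asian", "Caucasian"),
  stringsAsFactors = FALSE
)
toy <- as_credit_factors(toy)
str(toy)
```

Fixing the levels explicitly keeps their order stable across imports. The trade-off is that unexpected spellings become NA, so it is worth checking anyNA() on these columns afterwards.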

Link To Google Sheets:

Rows:

Columns:

License Type:

References/Notes/Attributions:

Source

Simulated data, with thanks to Albert Kim for pointing out that this was omitted, and for supplying the data and man documentation page on Oct 19, 2017.

References

James, G., Witten, D., Hastie, T., and Tibshirani, R. (2013). An Introduction to Statistical Learning with Applications in R. Springer-Verlag, New York. www.StatLearning.com

R Dataset Upload:

Use the following R code to directly access this dataset in R.

d <- read.csv("https://www.key2stats.com/Credit_Card_Balance_Data_967_35.csv")
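After loading, it can be useful to confirm that the imported columns line up with the variables documented on this page. The sketch below shows one way to do that. The helper name missing_credit_vars is hypothetical, and it is an assumption that the CSV header uses exactly the documented names. It is demonstrated on a small stand-in data frame rather than the downloaded file.

```r
# Variable names as documented on this page; whether the CSV header matches
# them exactly is an assumption.
expected_vars <- c("ID", "Income", "Limit", "Rating", "Cards", "Age",
                   "Education", "Gender", "Student", "Married",
                   "Ethnicity", "Balance")

# Return the documented variables that are absent from data frame `d`.
missing_credit_vars <- function(d) {
  setdiff(expected_vars, names(d))
}

# Stand-in data frame with only two of the columns, for illustration.
stub <- data.frame(ID = 1:2, Balance = c(0, 0))
missing_credit_vars(stub)
```

Applied to the data frame d from the read.csv call, an empty result (character(0)) would indicate that every documented variable is present. dim(d) and str(d) give a quick look at the number of rows and the column types.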
