Adding and Deleting Columns in a DataFrame

Learning objectives

  1. Add a new column to a DataFrame
  2. Delete a column from a DataFrame
import pandas as pd
# data source: https://www.kaggle.com/hesh97/titanicdataset-traincsv/data
train_data = pd.read_csv('./train.csv')
train_data.head()
PassengerId Survived Pclass Name Sex Age SibSp Parch Ticket Fare Cabin Embarked
0 1 0 3 Braund, Mr. Owen Harris male 22.0 1 0 A/5 21171 7.2500 NaN S
1 2 1 1 Cumings, Mrs. John Bradley (Florence Briggs Th... female 38.0 1 0 PC 17599 71.2833 C85 C
2 3 1 3 Heikkinen, Miss. Laina female 26.0 0 0 STON/O2. 3101282 7.9250 NaN S
3 4 1 1 Futrelle, Mrs. Jacques Heath (Lily May Peel) female 35.0 1 0 113803 53.1000 C123 S
4 5 0 3 Allen, Mr. William Henry male 35.0 0 0 373450 8.0500 NaN S

Adding a new column

  • Add a column with [] assignment (the new column is appended at the end)
  • Add a column at a chosen position with the insert function
train_data['Age_double'] = train_data['Age'] * 2  # creates the Age_double column
train_data.head()
PassengerId Survived Pclass Name Sex Age SibSp Parch Ticket Fare Cabin Embarked Age_double
0 1 0 3 Braund, Mr. Owen Harris male 22.0 1 0 A/5 21171 7.2500 NaN S 44.0
1 2 1 1 Cumings, Mrs. John Bradley (Florence Briggs Th... female 38.0 1 0 PC 17599 71.2833 C85 C 76.0
2 3 1 3 Heikkinen, Miss. Laina female 26.0 0 0 STON/O2. 3101282 7.9250 NaN S 52.0
3 4 1 1 Futrelle, Mrs. Jacques Heath (Lily May Peel) female 35.0 1 0 113803 53.1000 C123 S 70.0
4 5 0 3 Allen, Mr. William Henry male 35.0 0 0 373450 8.0500 NaN S 70.0
train_data['Age_tripple'] = train_data['Age_double'] + train_data['Age']
train_data.head()
PassengerId Survived Pclass Name Sex Age SibSp Parch Ticket Fare Cabin Embarked Age_double Age_tripple
0 1 0 3 Braund, Mr. Owen Harris male 22.0 1 0 A/5 21171 7.2500 NaN S 44.0 66.0
1 2 1 1 Cumings, Mrs. John Bradley (Florence Briggs Th... female 38.0 1 0 PC 17599 71.2833 C85 C 76.0 114.0
2 3 1 3 Heikkinen, Miss. Laina female 26.0 0 0 STON/O2. 3101282 7.9250 NaN S 52.0 78.0
3 4 1 1 Futrelle, Mrs. Jacques Heath (Lily May Peel) female 35.0 1 0 113803 53.1000 C123 S 70.0 105.0
4 5 0 3 Allen, Mr. William Henry male 35.0 0 0 373450 8.0500 NaN S 70.0 105.0
train_data.insert(3, 'Fare10', train_data['Fare'] / 10)  # insert places the Fare10 column at position 3 (0-based)
train_data.head()
PassengerId Survived Pclass Fare10 Name Sex Age SibSp Parch Ticket Fare Cabin Embarked Age_double Age_tripple
0 1 0 3 0.72500 Braund, Mr. Owen Harris male 22.0 1 0 A/5 21171 7.2500 NaN S 44.0 66.0
1 2 1 1 7.12833 Cumings, Mrs. John Bradley (Florence Briggs Th... female 38.0 1 0 PC 17599 71.2833 C85 C 76.0 114.0
2 3 1 3 0.79250 Heikkinen, Miss. Laina female 26.0 0 0 STON/O2. 3101282 7.9250 NaN S 52.0 78.0
3 4 1 1 5.31000 Futrelle, Mrs. Jacques Heath (Lily May Peel) female 35.0 1 0 113803 53.1000 C123 S 70.0 105.0
4 5 0 3 0.80500 Allen, Mr. William Henry male 35.0 0 0 373450 8.0500 NaN S 70.0 105.0
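
The two ways of adding a column shown above can also be tried on a small hand-made DataFrame, where the resulting column order is easy to check. This is a minimal sketch: the `demo` frame and its values are made up for illustration and are not read from train.csv.

```python
import pandas as pd

# Hypothetical toy data (illustration only, not taken from train.csv)
demo = pd.DataFrame({'Age': [22.0, 38.0], 'Fare': [7.25, 71.2833]})

# [] assignment appends the new column at the end
demo['Age_double'] = demo['Age'] * 2

# insert(loc, column, value) places the new column at position loc (0-based)
demo.insert(1, 'Fare10', demo['Fare'] / 10)

column_order = list(demo.columns)
print(column_order)
```

Note that insert modifies the DataFrame directly and returns nothing, so there is no need to assign its result.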

Deleting a column

  • Delete with the drop function
    • Pass a list to delete multiple columns at once
train_data.drop('Age_tripple', axis=1)  # returns a new DataFrame; train_data itself is left unchanged
train_data.head()
PassengerId Survived Pclass Fare10 Name Sex Age SibSp Parch Ticket Fare Cabin Embarked Age_double Age_tripple
0 1 0 3 0.72500 Braund, Mr. Owen Harris male 22.0 1 0 A/5 21171 7.2500 NaN S 44.0 66.0
1 2 1 1 7.12833 Cumings, Mrs. John Bradley (Florence Briggs Th... female 38.0 1 0 PC 17599 71.2833 C85 C 76.0 114.0
2 3 1 3 0.79250 Heikkinen, Miss. Laina female 26.0 0 0 STON/O2. 3101282 7.9250 NaN S 52.0 78.0
3 4 1 1 5.31000 Futrelle, Mrs. Jacques Heath (Lily May Peel) female 35.0 1 0 113803 53.1000 C123 S 70.0 105.0
4 5 0 3 0.80500 Allen, Mr. William Henry male 35.0 0 0 373450 8.0500 NaN S 70.0 105.0
# axis decides where to delete from: axis=0 deletes from the index (rows), axis=1 deletes from the columns.
train_data.drop(['Age_double', 'Age_tripple'], axis=1)
train_data.head()
PassengerId Survived Pclass Fare10 Name Sex Age SibSp Parch Ticket Fare Cabin Embarked Age_double Age_tripple
0 1 0 3 0.72500 Braund, Mr. Owen Harris male 22.0 1 0 A/5 21171 7.2500 NaN S 44.0 66.0
1 2 1 1 7.12833 Cumings, Mrs. John Bradley (Florence Briggs Th... female 38.0 1 0 PC 17599 71.2833 C85 C 76.0 114.0
2 3 1 3 0.79250 Heikkinen, Miss. Laina female 26.0 0 0 STON/O2. 3101282 7.9250 NaN S 52.0 78.0
3 4 1 1 5.31000 Futrelle, Mrs. Jacques Heath (Lily May Peel) female 35.0 1 0 113803 53.1000 C123 S 70.0 105.0
4 5 0 3 0.80500 Allen, Mr. William Henry male 35.0 0 0 373450 8.0500 NaN S 70.0 105.0
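
To make the role of axis concrete, the sketch below drops once along each axis on a small made-up frame. The `demo` data is hypothetical and only serves to show which labels drop looks up in each case.

```python
import pandas as pd

# Hypothetical toy data (illustration only)
demo = pd.DataFrame({'Age': [22.0, 38.0, 26.0], 'Fare': [7.25, 71.2833, 7.925]})

# axis=0: the label is looked up in the index, so the row labelled 0 is removed
rows_removed = demo.drop(0, axis=0)

# axis=1: the label is looked up in the columns, so the Fare column is removed
cols_removed = demo.drop('Fare', axis=1)

print(rows_removed.index.tolist())
print(cols_removed.columns.tolist())
```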
train_data.drop(['Age_double', 'Age_tripple'], axis=1, inplace=True)  # inplace=True modifies train_data itself
train_data
PassengerId Survived Pclass Fare10 Name Sex Age SibSp Parch Ticket Fare Cabin Embarked
0 1 0 3 0.72500 Braund, Mr. Owen Harris male 22.0 1 0 A/5 21171 7.2500 NaN S
1 2 1 1 7.12833 Cumings, Mrs. John Bradley (Florence Briggs Th... female 38.0 1 0 PC 17599 71.2833 C85 C
2 3 1 3 0.79250 Heikkinen, Miss. Laina female 26.0 0 0 STON/O2. 3101282 7.9250 NaN S
3 4 1 1 5.31000 Futrelle, Mrs. Jacques Heath (Lily May Peel) female 35.0 1 0 113803 53.1000 C123 S
4 5 0 3 0.80500 Allen, Mr. William Henry male 35.0 0 0 373450 8.0500 NaN S
... ... ... ... ... ... ... ... ... ... ... ... ... ...
886 887 0 2 1.30000 Montvila, Rev. Juozas male 27.0 0 0 211536 13.0000 NaN S
887 888 1 1 3.00000 Graham, Miss. Margaret Edith female 19.0 0 0 112053 30.0000 B42 S
888 889 0 3 2.34500 Johnston, Miss. Catherine Helen "Carrie" female NaN 1 2 W./C. 6607 23.4500 NaN S
889 890 1 1 3.00000 Behr, Mr. Karl Howell male 26.0 0 0 111369 30.0000 C148 C
890 891 0 3 0.77500 Dooley, Mr. Patrick male 32.0 0 0 370376 7.7500 NaN Q

891 rows × 13 columns
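
The difference between a plain drop and drop with inplace=True can be checked directly, as in the sketch below. The `demo` frame is made-up illustration data, and the `columns=` keyword is shown as an alternative spelling of axis=1 that the examples above do not use.

```python
import pandas as pd

# Hypothetical toy data (illustration only)
demo = pd.DataFrame({
    'Age': [22.0, 38.0],
    'Age_double': [44.0, 76.0],
    'Age_tripple': [66.0, 114.0],
})

# Without inplace=True, drop returns a new DataFrame and leaves demo untouched
dropped = demo.drop(['Age_double', 'Age_tripple'], axis=1)
cols_after_plain_drop = list(demo.columns)

# columns=[...] is equivalent to passing the labels together with axis=1
dropped_kw = demo.drop(columns=['Age_double', 'Age_tripple'])

# With inplace=True, demo itself is modified and the call returns None
result = demo.drop('Age_tripple', axis=1, inplace=True)

print(cols_after_plain_drop)
print(list(demo.columns))
```

Reassigning the returned frame (for example `demo = demo.drop(...)`) is a common alternative to inplace=True and gives the same end result.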