Selecting Data with DataFrame Boolean Selection

Learning objectives

  1. Understand DataFrame boolean selection

import pandas as pd
# Data source: https://www.kaggle.com/hesh97/titanicdataset-traincsv/data
train_data = pd.read_csv('./train.csv')
train_data.head()
   PassengerId  Survived  Pclass  Name                                               Sex     Age   SibSp  Parch  Ticket            Fare     Cabin  Embarked
0  1            0         3       Braund, Mr. Owen Harris                            male    22.0  1      0      A/5 21171         7.2500   NaN    S
1  2            1         1       Cumings, Mrs. John Bradley (Florence Briggs Th...  female  38.0  1      0      PC 17599          71.2833  C85    C
2  3            1         3       Heikkinen, Miss. Laina                             female  26.0  0      0      STON/O2. 3101282  7.9250   NaN    S
3  4            1         1       Futrelle, Mrs. Jacques Heath (Lily May Peel)       female  35.0  1      0      113803            53.1000  C123   S
4  5            0         3       Allen, Mr. William Henry                           male    35.0  0      0      373450            8.0500   NaN    S

Selecting rows with boolean selection

  • As with NumPy boolean indexing, only the rows that satisfy the given condition are selected
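The idea can be sketched on a small made-up DataFrame (the `toy` frame and its values are hypothetical and are not taken from train.csv). Comparing a column with a value gives a boolean Series, and indexing the DataFrame with that Series keeps only the rows where it is True.

```python
import pandas as pd

# Hypothetical toy data for illustration only (not from train.csv)
toy = pd.DataFrame({
    'Pclass': [1, 3, 1, 2],
    'Age': [38.0, 22.0, 54.0, 31.0],
})

# Comparing a column with a value yields a boolean Series, one entry per row
mask = toy['Pclass'] == 1
print(mask.tolist())  # [True, False, True, False]

# Indexing with the mask keeps only the rows where the mask is True
first_class = toy[mask]
print(first_class)
```

The original row labels (0 and 2 here) are preserved in the result, which is why the index values in the outputs of this notebook are not consecutive.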

Selecting passengers in their 30s who traveled in first class

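# True for first-class passengers
class_ = train_data['Pclass'] == 1
# True for ages 30 through 39 (30 <= Age < 40)
age_ = (train_data['Age'] >= 30) & (train_data['Age'] < 40)

# Keep only the rows where both conditions hold
train_data[class_ & age_]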
     PassengerId  Survived  Pclass  Name                                               Sex     Age   SibSp  Parch  Ticket      Fare      Cabin        Embarked
1    2            1         1       Cumings, Mrs. John Bradley (Florence Briggs Th...  female  38.0  1      0      PC 17599    71.2833   C85          C
3    4            1         1       Futrelle, Mrs. Jacques Heath (Lily May Peel)       female  35.0  1      0      113803      53.1000   C123         S
61   62           1         1       Icard, Miss. Amelie                                female  38.0  0      0      113572      80.0000   B28          NaN
137  138          0         1       Futrelle, Mr. Jacques Heath                        male    37.0  1      0      113803      53.1000   C123         S
215  216          1         1       Newell, Miss. Madeleine                            female  31.0  1      0      35273       113.2750  D36          C
218  219          1         1       Bazzani, Miss. Albina                              female  32.0  0      0      11813       76.2917   D15          C
224  225          1         1       Hoyt, Mr. Frederick Maxfield                       male    38.0  1      0      19943       90.0000   C93          S
230  231          1         1       Harris, Mrs. Henry Birkhardt (Irene Wallach)       female  35.0  1      0      36973       83.4750   C83          S
248  249          1         1       Beckwith, Mr. Richard Leonard                      male    37.0  1      1      11751       52.5542   D35          S
257  258          1         1       Cherry, Miss. Gladys                               female  30.0  0      0      110152      86.5000   B77          S
258  259          1         1       Ward, Miss. Anna                                   female  35.0  0      0      PC 17755    512.3292  NaN          C
269  270          1         1       Bissette, Miss. Amelia                             female  35.0  0      0      PC 17760    135.6333  C99          S
273  274          0         1       Natsch, Mr. Charles H                              male    37.0  0      1      PC 17596    29.7000   C118         C
309  310          1         1       Francatelli, Miss. Laura Mabel                     female  30.0  0      0      PC 17485    56.9292   E36          C
318  319          1         1       Wick, Miss. Mary Natalie                           female  31.0  0      2      36928       164.8667  C7           S
325  326          1         1       Young, Miss. Marie Grice                           female  36.0  0      0      PC 17760    135.6333  C32          C
332  333          0         1       Graham, Mr. George Edward                          male    38.0  0      1      PC 17582    153.4625  C91          S
383  384          1         1       Holverson, Mrs. Alexander Oskar (Mary Aline To...  female  35.0  1      0      113789      52.0000   NaN          S
390  391          1         1       Carter, Mr. William Ernest                         male    36.0  1      2      113760      120.0000  B96 B98      S
412  413          1         1       Minahan, Miss. Daisy E                             female  33.0  1      0      19928       90.0000   C78          Q
447  448          1         1       Seward, Mr. Frederic Kimber                        male    34.0  0      0      113794      26.5500   NaN          S
452  453          0         1       Foreman, Mr. Benjamin Laventall                    male    30.0  0      0      113051      27.7500   C111         C
486  487          1         1       Hoyt, Mrs. Frederick Maxfield (Jane Anne Forby)    female  35.0  1      0      19943       90.0000   C93          S
512  513          1         1       McGough, Mr. James Robert                          male    36.0  0      0      PC 17473    26.2875   E25          S
520  521          1         1       Perreault, Miss. Anne                              female  30.0  0      0      12749       93.5000   B73          S
537  538          1         1       LeRoy, Miss. Bertha                                female  30.0  0      0      PC 17761    106.4250  NaN          C
540  541          1         1       Crosby, Miss. Harriet R                            female  36.0  0      2      WE/P 5735   71.0000   B22          S
558  559          1         1       Taussig, Mrs. Emil (Tillie Mandelbaum)             female  39.0  1      1      110413      79.6500   E67          S
572  573          1         1       Flynn, Mr. John Irwin ("Irving")                   male    36.0  0      0      PC 17474    26.3875   E25          S
577  578          1         1       Silvey, Mrs. William Baird (Alice Munger)          female  39.0  1      0      13507       55.9000   E44          S
581  582          1         1       Thayer, Mrs. John Borland (Marian Longstreth M...  female  39.0  1      1      17421       110.8833  C68          C
583  584          0         1       Ross, Mr. John Hugo                                male    36.0  0      0      13049       40.1250   A10          C
604  605          1         1       Homer, Mr. Harry ("Mr E Haven")                    male    35.0  0      0      111426      26.5500   NaN          C
632  633          1         1       Stahelin-Maeglin, Dr. Max                          male    32.0  0      0      13214       30.5000   B50          C
671  672          0         1       Davidson, Mr. Thornton                             male    31.0  1      0      F.C. 12750  52.0000   B71          S
679  680          1         1       Cardeza, Mr. Thomas Drake Martinez                 male    36.0  0      1      PC 17755    512.3292  B51 B53 B55  C
690  691          1         1       Dick, Mr. Albert Adrian                            male    31.0  1      0      17474       57.0000   B20          S
701  702          1         1       Silverthorne, Mr. Spencer Victor                   male    35.0  0      0      PC 17475    26.2875   E24          S
716  717          1         1       Endres, Miss. Caroline Louise                      female  38.0  0      0      PC 17757    227.5250  C45          C
737  738          1         1       Lesurer, Mr. Gustave J                             male    35.0  0      0      PC 17755    512.3292  B101         C
741  742          0         1       Cavendish, Mr. Tyrell William                      male    36.0  1      0      19877       78.8500   C46          S
759  760          1         1       Rothes, the Countess. of (Lucy Noel Martha Dye...  female  33.0  0      0      110152      86.5000   B77          S
763  764          1         1       Carter, Mrs. William Ernest (Lucile Polk)          female  36.0  1      2      113760      120.0000  B96 B98      S
806  807          0         1       Andrews, Mr. Thomas Jr                             male    39.0  0      0      112050      0.0000    A36          S
809  810          1         1       Chambers, Mrs. Norman Campbell (Bertha Griggs)     female  33.0  1      0      113806      53.1000   E8           S
822  823          0         1       Reuchlin, Jonkheer. John George                    male    38.0  0      0      19972       0.0000    NaN          S
835  836          1         1       Compton, Miss. Sara Rebecca                        female  39.0  1      1      PC 17756    83.1583   E49          C
842  843          1         1       Serepeca, Miss. Augusta                            female  30.0  0      0      113798      31.0000   NaN          C
867  868          0         1       Roebling, Mr. Washington Augustus II               male    31.0  0      0      PC 17590    50.4958   A24          S
872  873          0         1       Carlsson, Mr. Frans Olof                           male    33.0  0      0      695         5.0000    B51 B53 B55  S
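Two details of the selection above are easy to miss, so here is a minimal sketch on a made-up frame (`toy` and its values are hypothetical, not from train.csv). First, each comparison must be wrapped in parentheses, because `&` binds more tightly than `>=` and `<`. Second, a comparison against a missing age (NaN) evaluates to False, so rows without an age never pass an age condition.

```python
import pandas as pd

# Hypothetical toy data for illustration only (not from train.csv)
toy = pd.DataFrame({
    'Pclass': [1, 1, 3, 1],
    'Age': [35.0, 45.0, 32.0, float('nan')],
})

class_ = toy['Pclass'] == 1
# Parentheses are required around each comparison: & has higher
# precedence than >= and <. Python's `and` cannot be used here;
# it raises ValueError on a Series.
age_ = (toy['Age'] >= 30) & (toy['Age'] < 40)

# Element-wise AND of the two masks
both = class_ & age_
selected = toy[both]

# True counts as 1, so summing a mask gives the number of selected rows
n_selected = int(both.sum())
print(both.tolist())  # [True, False, False, False]
print(n_selected)     # 1
```

The last row is first class but has no age, so `age_` is False for it and it is left out of `selected`.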

Selecting male passengers who traveled in first class

class_ = train_data['Pclass'] == 1
# This mask tests sex, not age, so it is named sex_
sex_ = train_data['Sex'] == 'male'

train_data[class_ & sex_]
     PassengerId  Survived  Pclass  Name                                  Sex   Age   SibSp  Parch  Ticket    Fare      Cabin        Embarked
6    7            0         1       McCarthy, Mr. Timothy J               male  54.0  0      0      17463     51.8625   E46          S
23   24           1         1       Sloper, Mr. William Thompson          male  28.0  0      0      113788    35.5000   A6           S
27   28           0         1       Fortune, Mr. Charles Alexander        male  19.0  3      2      19950     263.0000  C23 C25 C27  S
30   31           0         1       Uruchurtu, Don. Manuel E              male  40.0  0      0      PC 17601  27.7208   NaN          C
34   35           0         1       Meyer, Mr. Edgar Joseph               male  28.0  1      0      PC 17604  82.1708   NaN          C
...  ...          ...       ...     ...                                   ...   ...   ...    ...    ...       ...       ...          ...
839  840          1         1       Marechal, Mr. Pierre                  male  NaN   0      0      11774     29.7000   C47          C
857  858          1         1       Daly, Mr. Peter Denis                 male  51.0  0      0      113055    26.5500   E17          S
867  868          0         1       Roebling, Mr. Washington Augustus II  male  31.0  0      0      PC 17590  50.4958   A24          S
872  873          0         1       Carlsson, Mr. Frans Olof              male  33.0  0      0      695       5.0000    B51 B53 B55  S
889  890          1         1       Behr, Mr. Karl Howell                 male  26.0  0      0      111369    30.0000   C148         C

122 rows × 12 columns
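When the result is long, it can help to check the row count from the mask itself and to show only a few columns. The following is a sketch on a made-up frame (`toy`, its names, and its values are hypothetical, not from train.csv). It uses `.loc` with a boolean mask for the rows and a column label for the columns.

```python
import pandas as pd

# Hypothetical toy data for illustration only (not from train.csv)
toy = pd.DataFrame({
    'Name': ['A', 'B', 'C', 'D'],
    'Sex': ['male', 'female', 'male', 'male'],
    'Pclass': [1, 1, 3, 1],
})

class_ = toy['Pclass'] == 1
sex_ = toy['Sex'] == 'male'
mask = class_ & sex_

# Number of rows that satisfy both conditions
n_rows = int(mask.sum())

# .loc takes the row mask first and the column selection second
names = toy.loc[mask, 'Name'].tolist()
print(n_rows)  # 2
print(names)   # ['A', 'D']
```

On the Titanic data, the same `mask.sum()` call would be expected to match the row count reported under the output (122 rows for first-class male passengers).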