This notebook is part of the Pandas – Tips and Tricks mini-series, which covers different aspects of the pandas library in Python. The examples below show how to select data with the .loc and .iloc indexers. The notebook is also available on GitHub.

.loc: primarily label-based indexing.

.iloc: primarily integer-position-based indexing.
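The difference is easiest to see when the row labels are not integers. Below is a minimal sketch, assuming a small hypothetical frame (df_demo) built from a few values of the WHO table used in this post and indexed by country name:

```python
import pandas as pd

# Hypothetical demo frame: country names are used as row labels
df_demo = pd.DataFrame(
    {"2015": [5.5, 4.3, 3.1], "2010": [5.2, 5.3, 3.4]},
    index=["Afghanistan", "Albania", "Algeria"],
)

# .loc takes labels: row label "Algeria", column label "2015"
by_label = df_demo.loc["Algeria", "2015"]

# .iloc takes integer positions: third row, first column
by_position = df_demo.iloc[2, 0]

print(by_label, by_position)
```

Both lines point at the same cell; only the way of addressing it differs.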

Previous blog posts on the topic: Data import with Python, using pandas DataFrame – Part 1

Let’s start by importing pandas and loading the data:

In [1]:

# Loading the library
import pandas as pd

# I am using the data from WHO as an example
df = pd.read_csv('Data/SuicBoth.csv')

# Checking the DataFrame shape
print(df.shape)

# Checking the imported data
df.head()

(183, 6)

Out[1]:

	Country	Sex	2015	2010	2005	2000
0	Afghanistan	Both sexes	5.5	5.2	5.4	4.8
1	Albania	Both sexes	4.3	5.3	6.3	6.0
2	Algeria	Both sexes	3.1	3.4	3.6	3.0
3	Angola	Both sexes	20.5	20.7	20.0	18.4
4	Antigua and Barbuda	Both sexes	0.0	0.2	1.6	2.3

.loc and .iloc are used with square brackets, in the form df.loc[rows, columns], where the column selection is optional. Let’s look at a few examples below:

In [2]:

# To access any row we can call it by the row label
df.loc[2]

Out[2]:

Country       Algeria
Sex        Both sexes
2015              3.1
2010              3.4
2005              3.6
2000                3
Name: 2, dtype: object

In [3]:

# or by its integer position. Here the row labels match the positions,
# so the results are the same
df.iloc[2]

Out[3]:

Country       Algeria
Sex        Both sexes
2015              3.1
2010              3.4
2005              3.6
2000                3
Name: 2, dtype: object
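The two calls stop agreeing as soon as the row order changes. A minimal sketch, assuming a hypothetical four-row frame (df_demo) with values taken from the table above, sorted so that labels and positions no longer line up:

```python
import pandas as pd

# Hypothetical demo frame with the default integer labels 0..3
df_demo = pd.DataFrame(
    {
        "Country": ["Afghanistan", "Albania", "Algeria", "Angola"],
        "2015": [5.5, 4.3, 3.1, 20.5],
    }
)

# Sorting reorders the rows but keeps their original labels
df_sorted = df_demo.sort_values("2015", ascending=False)

label_row = df_sorted.loc[2]      # the row whose label is 2
position_row = df_sorted.iloc[2]  # the third row in the sorted order

print(label_row["Country"])     # Algeria
print(position_row["Country"])  # Albania
```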

In [4]:

# However, df.loc[-1] cannot be used to access the last row: -1 is not
# one of the row labels, so it raises a KeyError. Negative values only work by position
df.iloc[-1]

Out[4]:

Country      Zimbabwe
Sex        Both sexes
2015             10.5
2010             11.2
2005             11.3
2000             12.1
Name: 182, dtype: object
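To see the failure without stopping the notebook, the label lookup can be wrapped in try/except. This is a small sketch on a hypothetical three-row frame (df_demo) with the default integer labels:

```python
import pandas as pd

# Hypothetical demo frame with the default labels 0, 1, 2
df_demo = pd.DataFrame({"Country": ["Afghanistan", "Albania", "Algeria"]})

try:
    df_demo.loc[-1]      # -1 is not among the labels 0, 1, 2
    loc_failed = False
except KeyError:
    loc_failed = True

# Position -1 always means "the last row"
last_country = df_demo.iloc[-1]["Country"]

print(loc_failed, last_country)
```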

In [5]:

# Multiple rows can be selected
df.loc[[3, 15, 146, 118]]

Out[5]:

	Country	Sex	2015	2010	2005	2000
3	Angola	Both sexes	20.5	20.7	20.0	18.4
15	Belgium	Both sexes	20.5	20.3	20.5	22.6
146	Slovenia	Both sexes	21.4	20.6	26.5	31.8
118	Nigeria	Both sexes	9.9	9.8	9.5	9.9

In [6]:

# Rows can also be selected with the slicing syntax (:).
# With .loc, both ends of the slice are included
df.loc[14:20]

Out[6]:

	Country	Sex	2015	2010	2005	2000
14	Belarus	Both sexes	22.8	27.9	36.1	41.9
15	Belgium	Both sexes	20.5	20.3	20.5	22.6
16	Belize	Both sexes	7.3	6.1	5.1	8.0
17	Benin	Both sexes	9.4	9.0	9.0	8.5
18	Bhutan	Both sexes	11.7	11.4	12.2	13.0
19	Bolivia (Plurinational State of)	Both sexes	18.7	20.5	22.2	23.5
20	Bosnia and Herzegovina	Both sexes	6.0	6.0	8.0	9.8
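Note that the .loc slice returned seven rows, 14 through 20. The same slice with .iloc follows the usual Python convention and leaves out the end position. A quick sketch on a hypothetical 30-row frame (df_demo with a made-up "value" column):

```python
import pandas as pd

# Hypothetical demo frame with labels 0..29
df_demo = pd.DataFrame({"value": range(30)})

loc_rows = df_demo.loc[14:20]    # label slice: 14 and 20 both included
iloc_rows = df_demo.iloc[14:20]  # position slice: 20 is excluded

print(len(loc_rows), len(iloc_rows))  # 7 6
```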

Let’s look at some examples regarding the columns:

In [7]:

# Using the colon to select all the rows, and passing
# the column labels as a list
df_new = df.loc[:, ['2015', '2010']]
df_new.head()

Out[7]:

	2015	2010
0	5.5	5.2
1	4.3	5.3
2	3.1	3.4
3	20.5	20.7
4	0.0	0.2

In [8]:

# The same result can be achieved with .iloc and the column positions
df_new = df.iloc[:, [2, 3]]
df_new.head()

Out[8]:

	2015	2010
0	5.5	5.2
1	4.3	5.3
2	3.1	3.4
3	20.5	20.7
4	0.0	0.2

In [9]:

# The range function can also be used.
# range(2, 4) yields the positions 2 and 3
df_new = df.iloc[:, list(range(2, 4))]
df_new.head()

Out[9]:

	2015	2010
0	5.5	5.2
1	4.3	5.3
2	3.1	3.4
3	20.5	20.7
4	0.0	0.2
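For a contiguous block of columns, a plain slice should give the same result as the list built from range. A minimal sketch, assuming a hypothetical frame (df_demo) made of the first rows shown earlier:

```python
import pandas as pd

# Hypothetical demo frame with the same column order as the WHO table
df_demo = pd.DataFrame(
    {
        "Country": ["Afghanistan", "Albania", "Algeria"],
        "Sex": ["Both sexes", "Both sexes", "Both sexes"],
        "2015": [5.5, 4.3, 3.1],
        "2010": [5.2, 5.3, 3.4],
        "2005": [5.4, 6.3, 3.6],
    }
)

by_list = df_demo.iloc[:, list(range(2, 4))]  # positions 2 and 3 as a list
by_slice = df_demo.iloc[:, 2:4]               # same positions as a slice

print(by_slice.columns.tolist())  # ['2015', '2010']
print(by_list.equals(by_slice))   # True
```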

In [10]:

# A slice can also take a step, as in the example below.
# Here, we include all the rows and every other column
df_new = df.iloc[:, ::2]
df_new.head()

Out[10]:

	Country	2015	2005
0	Afghanistan	5.5	5.4
1	Albania	4.3	6.3
2	Algeria	3.1	3.6
3	Angola	20.5	20.0
4	Antigua and Barbuda	0.0	1.6
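Row and column selections can also be combined in a single call. The sketch below picks the same two rows and two columns, once by label and once by position, on a hypothetical frame (df_demo) built from the first rows of the table:

```python
import pandas as pd

# Hypothetical demo frame; default integer labels, WHO-style columns
df_demo = pd.DataFrame(
    {
        "Country": ["Afghanistan", "Albania", "Algeria"],
        "Sex": ["Both sexes", "Both sexes", "Both sexes"],
        "2015": [5.5, 4.3, 3.1],
        "2010": [5.2, 5.3, 3.4],
    }
)

# Rows labelled 0 and 2, columns "Country" and "2015"
by_label = df_demo.loc[[0, 2], ["Country", "2015"]]

# Rows at positions 0 and 2, columns at positions 0 and 2
by_position = df_demo.iloc[[0, 2], [0, 2]]

print(by_label)
print(by_label.equals(by_position))  # True
```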

Downloadable content:
SuicBoth.csv
