This notebook is part of the Pandas – Tips and Tricks mini-series, which covers different aspects of the pandas library in Python. The examples below select data using the .loc and .iloc indexers. The notebook is also available on GitHub.
.loc: primarily label-based indexing.
.iloc: primarily integer-position-based indexing.
Previous blog posts on the topic: Data import with Python, using pandas DataFrame – Part 1
Let’s start by importing pandas and loading the data:
In [1]:
# Loading the library
import pandas as pd
# I am using the data from WHO as an example
df = pd.read_csv('Data/SuicBoth.csv')
# Checking the DataFrame shape
print(df.shape)
# Checking the imported data
df.head()
Out[1]:
.loc and .iloc are used with square brackets, in the form df.loc[rows, columns], where each selector can be a single value, a list, or a slice. Let’s look at a few examples below:
In [2]:
# To access any row we can call it by the row label
df.loc[2]
Out[2]:
In [3]:
# or by its integer position. In these two cases the label matches the position,
# so the results are the same
df.iloc[2]
Out[3]:
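The label and the position only coincide because the DataFrame has the default 0, 1, 2, ... index. As a minimal sketch of what happens otherwise, the toy frame below (hypothetical data, not the WHO file) uses the made-up labels 10, 20 and 30:

```python
import pandas as pd

# Hypothetical toy frame; the index labels 10, 20, 30 are made up for illustration
toy = pd.DataFrame({"value": ["a", "b", "c"]}, index=[10, 20, 30])

# .loc looks up the row whose label is 20 (the second row)
by_label = toy.loc[20]

# .iloc looks up the row at position 2 (the third row), whatever its label
by_position = toy.iloc[2]

print(by_label["value"])
print(by_position["value"])
```

Here toy.loc[2] would fail, because no row carries the label 2, while toy.iloc[2] still works.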
In [4]:
# However, we can't call df.loc[-1] to access the last row:
# -1 is not a row label, so negative values only work as positions in .iloc
df.iloc[-1]
Out[4]:
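To make the difference concrete, here is a small sketch on a hypothetical three-row frame with the default index. It assumes, as in the data above, that -1 is not one of the row labels:

```python
import pandas as pd

# Hypothetical toy frame with the default index 0, 1, 2
toy = pd.DataFrame({"value": [1, 2, 3]})

# Position -1 means "last row" for .iloc
last_by_position = toy.iloc[-1]

# .loc treats -1 as a label; no such label exists, so pandas raises KeyError
try:
    toy.loc[-1]
    loc_failed = False
except KeyError:
    loc_failed = True

print(last_by_position["value"])
print(loc_failed)
```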
In [5]:
# Multiple rows can be selected
df.loc[[3, 15, 146, 118]]
Out[5]:
In [6]:
# Rows can also be selected with slice syntax (start:stop).
# With .loc both endpoints are included, so this returns labels 14 through 20
df.loc[14:20]
Out[6]:
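One detail worth keeping in mind is that label slices and position slices treat the end point differently. The sketch below uses a hypothetical ten-row frame to compare the two:

```python
import pandas as pd

# Hypothetical toy frame with labels 0..9
toy = pd.DataFrame({"value": range(10)})

# Label-based slice: both end points are included (labels 2, 3, 4, 5)
by_label = toy.loc[2:5]

# Position-based slice: the end is excluded, as in regular Python slicing (positions 2, 3, 4)
by_position = toy.iloc[2:5]

print(len(by_label), len(by_position))
```

So df.loc[14:20] returns seven rows, while df.iloc[14:20] would return six.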
Let’s look at some examples regarding the columns:
In [7]:
# Using the colon to select all the rows, and passing
# the column labels as a list
df_new = df.loc[:, ['2015', '2010']]
df_new.head()
Out[7]:
In [8]:
# The same result can be achieved with .iloc, using the column positions
df_new = df.iloc[:, [2, 3]]
df_new.head()
Out[8]:
In [9]:
# The range function can also be used.
# range(2, 4) yields positions 2 and 3 (the end is exclusive)
df_new = df.iloc[:, list(range(2, 4))]
df_new.head()
Out[9]:
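The three column selections above can be checked against each other on a small frame. The column layout below is an assumption made for illustration (the real file may differ); it places '2015' and '2010' at positions 2 and 3:

```python
import pandas as pd

# Hypothetical toy frame; column names and values are made up for illustration
toy = pd.DataFrame(
    [["A", "Both sexes", 1.0, 2.0, 3.0],
     ["B", "Both sexes", 4.0, 5.0, 6.0]],
    columns=["Country", "Sex", "2015", "2010", "2005"],
)

by_label = toy.loc[:, ["2015", "2010"]]    # by column labels
by_list = toy.iloc[:, list(range(2, 4))]   # by a list of positions
by_slice = toy.iloc[:, 2:4]                # by a position slice, no list needed

# Index.get_loc translates a column label into its position
position_of_2015 = toy.columns.get_loc("2015")

print(by_label.equals(by_list), by_list.equals(by_slice), position_of_2015)
```

A plain slice such as 2:4 is usually the shorter way to write list(range(2, 4)) inside .iloc.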
In [10]:
# A slice can also take a step, written as start:stop:step.
# Here, we include all the rows, and every other column
df_new = df.iloc[:, ::2]
df_new.head()
Out[10]:
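As a quick sketch of the step syntax, the hypothetical frame below (made-up column names) shows which columns ::2 keeps, and how a start position changes the result:

```python
import pandas as pd

# Hypothetical toy frame; column names and values are made up for illustration
toy = pd.DataFrame(
    [["A", "Both sexes", 1.0, 2.0, 3.0]],
    columns=["Country", "Sex", "2015", "2010", "2005"],
)

# All rows, every other column starting at position 0 (positions 0, 2, 4)
every_other = toy.iloc[:, ::2]

# All rows, every other column starting at position 1 (positions 1, 3)
every_other_from_1 = toy.iloc[:, 1::2]

print(list(every_other.columns))
print(list(every_other_from_1.columns))
```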
Downloadable content: SuicBoth.csv