Browsing Tag



d3.js Data Visualization; New Mexico

Visualization is the key to understanding the data. Lines of endless data can be overwhelming. Most of the times, to reach the conclusion, it is better to visualize the task. D3.js is a JavaScript library that is used for data visualization using the web standards like HTML, CSS, and SVG. Since the initial release in 2011, D3 gained popularity pretty quickly due to dynamic, interactive data visualizations capabilities in the browser. There is a huge community of followers of D3, and it is supported and improved by its creator Mike Bostock constantly.

Lately I’ve been working on D3. There is definitely a lot to learn, a lot to gain and improve. This blog post is a quick summary of knowledge that I gained in the last few weeks playing with D3. Initially I started looking for tutorials, and books on the subject. To be honest I’ve got a bit frustrated due to the lack of detailed material on the subject. I found “D3.js Essential Training for Data Scientists” on that guided me in the right direction. I discovered for myself, and since then I started checking different types of visualization provided on that website. My way of learning is to think about a problem/case that might interest me, see if there is any solution available, if not then divide the problem into smaller tasks, and work on them one by one.

Let’s look at few examples below. I am working on New Mexico data for this blog post. The data was gathered through US Census Bureau, and US Department of Labor websites. Disclaimer: this is not a complete analysis of the data; data is used for visualization purposes only, all the data is available for public on the above mentioned websites. I will be working on a few tutorials in the next few weeks, explaining the below examples in bigger detail.


Data visualization with Python and Matplotlib / Scatter Plot – Part 3

Let’s look at what else can be done with the data from Part 1. We start with the csv import mentioned in Part 2. Looking at the data we see that we have data available for 2010 and 2015 years, and we can analyze the change in suicide rates.

In [1]:
%matplotlib inline
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib as mpl
In [2]:
table = pd.read_excel('mergedData.xlsx')
Country 2015_s 2010_s 2015_p 2013_p 2010_p 2013_d suiAve suiPerDeath deaPerPop
0 Afghanistan 5.5 5.2 32526.6 30682.5 27962.2 7.7 5.35 0.694805 0.77
1 Albania 4.3 5.3 2896.7 2883.3 2901.9 9.4 4.80 0.510638 0.94
2 Algeria 3.1 3.4 39666.5 38186.1 36036.2 5.7 3.25 0.570175 0.57
3 Angola 20.5 20.7 25022.0 23448.2 21220.0 13.9 20.60 1.482014 1.39
4 Antigua and Barbuda 0.0 0.2 91.8 90.0 87.2 6.8 0.10 0.014706 0.68

Importing CSV data in Python

One of the purposes of this blog, as it is stated in the About page, is to share useful information while I am practicing to code. I think it is also a good habit to have posts like this one to refresh my own memory. Data comes in many forms (including CSV – Comma-separated values), and it needs to be imported for further manipulations and analyses.