Visualization with Python Pandas

Suraj Jain

a year ago

In this article, I will take you through some of the basic and most commonly used plots in Python pandas.

Basic Plotting: plot

The plot() method in pandas draws plots of a DataFrame or Series using matplotlib.
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randn(10,4),index=pd.date_range('1/1/2000',
   periods=10), columns=list('ABCD'))
df.plot()
Its output is as follows –
Output Plotting
If the index consists of dates, plot() calls gcf().autofmt_xdate() to format the x-axis, as shown in the above illustration.
We can plot one column versus another using the x and y keywords.
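As a minimal sketch of the x and y keywords, we can plot column B against column A. The random data, the seed and the sort step are illustrative assumptions, not part of the example above.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in a notebook
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((10, 4)), columns=list("ABCD"))

# Sort by the x column first so the line runs left to right
sorted_df = df.sort_values("A")

# Use column A for the x-axis and column B for the y-axis
ax = sorted_df.plot(x="A", y="B")
```

Without the sort, the line would connect the points in row order and zigzag across the plot.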
Plotting methods allow a handful of plot styles other than the default line plot. These methods can be provided as the kind keyword argument to plot().
These include −
  • 'bar' or 'barh' for bar plots
  • 'hist' for histograms
  • 'box' for box plots
  • 'area' for area plots
  • 'scatter' for scatter plots
  • 'pie' for pie charts
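To illustrate the kind keyword, the short sketch below draws the same bar plot twice, once with plot(kind='bar') and once with the plot.bar() accessor used in the rest of this article. The small random DataFrame is an assumption made for the example.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in a notebook
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((5, 3)), columns=["a", "b", "c"])

# Select the plot style with the kind keyword
ax_kind = df.plot(kind="bar")

# Equivalent call through the plot accessor
ax_accessor = df.plot.bar()
```

Both calls should give the same chart, so choosing between them is mostly a matter of taste.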

Bar Plot

What is a Bar Plot?
A bar plot (or bar chart) is one of the most common types of chart. It shows the relationship between a numeric and a categorical variable. Each category is represented as a bar, and the size of the bar represents its numeric value.
Let’s visualize it
A bar plot can be created in the following way −
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.rand(10,4),columns=['a','b','c','d'])
df.plot.bar()
Its output is as follows –
Output Bar Plot
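To connect this with the definition of a bar plot given earlier, here is a small sketch with a named categorical variable on the x-axis. The fruit names and unit counts are hypothetical values made up for the illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in a notebook
import pandas as pd

# Hypothetical data: one numeric value per category
sales = pd.Series([12, 7, 9], index=["apples", "pears", "plums"], name="units")

# Each index label becomes one bar; the bar height is the value
ax = sales.plot.bar()
```

Here the index supplies the categories, so a Series is enough when there is only one numeric value per category.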
To produce a stacked bar plot, pass stacked=True −
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.rand(10,4),columns=['a','b','c','d'])
df.plot.bar(stacked=True)
Its output is as follows –
stacked bar plot
To get horizontal bar plots, we use the barh method −
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.rand(10,4),columns=['a','b','c','d'])
df.plot.barh(stacked=True)
Its output is as follows –
horizontal bar plots

Histograms

 A histogram is a representation of the distribution of numerical data, where the data are binned and the count for each bin is represented.
Histograms can be plotted using the plot.hist() method. We can specify the number of bins.
import pandas as pd
import numpy as np

df = pd.DataFrame({'a':np.random.randn(1000)+1,'b':np.random.randn(1000),'c':
np.random.randn(1000) - 1}, columns=['a', 'b', 'c'])

df.plot.hist(bins=20)
Its output is as follows –
Histograms
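Because the three histograms share one set of axes, they can hide each other where they overlap. A minimal sketch, assuming the same kind of data as above with a fixed seed, passes alpha to make the bars semi-transparent:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in a notebook
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "a": rng.standard_normal(1000) + 1,
    "b": rng.standard_normal(1000),
    "c": rng.standard_normal(1000) - 1,
})

# alpha=0.5 makes the bars semi-transparent so overlapping columns stay visible
ax = df.plot.hist(bins=20, alpha=0.5)
```

All three columns are binned on the same 20 bin edges, which keeps the bars aligned and comparable.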
To plot a separate histogram for each column, call DataFrame.hist() instead −
import pandas as pd
import numpy as np

df=pd.DataFrame({'a':np.random.randn(1000)+1,'b':np.random.randn(1000),'c':
np.random.randn(1000) - 1}, columns=['a', 'b', 'c'])

df.hist(bins=20)
Its output is as follows –
histograms for each column

Box Plots

A box plot visualizes the distribution of values within each column.
It can be drawn by calling Series.plot.box() and DataFrame.plot.box(), or DataFrame.boxplot().
For example, here is a box plot representing five trials of 10 observations of a uniform random variable on [0,1).
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.rand(10, 5), columns=['A', 'B', 'C', 'D', 'E'])
df.plot.box()
Its output is as follows –
Box Plots
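Each box spans the first to the third quartile of a column, with a line at the median. As a rough sketch (the seed and variable names are assumptions), we can compute those numbers with quantile() to read them next to the plot:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in a notebook
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((10, 5)), columns=list("ABCDE"))

# First quartile, median and third quartile of every column
quartiles = df.quantile([0.25, 0.5, 0.75])
print(quartiles)

# The boxes drawn here should correspond to the quartiles printed above
ax = df.plot.box()
```

Printing the quartiles alongside the figure is a quick way to check that the plot is being read correctly.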

Area Plot

An area plot can be created using the Series.plot.area() or the DataFrame.plot.area() methods.
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.rand(10, 4), columns=['a', 'b', 'c', 'd'])
df.plot.area()
Its output is as follows –
Area Plot
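Area plots are stacked by default, so each column is drawn on top of the previous one. A small sketch, assuming the same random data with a fixed seed, shows how to draw the areas unstacked:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in a notebook
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((10, 4)), columns=["a", "b", "c", "d"])

# stacked=False draws every column from zero instead of on top of the others
ax = df.plot.area(stacked=False)
```

The unstacked version makes it easier to compare the columns directly, while the stacked default emphasises the total.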

Scatter Plot

A scatter plot can be drawn using the DataFrame.plot.scatter() method.
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.rand(50, 4), columns=['a', 'b', 'c', 'd'])
df.plot.scatter(x='a', y='b')
Its output is as follows –
Scatter Plot
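A scatter plot can also carry a third variable as colour. The sketch below is one way to do it, assuming we want to colour the points by column c. The seed and the choice of the viridis colormap are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in a notebook
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((50, 4)), columns=["a", "b", "c", "d"])

# x and y pick the axes; c colours each point by the value in column c
ax = df.plot.scatter(x="a", y="b", c="c", colormap="viridis")
```

pandas adds a colour bar for column c next to the plot, so the colours can be read back as values.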

Pie Chart

A pie chart can be drawn using the DataFrame.plot.pie() method.
import pandas as pd
import numpy as np

df = pd.DataFrame(3 * np.random.rand(4), index=['a', 'b', 'c', 'd'], columns=['x'])
df.plot.pie(subplots=True)
Its output is as follows −
Pie Chart
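DataFrame.plot.pie() needs either subplots=True or a y column to know what to draw. As a minimal sketch with hypothetical fixed values in place of the random ones, we can name the column with y and label each wedge with its percentage:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in a notebook
import pandas as pd

# Hypothetical values chosen so the percentages are easy to check
df = pd.DataFrame({"x": [3.0, 1.0, 2.0, 4.0]}, index=["a", "b", "c", "d"])

# y selects the column to draw; autopct is passed through to matplotlib
ax = df.plot.pie(y="x", autopct="%.1f%%", legend=False)
```

The index labels become the wedge labels, which is why the legend is switched off here.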
I hope you enjoyed reading this article and now have a working overview of visualization with Python pandas.
For more such blogs/courses on data science, machine learning, artificial intelligence and emerging new technologies do visit us at InsideAIML.
Thanks for reading…
Happy Learning…
