##### Visualization with Python Pandas

Suraj Jain

10 months ago

Visualization in Python Pandas
In this article, I will take you through some of the basic and most commonly used plots in Python pandas.

### Basic Plotting: plot

The plot() method in pandas draws plots of DataFrame and Series objects using matplotlib.
``````
import pandas as pd
import numpy as np

# 10 days of random data in four columns, indexed by date
df = pd.DataFrame(np.random.randn(10, 4),
                  index=pd.date_range('1/1/2000', periods=10),
                  columns=list('ABCD'))
df.plot()
``````
Its output is as follows –
Output Plotting
If the index consists of dates, it calls gcf().autofmt_xdate() to format the x-axis as shown in the above illustration.
We can plot one column versus another using the x and y keywords.
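As a minimal sketch of the x and y keywords, the snippet below plots one column against another instead of against the index. The column names 'A' and 'B', the seeded generator and the cumulative sum are illustrative assumptions, not part of the original example.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two columns of cumulative random steps
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((10, 2)).cumsum(axis=0),
                  columns=['A', 'B'])

# Use column 'A' for the x-axis and column 'B' for the y-axis
ax = df.plot(x='A', y='B')
```

The call returns a matplotlib Axes object, so the plot can be customized further (for example with ax.set_title()).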
Plotting methods allow a handful of plot styles other than the default line plot. These methods can be provided as the kind keyword argument to plot().
These include −
• 'bar' or 'barh' for bar plots
• 'hist' for histograms
• 'box' for box plots
• 'area' for area plots
• 'scatter' for scatter plots
• 'pie' for pie charts
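The kind keyword and the accessor methods used in the rest of this article are two spellings of the same thing. The following sketch, on made-up random data, draws the same bar plot both ways; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((5, 3)), columns=['a', 'b', 'c'])

# Select the plot style with the kind keyword ...
ax_kind = df.plot(kind='bar')

# ... or call the matching method on the plot accessor
ax_method = df.plot.bar()
```

Both calls create a new figure containing one bar per cell of the DataFrame.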

## Bar Plot

What is a Bar Plot?
A bar plot (or bar chart) is one of the most common types of graphic. It shows the relationship between a numeric and a categorical variable. Each category is represented as a bar, and the length of the bar represents its numeric value.
Let’s visualize it
A bar plot can be created in the following way −
``````
import pandas as pd
import numpy as np

# 10 rows of random values in four columns; each row becomes a group of bars
df = pd.DataFrame(np.random.rand(10, 4), columns=['a', 'b', 'c', 'd'])
df.plot.bar()
``````
Its output is as follows –
Output Bar Plot
To produce a stacked bar plot, pass the parameter stacked=True −
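``````
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.rand(10, 4), columns=['a', 'b', 'c', 'd'])
# stacked=True piles the four columns on top of each other in a single bar per row
df.plot.bar(stacked=True)
``````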
Its output is as follows –
stacked bar plot
To get horizontal bar plots, use the barh method −
``````
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.rand(10, 4), columns=['a', 'b', 'c', 'd'])
# barh draws the bars horizontally; stacked=True still applies
df.plot.barh(stacked=True)
``````
Its output is as follows –
horizontal bar plots
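The examples above use random numbers with a default integer index. To connect them to the definition of a bar plot (one bar per category), here is a small sketch with an explicitly categorical index. The fruit names and unit counts are invented purely for illustration.

```python
import pandas as pd

# Hypothetical data: one numeric value per category
sales = pd.DataFrame({'units': [23, 17, 35]},
                     index=['apples', 'bananas', 'cherries'])

# Each index label becomes one bar; rot=0 keeps the labels horizontal
ax = sales.plot.bar(y='units', rot=0)
```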

## Histograms

A histogram is a representation of the distribution of numerical data, where the data are binned and the count for each bin is represented.
Histograms can be plotted using the plot.hist() method. We can specify the number of bins.
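``````
import pandas as pd
import numpy as np

# Three normal distributions centred on 1, 0 and -1
df = pd.DataFrame({'a': np.random.randn(1000) + 1, 'b': np.random.randn(1000),
                   'c': np.random.randn(1000) - 1}, columns=['a', 'b', 'c'])

# All three columns are drawn on the same axes, using 20 bins
df.plot.hist(bins=20)
``````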
Its output is as follows –
Histograms
To plot different histograms for each column, use the following code −
``````
import pandas as pd
import numpy as np

df = pd.DataFrame({'a': np.random.randn(1000) + 1, 'b': np.random.randn(1000),
                   'c': np.random.randn(1000) - 1}, columns=['a', 'b', 'c'])

# diff() takes row-to-row differences; hist() then draws one subplot per column
df.diff().hist(bins=20)
``````
Its output is as follows –
histograms for each column

## Box Plots

A box plot visualizes the distribution of values within each column.
It can be drawn by calling Series.plot.box() and DataFrame.plot.box(), or DataFrame.boxplot().
For instance, here is a box plot representing five trials of 10 observations of a uniform random variable on [0,1).
``````
import pandas as pd
import numpy as np

# Five columns (trials) of 10 uniform random observations each
df = pd.DataFrame(np.random.rand(10, 5), columns=['A', 'B', 'C', 'D', 'E'])
df.plot.box()
``````
Its output is as follows –
Box Plots
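The DataFrame.boxplot() method mentioned above can be sketched as follows. Restricting the plot to columns 'A' and 'B' through the column parameter is an illustrative choice, not something the original example does.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((10, 5)), columns=['A', 'B', 'C', 'D', 'E'])

# boxplot() draws the same kind of chart; column limits it to a subset
ax = df.boxplot(column=['A', 'B'])
```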

## Area Plot

An area plot can be created using the Series.plot.area() or the DataFrame.plot.area() methods.
``````
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.rand(10, 4), columns=['a', 'b', 'c', 'd'])
# Each column is drawn as a filled area
df.plot.area()
``````
Its output is as follows –
Area Plot
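Area plots are stacked by default. As a small sketch of the alternative, passing stacked=False draws the columns as overlapping, semi-transparent areas instead; the data here is random and purely illustrative.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((10, 4)), columns=['a', 'b', 'c', 'd'])

# stacked=False overlays the four areas rather than piling them up
ax = df.plot.area(stacked=False)
```

Note that a stacked area plot needs each column to be either all positive or all negative, which is why the examples use rand() rather than randn().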

## Scatter Plot

A scatter plot can be drawn using the DataFrame.plot.scatter() method.
``````
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.rand(50, 4), columns=['a', 'b', 'c', 'd'])
# Scatter plots need both x and y column names
df.plot.scatter(x='a', y='b')
``````
Its output is as follows –
Scatter Plot
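A scatter plot can also encode additional columns. The sketch below, which assumes the same four-column layout as above, colors each point by column 'c' and sizes it by column 'd'. The scaling factor of 200 and the 'viridis' colormap are arbitrary choices.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((50, 4)), columns=['a', 'b', 'c', 'd'])

# c picks the column used for color; s sets the marker sizes (in points^2)
ax = df.plot.scatter(x='a', y='b', c='c', s=df['d'] * 200, colormap='viridis')
```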

## Pie Chart

A pie chart can be drawn using the DataFrame.plot.pie() method.
``````
import pandas as pd
import numpy as np

df = pd.DataFrame(3 * np.random.rand(4), index=['a', 'b', 'c', 'd'], columns=['x'])
# subplots=True draws one pie per column (here just 'x')
df.plot.pie(subplots=True)
``````
Its output is as follows −
Pie Chart
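For a single set of values, a Series works just as well as a one-column DataFrame. This sketch uses invented percentages and passes matplotlib's autopct option through to label each wedge; the names and numbers are hypothetical.

```python
import pandas as pd

# Hypothetical shares for four categories
shares = pd.Series([40, 30, 20, 10], index=['a', 'b', 'c', 'd'], name='share')

# autopct is forwarded to matplotlib and prints the percentage on each wedge
ax = shares.plot.pie(autopct='%.0f%%')
```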