To apply your own or another library’s functions to Pandas objects, you should be aware of the three important methods. The methods have been discussed below. The appropriate method to use depends on whether your function expects to operate on an entire DataFrame, row- or column-wise, or element-wise.
- Table wise Function Application: pipe()
- Row or Column Wise Function Application: apply()
- Element wise Function Application: applymap()
Table-wise Function Application
Custom operations can be performed by passing the function and the appropriate number of parameters as pipe arguments. Thus, operation is performed on the whole DataFrame.
For example, add a value of 2 to all the elements in the DataFrame. Then,
adder function
The adder function adds two numeric values as parameters and returns the sum.
def adder(ele1,ele2): return ele1+ele2
We will now use the custom function to conduct operations on the DataFrame.
df = pd.DataFrame(np.random.randn(5,3),columns=['col1','col2','col3']) df.pipe(adder,2)
Let’s see the full program −
import pandas as pd import numpy as np def adder(ele1,ele2): return ele1+ele2 df = pd.DataFrame(np.random.randn(5,3),columns=['col1','col2','col3']) df.pipe(adder,2) print df.apply(np.mean)
Its output is as follows −
col1 col2 col3 0 2.176704 2.219691 1.509360 1 2.222378 2.422167 3.953921 2 2.241096 1.135424 2.696432 3 2.355763 0.376672 1.182570 4 2.308743 2.714767 2.130288
Row or Column Wise Function Application
Arbitrary functions can be applied along the axes of a DataFrame or Panel using the apply() method, which, like the descriptive statistics methods, takes an optional axis argument. By default, the operation performs column-wise, taking each column as an array-like.
Example 1
import pandas as pd import numpy as np df = pd.DataFrame(np.random.randn(5,3),columns=['col1','col2','col3']) df.apply(np.mean) print df.apply(np.mean)
Its output is as follows −
col1 -0.288022 col2 1.044839 col3 -0.187009 dtype: float64
Bypassing axis parameter, operations can be performed row-wise.
Example 2
import pandas as pd import numpy as np df = pd.DataFrame(np.random.randn(5,3),columns=['col1','col2','col3']) df.apply(np.mean,axis=1) print df.apply(np.mean)
Its output is as follows −
col1 0.034093 col2 -0.152672 col3 -0.229728 dtype: float64
Example 3
import pandas as pd import numpy as np df = pd.DataFrame(np.random.randn(5,3),columns=['col1','col2','col3']) df.apply(lambda x: x.max() - x.min()) print df.apply(np.mean)
Its output is as follows −
col1 -0.167413 col2 -0.370495 col3 -0.707631 dtype: float64
Element Wise Function Application
Not all functions can be vectorized (neither the NumPy arrays which return another array nor any value), the methods applymap() on DataFrame and analogously map() on Series accept any Python function taking a single value and returning a single value.
Example 1
import pandas as pd import numpy as np df = pd.DataFrame(np.random.randn(5,3),columns=['col1','col2','col3']) # My custom function df['col1'].map(lambda x:x*100) print df.apply(np.mean)
Its output is as follows −
col1 0.480742 col2 0.454185 col3 0.266563 dtype: float64
Example 2
import pandas as pd import numpy as np # My custom function df = pd.DataFrame(np.random.randn(5,3),columns=['col1','col2','col3']) df.applymap(lambda x:x*100) print df.apply(np.mean)
Its output is as follows −
col1 0.395263 col2 0.204418 col3 -0.795188 dtype: float64