# An Introduction to 7 Most Commonly Used Pandas Functions and How to Use Them

Pandas is one of the most important packages to grasp when you’re starting to learn Python.

It is known for a very useful data structure called the pandas DataFrame. It also allows Python developers to easily deal with tabular data (like spreadsheets) within a Python script.

In this post, you will find frequently used Pandas features and I hope that you can use them to build data-driven Python applications today.

To use Pandas, you first have to import the library in your script.

`import pandas as pd`

For demonstration, I will be using the ‘iris’ dataset and loading it into a dataframe.
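The loading code is not shown here, so the following is only a minimal sketch. It assumes a tiny hand-typed stand-in with iris-style column names; the values are illustrative, not the full dataset, and every column name other than *sepal_length* and *species* is an assumption.

```python
import pandas as pd

# Illustrative stand-in for the iris data (hand-typed, not the real file).
# Column names other than sepal_length and species are assumed.
df = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 7.0, 6.4, 6.3, 7.1],
    "sepal_width": [3.5, 3.0, 3.2, 3.2, 3.3, 3.0],
    "petal_length": [1.4, 1.4, 4.7, 4.5, 6.0, 5.9],
    "petal_width": [0.2, 0.2, 1.4, 1.5, 2.5, 2.1],
    "species": ["setosa", "setosa", "versicolor",
                "versicolor", "virginica", "virginica"],
})

# With a local copy of the data you would typically read it from disk instead:
# df = pd.read_csv("iris.csv")  # "iris.csv" is a hypothetical file name

print(df.head())
```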

Now let’s play with the dataset using Pandas features.

**1. Accessing/selecting rows and columns in the dataset**

Pandas `loc` and `iloc` indexers are used to select rows and columns from the dataset based on labels or positions.

*loc*: select by labels

*iloc*: select by positions

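To access the first row across all columns, we can use `loc` with the first index label, for example `df.loc[0]` when the dataframe is named `df` and has the default integer index.

This returns a Pandas Series containing the first row of the dataframe.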
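As a minimal sketch of that lookup, assuming a small hand-typed stand-in for the iris data and the default integer index (so the first row has label `0`):

```python
import pandas as pd

# Illustrative stand-in for the iris data (hand-typed values, not the real file).
df = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 7.0],
    "sepal_width": [3.5, 3.0, 3.2],
    "species": ["setosa", "setosa", "versicolor"],
})

first_row = df.loc[0]    # label 0 on the default integer index
print(type(first_row))   # a pandas Series
print(first_row)         # one entry per column of the dataframe
```

The returned Series is indexed by the column names, so `first_row["species"]` reads a single cell.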

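Similarly, to get the first 5 elements of column *sepal_length*, we can pass a row slice and the column label to `loc`, for example `df.loc[0:4, 'sepal_length']`. Unlike regular Python slices, `loc` slices include the end label.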

To get the first 4 rows of the first 5 columns, we can use `iloc`, for example `df.iloc[0:4, 0:5]`.
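The two slicing styles can be compared side by side. This is a sketch on an illustrative stand-in dataframe with the default integer index; the column names besides *sepal_length* and *species* are assumptions.

```python
import pandas as pd

# Illustrative stand-in for the iris data (hand-typed values, not the real file).
df = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 7.0, 6.4, 6.3, 7.1],
    "sepal_width": [3.5, 3.0, 3.2, 3.2, 3.3, 3.0],
    "petal_length": [1.4, 1.4, 4.7, 4.5, 6.0, 5.9],
    "petal_width": [0.2, 0.2, 1.4, 1.5, 2.5, 2.1],
    "species": ["setosa", "setosa", "versicolor",
                "versicolor", "virginica", "virginica"],
})

# loc is label-based and the end label is included: labels 0..4 give 5 values.
first_five = df.loc[0:4, "sepal_length"]

# iloc is position-based and the end position is excluded:
# rows 0..3 and columns 0..4 give a 4 x 5 block.
block = df.iloc[0:4, 0:5]

print(first_five)
print(block)
```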

**2. Groupby function**

Pandas has a built-in `groupby` function that allows you to group rows together based on a column and apply an aggregate function to each group.

For example, you could calculate the mean of every numeric column for each group.

It is similar to the GROUP BY clause in SQL.

I have applied the `groupby` function on the column *species*.

The result of `groupby` is a Pandas `DataFrameGroupBy` object, not a dataframe.

Now we can apply aggregate functions on this object to get the required results.

Similarly, we can apply other functions like min, std, etc. on any of the columns.
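The grouping and aggregation steps above can be sketched as follows, assuming an illustrative stand-in for the iris data with two rows per species:

```python
import pandas as pd

# Illustrative stand-in for the iris data (hand-typed values, not the real file).
df = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 7.0, 6.4, 6.3, 7.1],
    "petal_length": [1.4, 1.4, 4.7, 4.5, 6.0, 5.9],
    "species": ["setosa", "setosa", "versicolor",
                "versicolor", "virginica", "virginica"],
})

grouped = df.groupby("species")
print(type(grouped))  # a DataFrameGroupBy object; nothing is computed yet

# Aggregations are evaluated per group and return one row per species.
means = grouped.mean()
mins = grouped["sepal_length"].min()
stds = grouped["sepal_length"].std()

print(means)
print(mins)
print(stds)
```

Selecting a column on the grouped object first, as in `grouped["sepal_length"]`, limits the aggregation to that column.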

**3. Map function**

Pandas `map` function applies a function to every element of a column.

Here I am extracting the integer part before the decimal point from the column *sepal_length* using the string `split` function.

The extraction needs to be done on all the rows, so instead of iterating over the entire dataframe I can use `map` and assign the output to a new column in the dataframe.
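A minimal sketch of this step is shown below. The helper name `integer_part` and the new column name `sepal_length_int` are hypothetical, chosen only for illustration, and the data is a hand-typed stand-in.

```python
import pandas as pd

# Illustrative stand-in for the iris data (hand-typed values, not the real file).
df = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 7.0, 6.4, 6.3, 7.1],
})


def integer_part(value):
    """Return the digits before the decimal point as an int (hypothetical helper)."""
    # str(5.1) is "5.1"; splitting on "." keeps "5" as the first piece.
    return int(str(value).split(".")[0])


# map calls integer_part once per element, so no explicit loop is needed.
df["sepal_length_int"] = df["sepal_length"].map(integer_part)
print(df)
```

For plain positive numbers like these, `df["sepal_length"].astype(int)` gives the same values without the string round trip; the split version is kept here because it mirrors the approach described above.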

**4. Shape and Size**

Pandas `shape` attribute gives the size of a dataframe in each dimension. It is a tuple, so its length is the number of dimensions.

Since dataframes are two-dimensional, what shape returns is the number of rows and columns.

Pandas `size` attribute, as the name suggests, returns the size of a dataframe, which is the number of rows multiplied by the number of columns.
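Both are attributes rather than methods, so they are read without parentheses. A short sketch on an illustrative stand-in dataframe:

```python
import pandas as pd

# Illustrative stand-in for the iris data (hand-typed values, not the real file).
df = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 7.0, 6.4, 6.3, 7.1],
    "sepal_width": [3.5, 3.0, 3.2, 3.2, 3.3, 3.0],
    "species": ["setosa", "setosa", "versicolor",
                "versicolor", "virginica", "virginica"],
})

n_rows, n_cols = df.shape   # shape is a (rows, columns) tuple
print(df.shape)             # (6, 3)
print(df.size)              # 18, i.e. 6 rows * 3 columns
```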

**5. Identifying missing values**

Identifying missing values is very important in Pandas, as they can cause errors or miscalculations in further processing.

To check whether the dataframe has any *null* or *na* values, we can use the `isnull()` or `isna()` functions. The two are aliases and return the same boolean mask.

On the result of these functions, you can apply `sum()`, `any()` or `all()` to get statistics about the missing values.
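To make these checks concrete, here is a sketch on a small stand-in dataframe. The missing values are injected on purpose for the demonstration, and the numbers are illustrative.

```python
import numpy as np
import pandas as pd

# Illustrative stand-in with deliberately injected missing values.
df = pd.DataFrame({
    "sepal_length": [5.1, np.nan, 7.0, 6.4],
    "sepal_width": [3.5, 3.0, np.nan, np.nan],
    "species": ["setosa", "setosa", "versicolor", None],
})

mask = df.isna()                           # True wherever a value is missing
missing_per_column = mask.sum()            # count of missing values per column
any_missing = mask.any()                   # does the column have at least one?
all_missing = mask.all()                   # is the whole column missing?
total_missing = int(mask.to_numpy().sum()) # count over the whole dataframe

print(missing_per_column)
print(any_missing)
print(all_missing)
print(total_missing)
```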

You can replace the *na* values by using the `fillna()` function.
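One possible way to fill the gaps is sketched below. Using the column mean for numbers and the placeholder `"unknown"` for text is an assumption made for illustration, not a recommendation for every dataset.

```python
import numpy as np
import pandas as pd

# Illustrative stand-in with deliberately injected missing values.
df = pd.DataFrame({
    "sepal_length": [5.1, np.nan, 7.0],
    "species": ["setosa", None, "versicolor"],
})

# A dict lets each column get its own replacement value.
filled = df.fillna({
    "sepal_length": df["sepal_length"].mean(),  # mean ignores missing values
    "species": "unknown",                       # hypothetical placeholder label
})

print(filled)
```

`fillna` returns a new dataframe by default, so `df` itself still contains the missing values.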

**6. Querying the data**

Pandas can also filter the dataset based on a condition. To query the data, we can pass the filter condition directly to `loc`.

Here I want to filter rows where *sepal_length* is greater than 7, for example with `df.loc[df['sepal_length'] > 7]`.
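A sketch of this filter on an illustrative stand-in dataframe follows. The second query, which combines two conditions, is an extra added here for illustration.

```python
import pandas as pd

# Illustrative stand-in for the iris data (hand-typed values, not the real file).
df = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 7.0, 6.4, 6.3, 7.1],
    "species": ["setosa", "setosa", "versicolor",
                "versicolor", "virginica", "virginica"],
})

# The comparison builds a boolean Series; loc keeps the rows where it is True.
long_sepals = df.loc[df["sepal_length"] > 7]

# Conditions are combined with & (and) or | (or), each wrapped in parentheses.
long_virginica = df.loc[(df["sepal_length"] > 6) & (df["species"] == "virginica")]

print(long_sepals)
print(long_virginica)
```

Note that the row with a *sepal_length* of exactly 7.0 is not returned, because the comparison is strict.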

**7. Sorting the data**

In Pandas we can sort the data by the values of one or more columns (or rows) using the `sort_values()` function.

Here, I have sorted the dataframe on column *sepal_length* and printed its top 5 rows.

The default sort order is ascending. You can change it by passing `ascending=False` to `sort_values`, which sorts the dataframe in descending order.
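Both sort orders can be sketched as follows, again on an illustrative stand-in dataframe:

```python
import pandas as pd

# Illustrative stand-in for the iris data (hand-typed values, not the real file).
df = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 7.0, 6.4, 6.3, 7.1],
    "species": ["setosa", "setosa", "versicolor",
                "versicolor", "virginica", "virginica"],
})

# Ascending is the default; head(5) keeps the first 5 rows of the sorted result.
smallest_first = df.sort_values("sepal_length").head(5)

# ascending=False flips the order so the largest values come first.
largest_first = df.sort_values("sepal_length", ascending=False).head(5)

print(smallest_first)
print(largest_first)
```

`sort_values` returns a sorted copy and keeps the original index labels, so `df` itself stays in its original order.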

I have tried to collate the Pandas functions that are used on a day-to-day basis. I hope you will find something useful here. Thank you for reading till the end. And if you like my blog, please hit the clap button below.