pandas.DataFrame.describe — pandas 0.20.2 documentation (2024)

Generates descriptive statistics that summarize the central tendency, dispersion and shape of a dataset’s distribution, excluding NaN values.

Analyzes both numeric and object series, as well as DataFrame column sets of mixed data types. The output will vary depending on what is provided. Refer to the notes below for more detail.

Parameters:

percentiles : list-like of numbers, optional

The percentiles to include in the output. All should fall between 0 and 1. The default is [.25, .5, .75], which returns the 25th, 50th, and 75th percentiles.

include : ‘all’, list-like of dtypes or None (default), optional

A white list of data types to include in the result. Ignored for Series. Here are the options:

  • ‘all’ : All columns of the input will be included in the output.
  • A list-like of dtypes : Limits the results to the provided data types. To limit the result to numeric types submit numpy.number. To limit it instead to categorical objects submit the numpy.object data type. Strings can also be used in the style of select_dtypes (e.g. df.describe(include=['O'])).
  • None (default) : The result will include all numeric columns.

exclude : list-like of dtypes or None (default), optional

A black list of data types to omit from the result. Ignored for Series. Here are the options:

  • A list-like of dtypes : Excludes the provided data types from the result. To exclude numeric types submit numpy.number. To exclude categorical objects submit the data type numpy.object. Strings can also be used in the style of select_dtypes (e.g. df.describe(exclude=['O'])).
  • None (default) : The result will exclude nothing.

Returns:

summary : Series/DataFrame of summary statistics

See also

DataFrame.count, DataFrame.max, DataFrame.min, DataFrame.mean, DataFrame.std, DataFrame.select_dtypes

Notes

For numeric data, the result’s index will include count, mean, std, min, max as well as lower, 50 and upper percentiles. By default the lower percentile is 25 and the upper percentile is 75. The 50 percentile is the same as the median.
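
The percentile rows can be changed through the percentiles argument. The following is a minimal sketch, assuming a recent pandas release is installed. The values 0.1, 0.5 and 0.9 are arbitrary choices for illustration, and 0.5 is passed explicitly so that the median row is requested rather than relied upon.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5])

# Ask for the 10th, 50th and 90th percentiles instead of the defaults.
summary = s.describe(percentiles=[0.1, 0.5, 0.9])
print(summary)

# The percentile rows are labelled with the requested values.
index_labels = list(summary.index)
```

Each requested percentile appears as its own row, labelled as a percentage, between min and max.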

For object data (e.g. strings or timestamps), the result’s index will include count, unique, top, and freq. The top is the most common value. The freq is the most common value’s frequency. Timestamps also include the first and last items.

If multiple object values have the highest count, then the count and top results will be arbitrarily chosen from among those with the highest count.

For mixed data types provided via a DataFrame, the default is to return only an analysis of numeric columns. If include='all' is provided as an option, the result will include a union of attributes of each type.

The include and exclude parameters can be used to limit which columns in a DataFrame are analyzed for the output. The parameters are ignored when analyzing a Series.
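
The string aliases understood by select_dtypes give a compact way to write these filters. The following is a minimal sketch, assuming a recent pandas release in which the alias 'number' selects every numeric dtype. The two-column frame is made up for illustration.

```python
import pandas as pd

# One numeric column and one column of strings.
df = pd.DataFrame({"numeric": [1, 2, 3], "object": ["a", "b", "c"]})

# Keep only the numeric columns.
only_numeric = df.describe(include=["number"])

# Drop the numeric columns and summarize whatever remains.
without_numeric = df.describe(exclude=["number"])

print(only_numeric.columns.tolist())
print(without_numeric.columns.tolist())
```

The first call summarizes only the numeric column, while the second summarizes only the string column.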

Examples

Describing a numeric Series.

>>> s = pd.Series([1, 2, 3])
>>> s.describe()
count    3.0
mean     2.0
std      1.0
min      1.0
25%      1.5
50%      2.0
75%      2.5
max      3.0
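
The figures in this summary can be reproduced by hand, which helps show what each row means. The following is a rough sketch using only the Python standard library. It assumes the sample standard deviation (dividing by n - 1) and linear interpolation between neighbouring sorted values for the percentiles, which are the pandas defaults. The helpers percentile and describe_numeric are hypothetical names for illustration, not pandas functions.

```python
import math
import statistics


def percentile(sorted_values, q):
    """Percentile of pre-sorted values by linear interpolation, 0 <= q <= 1."""
    pos = q * (len(sorted_values) - 1)
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return sorted_values[lo] + frac * (sorted_values[hi] - sorted_values[lo])


def describe_numeric(values, percentiles=(0.25, 0.5, 0.75)):
    """Hypothetical stand-in for the numeric summary; needs >= 2 non-NaN values."""
    # NaN values are dropped before anything is computed.
    clean = sorted(float(v) for v in values if not math.isnan(v))
    summary = {
        "count": float(len(clean)),
        "mean": statistics.mean(clean),
        "std": statistics.stdev(clean),  # sample standard deviation (n - 1)
        "min": clean[0],
    }
    for q in percentiles:
        summary["{:.0%}".format(q)] = percentile(clean, q)
    summary["max"] = clean[-1]
    return summary


result = describe_numeric([1, 2, 3])
for label, value in result.items():
    print(label, value)
```

For the input [1, 2, 3] this gives the same eight values as the summary shown for the numeric Series.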

Describing a categorical Series.

>>> s = pd.Series(['a', 'a', 'b', 'c'])
>>> s.describe()
count     4
unique    3
top       a
freq      2
dtype: object

Describing a timestamp Series.

>>> s = pd.Series([
...   np.datetime64("2000-01-01"),
...   np.datetime64("2010-01-01"),
...   np.datetime64("2010-01-01")
... ])
>>> s.describe()
count                       3
unique                      2
top       2010-01-01 00:00:00
freq                        2
first     2000-01-01 00:00:00
last      2010-01-01 00:00:00
dtype: object

Describing a DataFrame. By default only numeric fields are returned.

>>> df = pd.DataFrame([[1, 'a'], [2, 'b'], [3, 'c']],
...                   columns=['numeric', 'object'])
>>> df.describe()
       numeric
count      3.0
mean       2.0
std        1.0
min        1.0
25%        1.5
50%        2.0
75%        2.5
max        3.0

Describing all columns of a DataFrame regardless of data type.

>>> df.describe(include='all')
        numeric object
count       3.0      3
unique      NaN      3
top         NaN      b
freq        NaN      1
mean        2.0    NaN
std         1.0    NaN
min         1.0    NaN
25%         1.5    NaN
50%         2.0    NaN
75%         2.5    NaN
max         3.0    NaN

Describing a column from a DataFrame by accessing it as an attribute.

>>> df.numeric.describe()
count    3.0
mean     2.0
std      1.0
min      1.0
25%      1.5
50%      2.0
75%      2.5
max      3.0
Name: numeric, dtype: float64

Including only numeric columns in a DataFrame description.

>>> df.describe(include=[np.number])
       numeric
count      3.0
mean       2.0
std        1.0
min        1.0
25%        1.5
50%        2.0
75%        2.5
max        3.0

Including only string columns in a DataFrame description.

>>> df.describe(include=[np.object])
       object
count       3
unique      3
top         b
freq        1

Excluding numeric columns from a DataFrame description.

>>> df.describe(exclude=[np.number])
       object
count       3
unique      3
top         b
freq        1

Excluding object columns from a DataFrame description.

>>> df.describe(exclude=[np.object])
       numeric
count      3.0
mean       2.0
std        1.0
min        1.0
25%        1.5
50%        2.0
75%        2.5
max        3.0
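
Describing a Series that contains missing values. The summary states that NaN values are excluded, and the following minimal sketch illustrates this, assuming recent pandas and NumPy releases are installed. The three-element Series is made up for illustration.

```python
import numpy as np
import pandas as pd

# One of the three entries is missing.
s = pd.Series([1.0, np.nan, 3.0])

summary = s.describe()
print(summary)

# Only the two non-missing values are counted and averaged.
non_missing = summary["count"]
mean_value = summary["mean"]
```

The count row reports the number of non-missing values rather than the length of the Series, and the remaining statistics are computed from those values only.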