25 Years of Programming
An open source source for C, C++, OWL, BASIC, MDB, XLS, DOT, and more...

Online descriptive statistics calculator, with JavaScript code listing

This interactive calculator allows you to compute basic statistics on a set of numbers.

Currently supported are Count, Minimum, Maximum, Range, Sum, Arithmetic Mean (Average), Harmonic Mean, Geometric Mean, Median, Mode, and Variance and Standard Deviation using N and N-1 weighting.

The calculation is done in your browser by a JavaScript function. Its code listing is also on this page.

The JavaScript function takes as input an array whose elements contain values that can be extracted as numbers with parseFloat(). It returns an associative array whose appropriately-named elements contain the statistics about the input set.
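
As a rough illustration of what "can be extracted as numbers with parseFloat()" means, the sketch below runs parseFloat() over a mixed array. The helper name extractNumbers is hypothetical and is not part of the published function; it only mirrors the filtering idea.

```javascript
// Hypothetical helper (not part of descriptivestatistics.js): keeps only
// the elements that parseFloat() can turn into a number.
function extractNumbers(input)
{
	var out = [];
	for(var i = 0 ; i < input.length ; i++)
	{
		var x = parseFloat(input[i]);	// parses a leading number, ignores trailing text
		if(!isNaN(x))
			out.push(x);
	}
	return out;
}

var numbers = extractNumbers(["3.5", 2, "1e2", "7 apples", "abc", ""]);
// numbers is [3.5, 2, 100, 7]; "abc" and "" give NaN and are skipped
```

Note that parseFloat() accepts a value such as "7 apples" because it stops at the first character it cannot use, so a leading number is enough.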

You can copy the function's code below, or download (Save) the file (7 KB).

I also have a C++ class statistical calculator that has many more features.

Online statistical calculator

Instructions:

  1. Type or copy and paste your list of numbers into the input box below. The data should contain no characters except .-+0123456789eE. All other characters will be converted to spaces.
  2. If for some reason you want to reverse the order of the numbers, click Reverse.
  3. Click Calculate.
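
The cleanup described in step 1 could be sketched roughly as follows. The page's own input-handling code is not shown in the listing, so the function name cleanInput and the exact regular expression are assumptions made for illustration.

```javascript
// Hypothetical sketch of the cleanup step; the page's actual code is not shown.
function cleanInput(text)
{
	// Replace every character that is not one of .-+0123456789eE with a space.
	var cleaned = text.replace(/[^.\-+0-9eE]/g, " ");
	// Split on runs of spaces and drop empty pieces.
	var parts = cleaned.split(/ +/);
	var items = [];
	for(var i = 0 ; i < parts.length ; i++)
		if(parts[i] != "")
			items.push(parts[i]);
	return items;
}

var items = cleanInput("12, 7.5;\n-3\t1e3 $40");
// items is ["12", "7.5", "-3", "1e3", "40"]
```

The resulting array of strings is the kind of input the statistics function expects, since each element can be read with parseFloat().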


Description of each calculated value:
Inputs OK: If any of the text could not be interpreted as numeric, this would show as False. However, because characters that are illegal in a number are removed before the list is processed, this will always be True, and a better validity check is to make sure the Count below matches the number of items you know are in the list. You can also check for illegal characters by clicking Reverse twice and checking to see if any of the numbers got changed.  
Count: The number of data items.  
Minimum: The lowest number.  
Maximum: The highest number.  
Range = Maximum - Minimum.  
Sum: All the numbers added together, N1 + N2 + N3...  
Arithmetic Mean (the Average) = Sum / Count  
Harmonic Mean = Count / (1/N1 + 1/N2 + 1/N3...)  
Geometric Mean = (N1 * N2 * N3...)^(1/Count)  
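
The two formulas above can be sketched as small standalone functions. The names harmonicMean and geometricMean are hypothetical, both assume every value is greater than zero, and the geometric mean here sums logarithms, which is a different technique from the per-element Math.pow() used in the listing further down.

```javascript
// Hypothetical helpers illustrating the two formulas; both assume every value is > 0.
function harmonicMean(values)
{
	var sumOfReciprocals = 0;
	for(var i = 0 ; i < values.length ; i++)
		sumOfReciprocals += 1 / values[i];
	return values.length / sumOfReciprocals;
}

function geometricMean(values)
{
	// Summing logarithms is equivalent to multiplying the values,
	// but avoids overflow when the product would be very large.
	var sumOfLogs = 0;
	for(var i = 0 ; i < values.length ; i++)
		sumOfLogs += Math.log(values[i]);
	return Math.exp(sumOfLogs / values.length);
}

var hm = harmonicMean([1, 2, 4]);	// 3 / (1 + 0.5 + 0.25) = 3 / 1.75, about 1.714
var gm = geometricMean([1, 2, 4]);	// cube root of 8, about 2
```
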
Population Standard Deviation (N weighting): The actual standard deviation of the numbers in your list, when they constitute the entire population (data set) under consideration. If your numbers are a sample (subset) of a larger set of numbers, use the Estimated Standard Deviation below. 

If the data set is "normally distributed" (its graph makes a "bell shaped curve"), about 68% of the values lie within 1 standard deviation (SD) of the mean, about 95% lie within 2 SD, and about 99.7% lie within 3 SD.

 
Population Variance: the (N weighted) standard deviation, squared.  
Estimated Standard Deviation (N-1 weighting): When your set of numbers is a representative sample (subset) of a larger population and you want to estimate the standard deviation of that whole population, it is common to use this Estimated Standard Deviation. Using (N-1) instead of (N) in the denominator gives a slightly larger result. The reasoning is that deviations are measured from the sample's own mean, which sits closer to the sample's values than the true population mean does. Thus, the Population StdDev (N weighting, above) of a sample tends to underestimate the true standard deviation of the larger population, so the (N-1) weighting is used instead.   
Estimated Variance: the (N-1 weighted) standard deviation, squared.  
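
To make the difference between the N and N-1 weightings concrete, here is a minimal sketch. The function name standardDeviations is hypothetical, and it uses a two-pass approach (mean first, then squared deviations) instead of the sum-of-squares formula in the listing below.

```javascript
// Hypothetical two-pass sketch: pass 1 finds the mean, pass 2 sums squared deviations.
function standardDeviations(values)
{
	var n = values.length, i, mean = 0, sumSq = 0;
	for(i = 0 ; i < n ; i++)
		mean += values[i];
	mean /= n;
	for(i = 0 ; i < n ; i++)
		sumSq += (values[i] - mean) * (values[i] - mean);
	return {
		population: Math.sqrt(sumSq / n),					// N weighting
		estimated: n > 1 ? Math.sqrt(sumSq / (n - 1)) : 0	// N-1 weighting, needs N > 1
	};
}

var sd = standardDeviations([2, 4, 4, 4, 5, 5, 7, 9]);
// sd.population is 2; sd.estimated is sqrt(32 / 7), about 2.138
```

For this data the mean is 5 and the squared deviations add up to 32, so the only difference between the two results is dividing by 8 or by 7.
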
Median (midpoint, middle point, C50): half the numbers in the list are above the median, and half below. If the number of data points is even, the median is an interpolated value halfway between the two middle data points.  
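
The odd and even cases can be sketched like this. The function name median is hypothetical and the sketch assumes the list holds at least one number.

```javascript
// Hypothetical sketch of the median rule; assumes at least one value.
function median(values)
{
	// Copy before sorting so the caller's array is left unchanged.
	var sorted = values.slice().sort(function(l, r){ return l - r; });
	var mid = Math.floor(sorted.length / 2);
	if(sorted.length % 2 == 1)
		return sorted[mid];						// odd count: the middle item
	return (sorted[mid - 1] + sorted[mid]) / 2;	// even count: halfway between the two middle items
}

var oddMedian = median([7, 1, 3]);		// 3
var evenMedian = median([7, 1, 3, 10]);	// (3 + 7) / 2 = 5
```
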
Mode: The most frequently occurring number(s) in the list. If there was only one input number, it is the mode. Otherwise, a data point can only be a mode if its frequency is > 1. If no value is shown, there is no legitimate modal value. If there are multiple values with the same frequency > 1, they are separated by commas.   
Mode Frequency: The number of occurrences of the modal value(s).  
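
The mode rules above (ties are all reported, and a mode needs a frequency above 1 unless there is only one input) might be sketched as follows. The function name modes and the shape of its return value are assumptions for illustration, not the listing's own interface.

```javascript
// Hypothetical sketch of the mode rules described above.
function modes(values)
{
	var sorted = values.slice().sort(function(l, r){ return l - r; });
	var best = [], bestCount = 0, i = 0;
	while(i < sorted.length)
	{
		var j = i;
		while(j < sorted.length && sorted[j] == sorted[i])
			j++;							// j - i is the frequency of sorted[i]
		if(j - i > bestCount)
		{
			best = [sorted[i]];				// new highest frequency: start a fresh list
			bestCount = j - i;
		}
		else if(j - i == bestCount)
			best.push(sorted[i]);			// tie: add to the list
		i = j;
	}
	// With more than one input value, a mode must occur at least twice.
	if(sorted.length > 1 && bestCount < 2)
		return { values: [], frequency: 0 };
	return { values: best, frequency: bestCount };
}

var m1 = modes([3, 1, 3, 2, 1]);	// values [1, 3], frequency 2
var m2 = modes([1, 2, 3]);			// values [], frequency 0
var m3 = modes([5]);				// values [5], frequency 1
```
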
Sum of (X^2): Each number in the list is squared (X*X), and then all the squares are added together. Some statistics calculations use this.   
(Sum of X)^2: All the numbers are added together (the Sum), and then the total is squared. Some statistics calculations use this.   
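
One calculation that uses both of these sums is the variance. The sketch below is a hypothetical helper (the name variancesFromSums is an assumption) built on the same sum-based formula as the listing that follows; the Math.max(0, ...) clamp is an added safeguard against tiny negative rounding errors.

```javascript
// Hypothetical sketch: variances computed from the sum of squares and the square of the sum.
function variancesFromSums(values)
{
	var n = values.length, sumX = 0, sumX2 = 0;
	for(var i = 0 ; i < n ; i++)
	{
		sumX += values[i];
		sumX2 += values[i] * values[i];
	}
	// N * Sum of (X^2) - (Sum of X)^2, clamped so rounding cannot make it negative.
	var numerator = Math.max(0, n * sumX2 - sumX * sumX);
	return {
		sumOfSquares: sumX2,
		squareOfSum: sumX * sumX,
		population: numerator / (n * n),					// N weighting
		estimated: n > 1 ? numerator / (n * (n - 1)) : 0	// N-1 weighting
	};
}

var v = variancesFromSums([2, 4, 4, 4, 5, 5, 7, 9]);
// v.sumOfSquares is 232, v.squareOfSum is 1600,
// v.population is 4, v.estimated is 32 / 7 (about 4.571)
```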

Descriptive Statistics JavaScript

 
/*	descriptivestatistics.js		12-31-2008		JavaScript
	Copyright (C)2008 Steven Whitney.
	Initially published by http://25yearsofprogramming.com.

	This program is free software; you can redistribute it and/or
	modify it under the terms of the GNU General Public License (GPL)
	Version 3 as published by the Free Software Foundation.
	This program is distributed in the hope that it will be useful,
	but WITHOUT ANY WARRANTY; without even the implied warranty of
	MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
	GNU General Public License for more details.
	You should have received a copy of the GNU General Public License
	along with this program; if not, write to the Free Software
	Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.

Given an input set of values, it calculates and returns statistics describing the set.

It returns an Array object used as an associative array: the statistics are stored
as named properties, which can be accessed as either array elements or properties.
Examples: Stats["Count"] or Stats.Count
In the unlikely situation of iterating its members, use 
for(i in array).
for(i = 0 ; i < array.length ; i++) 
does not work, because named properties are not indexed elements and array.length stays 0.
	
*/
//----------------------------------------------------------------------------------------------
// Input must be an array with the data items in its successive elements.
// The return value is an associative array with the stats in its elements.
function DescriptiveStatistics(Input)
{
var Raw = new Array();					// an array for holding only the legitimate numeric values from Input
var Stats = new Array();				// the array for holding the statistics
var AllowGHMean = true;					// any input <= 0 makes calculation of the Geometric and Harmonic means invalid
Stats["Count"] = 0;						// N, the number of values in the input list
Stats["SumX"] = 0;						// Sum of all the X input values
Stats["SumX2"] = 0;						// Each X is squared, then all the squares are summed
Stats["Minimum"] = Number.NaN;			// Lowest value encountered
Stats["Maximum"] = Number.NaN;			// Highest value encountered
Stats["Range"] = Number.NaN;			// Highest - Lowest
Stats["StdDevPop"] = Number.NaN;		// Population Standard Deviation (N weighting)
Stats["StdDevEst"] = Number.NaN;		// Estimated Standard Deviation from sampled data (N-1 weighting)
Stats["ArithMean"] = Number.NaN;		// Arithmetic mean (average)
Stats["HarmonicMean"] = Number.NaN;		// Harmonic mean
Stats["GeometricMean"] = Number.NaN;	// Geometric mean
Stats["Median"] = Number.NaN;			// Median, C50, midpoint. Half the values fall above/below this value.
Stats["Modes"] = new Array();			// Modes, most frequent input value(s). It is an array because there can be > 1 mode.
Stats["ModeFrequency"] = 0;				// Number of occurrences of the modal value.
Stats["IsOk"] = true;					// True only if all input values were successfully parsed as numbers. 

// Could do this in two passes for better "numerical stability", 
// although lack of significant digits is hardly a likely problem.
// Pass 1: transfer the data from Input to Raw, then sort Raw from smallest absolute value to largest.
// Pass 2: do the math calculations 
var x, i, tally;
for(i = 0 ; i < Input.length ; i++)
{
	x = parseFloat(Input[i]);
	if(isNaN(x))
	{
		// Since failed values are ignored, the stats might be ok even if this flag is set, but this is a warning.
		Stats["IsOk"] = false;
	}
	else
	{
		Raw.push(x);
		if(x <= 0)
			AllowGHMean = false;
		// Delay initializing Min and Max until now so they remain NaN if there are no valid numbers in Input array.
		if(Stats["Count"] == 0)	
		{
			Stats["Minimum"] = Number.MAX_VALUE;
			Stats["Maximum"] = -(Number.MAX_VALUE);
		}
		Stats["Count"]++;
		Stats["SumX"] += x;
		Stats["SumX2"] += x * x;
		Stats["Minimum"] = Math.min(Stats["Minimum"],x);
		Stats["Maximum"] = Math.max(Stats["Maximum"],x);
	}
}
if(Stats["Count"] > 0)	
{
	Raw.sort(function(l,r){return l - r;});	// sort numerically for mode and median calculations

	Stats["Range"] = Stats["Maximum"] - Stats["Minimum"];
	Stats["ArithMean"] = Stats["SumX"] / Stats["Count"];
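	// Pop calculation is always valid. If N==1, Pop and Est are both 0. If N>1, value of Est gets overwritten later.
	// Math.max(0, ...) guards against floating-point rounding making the difference slightly negative, which would give NaN.
	Stats["StdDevEst"] = Stats["StdDevPop"] = Math.sqrt(Math.max(0, (Stats["Count"] * Stats["SumX2"]) - (Stats["SumX"] * Stats["SumX"]))) / Stats["Count"];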
	Stats["Median"] = Raw[0];	// default value, for Count == 1; will be overridden if Count > 1
	
	if(AllowGHMean == true)
	{
		// Harmonic mean calculation 
		x = 0;
		for(i = 0 ; i < Stats["Count"] ; i++)
			x += (1 / Raw[i]);
		Stats["HarmonicMean"] = Stats["Count"] / x;
		
		// Geometric mean calculation 
		x = 1;
		for(i = 0 ; i < Stats["Count"] ; i++)
			x *= Math.pow(Raw[i], 1 / Stats["Count"]);	// this calc avoids math overflow
		Stats["GeometricMean"] = x;
	}

	// Mode calculation. Allows for multimodal data sets.
	x = Raw[0];								// each number encountered, initialized to first element
	tally = 1;								// tallies frequency of each; first element occurs at least once.
	for(i = 1 ; i < Stats["Count"] ; i++)
	{
		if(Raw[i] == x)							// if it's another occurrence,
			tally++;							// just increment the counter
		else                        			// else if we hit a new #,
		{										// first decide if the old number is a mode candidate.
			if(tally == Stats["ModeFrequency"])	// if tally is a tie, add number to the modes list
				Stats["Modes"].push(x);   		
			if(tally > Stats["ModeFrequency"])	// if there is a new higher frequency,
			{
				Stats["Modes"].length = 0;		// delete all previous mode candidates
				Stats["Modes"].push(x);   		// add this one to the list
				Stats["ModeFrequency"] = tally;	// and update the highest count counter
			}
			x = Raw[i];   						// now start tallying the new number
			tally = 1;							// it has already occurred once
		}
	}
	if(tally == Stats["ModeFrequency"])		// final check: maybe the last # was also a potential mode
		Stats["Modes"].push(x);   		
	if(tally > Stats["ModeFrequency"])		
	{
		Stats["Modes"].length = 0;
		Stats["Modes"].push(x);
		Stats["ModeFrequency"] = tally;
	}
}
if(Stats["Count"] > 1)	
{
	// Mode, continued: if there was only 1 input value, it's ok to let it be the mode,
	// but if there were multiple input values, minimum frequency for the mode is 2.
	if(Stats["ModeFrequency"] < 2)		
	{
		Stats["Modes"].length = 0;		// No legitimate mode found.
		Stats["ModeFrequency"] = 0;		// No occurrences.
	}

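	// Estimated Standard Deviation is only valid when Count > 1, to avoid divide by zero.
	// Math.max(0, ...) guards against floating-point rounding making the difference slightly negative, which would give NaN.
	Stats["StdDevEst"] = 
		Math.sqrt(Math.max(0, (Stats["Count"] * Stats["SumX2"]) - (Stats["SumX"] * Stats["SumX"])) / (Stats["Count"] * (Stats["Count"] - 1)));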

	// Median calculation (midpoint of data points)
	i = Math.floor(Stats["Count"] / 2);		// in JavaScript, must explicitly truncate to integer
	if((Stats["Count"] % 2) == 1)			// if Count is odd, the center point is known
		Stats["Median"] = Raw[i];	
	else									// if Count is even, interpolate to get a "center" point
		Stats["Median"] = (Raw[i - 1] + Raw[i]) / 2;
}
return Stats;	
}
//----------------------------------------------------------------------------------------------
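
The header comment of the listing notes that only for(i in array) iterates the returned statistics. A small standalone sketch of why: values stored under names on an Array are properties, not indexed elements. The variable names here are made up for the demonstration.

```javascript
// Hypothetical demonstration, independent of DescriptiveStatistics().
var demo = new Array();
demo["Count"] = 3;
demo["SumX"] = 12;

var indexedLength = demo.length;	// 0: named properties are not array elements

var names = [];
for(var name in demo)				// for...in visits the named properties
	names.push(name);
// names contains "Count" and "SumX"
```

Because length stays 0, an indexed loop over such an array never runs its body, which matches the behavior the comment describes.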

 
