The Computer Scientist’s Use of Statistics

July 25, 2019

If you are contemplating or enrolled in a computer science program, you probably know that statistics play a role in computer science. What you may not fully appreciate is how deep the relationship between the two fields runs. Michigan Technological University’s blog post “Why Computer Scientists can Benefit from Studying Applied Statistics” describes many of the uses of statistics, particularly in data science. Statistics (applied statistics in particular) are so critical to the work of a computer scientist that it is helpful to describe how they regularly come into play.

Computer Scientist: A Day in the Life

Let’s take a look at what a typical routine for a computer scientist might entail. Depending on the specific area of computer science (software development/engineering, programming, security, and so on), a computer scientist is apt to engage in the following:

Designing/maintaining information systems, hardware, and software
Data mining
Programming/Coding
Reporting on findings

It is fair to say that a day in the life of a computer scientist is one of knowledge discovery. But where do statistics come in? At very nearly every point, as knowledge discovery is central to statistics as well. Consider the endeavors listed above: statistics inform most of them. For example, designing computer systems generally involves modeling and simulation.

Statistics in Modeling and Simulation

Very often, computer science models are based on probability, so a solid understanding of probability theory is essential. These models are built on statistical algorithms and are usually used either to predict or to describe. Statistics used in predictive modeling typically include regression (examining the effect of one or more variables on an outcome and the magnitude of the relationship between them) and classification techniques (such as decision trees and neural networks). These models are seen across industries. One example is marketing, where predictive models are built to better understand consumers' buying behavior. Consumers might also be categorized (seasonal, impulse, designer-driven, etc.). Once the model is built, the computer scientist should know how to employ statistical resampling methods, such as cross-validation or the bootstrap, to test its predictive power.
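
As a rough illustration of testing a regression model with resampling, the sketch below fits a one-variable least-squares line and estimates its out-of-sample error with k-fold cross-validation. The data are synthetic, and the names `fit_line`, `k_fold_rmse`, `spend`, and `purchases` are hypothetical, chosen only for this example.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares with one predictor: returns (slope, intercept)."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def k_fold_rmse(xs, ys, k=5, seed=0):
    """Average root-mean-square error on held-out folds."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k disjoint folds
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        slope, intercept = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        # Score the model only on points it did not see during fitting.
        errors = [(ys[i] - (slope * xs[i] + intercept)) ** 2 for i in held_out]
        scores.append(statistics.fmean(errors) ** 0.5)
    return statistics.fmean(scores)


# Synthetic (assumed) data: ad spend versus purchases, with random noise.
rng = random.Random(42)
spend = [float(x) for x in range(1, 41)]
purchases = [3.0 * x + 10.0 + rng.gauss(0, 2.0) for x in spend]

slope, intercept = fit_line(spend, purchases)
cv_rmse = k_fold_rmse(spend, purchases)
print(f"slope={slope:.2f} intercept={intercept:.2f} cv_rmse={cv_rmse:.2f}")
```

The cross-validated error is a more honest measure of predictive power than the error on the training data, because each fold is scored by a model that never saw it.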

Computer scientists design and deploy simulations to test new systems before they are released into production. Such simulations may be used to test for specification conformance, system reliability, and performance optimization. Statistical measures (probability, patterns, etc.) are at the heart of these simulations, which are often used to answer “what if” questions (e.g., “What if I add another component to the system?”).
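
A “what if” question of this kind can be sketched as a small Monte Carlo simulation. The example below assumes a system whose components must all work for the system to work (a series system) and that components fail independently. The reliabilities and the function name `simulate_uptime` are hypothetical.

```python
import random


def simulate_uptime(reliabilities, trials=100_000, seed=1):
    """Estimate the probability that every component in a series system works."""
    rng = random.Random(seed)
    successes = 0
    for _ in range(trials):
        # The system is up only if each component independently works.
        if all(rng.random() < p for p in reliabilities):
            successes += 1
    return successes / trials


baseline = [0.99, 0.98, 0.97]    # assumed per-component reliabilities
with_extra = baseline + [0.95]   # what if we add a fourth component?

p_base = simulate_uptime(baseline)
p_extra = simulate_uptime(with_extra)
print(f"baseline system:      {p_base:.3f}")
print(f"with extra component: {p_extra:.3f}")
```

For a system this simple the answer can also be computed directly by multiplying the reliabilities. Simulation becomes useful when the system is too complex for a closed-form answer.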

Statistics in Data Mining

Among the most critical statistics for computer science are those employed in data mining. Data mining is used to explore relationships and patterns within datasets. We are immersed in the era of big data: volumes of data from multiple sources in varying formats, most of it readily and inexpensively accessible. With such access to vast stores of data, data literacy becomes key to a successful data mining mission. Anyone looking to collect or otherwise produce a meaningful dataset must understand the components of the data and the business or research question they intend to address.

Too often, data are collected without regard to sampling theory. This typically results in data that are neither appropriate for the business question nor sufficient for proper analysis. It is equally essential to understand the distribution of the data, which clues us in to the shape and variability of the dataset. Without such understanding, the wrong kind of analytical technique could be used, producing inaccurate or misleading results.
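
To see why sampling method matters, consider a minimal sketch with a made-up population whose values trend upward in storage order. Taking the first rows (a convenience sample) gives a badly biased estimate of the mean, while a simple random sample of the same size lands much closer. All data and variable names here are hypothetical.

```python
import random
import statistics

# Assumed population: 1,000 records stored in order, with values that
# trend upward over time.
population = [float(i) for i in range(1000)]

# Convenience sample: just take the first 100 rows.
first_rows = population[:100]

# Simple random sample: every record has the same chance of selection.
rng = random.Random(7)
random_rows = rng.sample(population, 100)

true_mean = statistics.fmean(population)
convenience_mean = statistics.fmean(first_rows)
random_mean = statistics.fmean(random_rows)

print(f"population mean:         {true_mean:.1f}")
print(f"first-100-rows estimate: {convenience_mean:.1f}")
print(f"random-sample estimate:  {random_mean:.1f}")
```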

Descriptive statistics are frequently employed to investigate and understand a dataset prior to any form of analysis or modeling. Such statistics include:

Central tendency (mean, mode, median)
Dispersion (variance of the data)
The distribution

A computer scientist would want to be very comfortable with producing these statistics and understanding what they mean before proceeding with any type of analysis. The impact of a single undetected outlier on the results could be more significant than imagined.
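
The effect of one outlier on these statistics can be shown with Python's standard `statistics` module. The response times below are invented for illustration, and `describe` is a hypothetical helper.

```python
import statistics


def describe(values):
    """Return basic measures of central tendency and dispersion."""
    return {
        "mean": statistics.fmean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "variance": statistics.variance(values),  # sample variance
    }


# Assumed data: response times in milliseconds.
response_ms = [102, 98, 105, 101, 99, 103, 100, 97, 104, 101]
with_outlier = response_ms + [5000]  # one undetected bad reading

clean = describe(response_ms)
skewed = describe(with_outlier)

for name in clean:
    print(f"{name:>8}: clean={clean[name]:.1f}  with outlier={skewed[name]:.1f}")
```

In this sketch the median and mode do not move when the outlier is added, while the mean and variance change dramatically. This is one reason to look at several descriptive statistics before choosing an analytical technique.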

Programming/Coding

As with understanding the data being analyzed, statistics inform the type of programming necessary to answer business questions. Statistical knowledge guides the programming itself (such as the algorithms developed), along with its testing and validation. Statistical know-how helps the computer scientist write more efficient and accurate programs that produce meaningful results.

Representing and Reporting

A frequently overlooked aspect of computer science is the representation and reporting of findings. However, this is just as crucial as the technical work of producing the results. This is another case where statistics and computer science merge: how best to visually display, summarize, and report results. A weak representation of even the most robust analysis will diminish its impact. Reporting and communicating make up a significant part of any statistician's training, which is yet another reason why a computer scientist would want to reap the benefits of statistical study.

How to Learn Statistics for Computer Science

While some computer science programs may require a course in statistics, often that course alone is insufficient to provide the computer scientist with the well-rounded statistical skill set needed. The computer scientist who is serious about their career should strongly consider an additional program or degree in statistics. An applied statistics program with an emphasis on solving real-world problems that computer scientists face daily would be particularly relevant.

The good news is that there are online graduate programs such as Michigan Tech’s Master of Science in Applied Statistics. Michigan Tech’s program is geared toward professionals who are looking for a robust program that won’t take years to complete but offers the required knowledge to further a computer science career.
