Educational research suggests that creating and solving statistical problems with real data works best when the following four steps are followed:

 Four-Step Statistical Process:

1. Plan (Ask a question): formulate a statistical question that can be answered with data. A good deal of time should be given to this step as it is the most important step in the process.

Statistical Questions:
"How tall are the students in my class?"
"How many M&M's are in each package?"
"How old are the students in Mrs. Smith's class?"
There will be a variety of correct answers to each of the questions listed above.
 Statistical questions anticipate variability in the answers.

Not a statistical question: "How old is John?" This question has a single correct answer, so no variability is expected.
 Keep in mind: What is the objective? What are the best questions to ask? What group will be surveyed? Am I looking for a specific result?

2. Collect (Produce Data): design and implement a plan to collect appropriate data. Data can be collected through numerous methods, such as observations, interviews, questionnaires, databases, sampling, or experimentation. Randomly collected data tend to yield the most reliable results and help reduce bias.
 Keep in mind: What method will be used to collect the data? Will it be possible to access the entire population? Will a sample of the population be more realistic? How can random sampling be accomplished?
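
One way to carry out random sampling is a simple random sample, in which every member of the population has the same chance of being chosen. The Python sketch below is only an illustration: the roster of 30 numbered students, the sample size of 8, and the function name `simple_random_sample` are assumptions made for the example, not part of the process itself.

```python
import random


def simple_random_sample(population, sample_size, seed=None):
    """Return sample_size distinct members, each equally likely to be chosen."""
    # A fixed seed makes the draw repeatable so the result can be checked.
    rng = random.Random(seed)
    return rng.sample(population, sample_size)


# Hypothetical class roster: 30 students labeled by number.
roster = [f"Student {i}" for i in range(1, 31)]

# Draw 8 students without repeats.
sample = simple_random_sample(roster, sample_size=8, seed=1)
print(sample)
```

Because the selection is left to chance rather than to the person collecting the data, no student is favored, which is what helps keep bias out of the sample.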

3. Process (Analyze the Data): organize and summarize the data by graphical or numerical methods. Graph numerical data using histograms, dot plots, and/or box plots, and analyze the strengths and weaknesses of each display.
 Keep in mind: What charts or graphs will be used? What statistical information will help explain the data? What numerical computations are needed? How will strengths and weaknesses be determined?
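
As a rough illustration of summarizing data both numerically and graphically, the Python sketch below computes a five-number summary (the values a box plot displays) and prints a text dot plot. The candy counts are made-up data, and the function names are assumptions for the example. Textbooks differ on how quartiles are computed, so the quartile values here may not match a hand calculation done with another method.

```python
import statistics
from collections import Counter

# Hypothetical data: number of candies counted in 12 packages.
counts = [54, 56, 55, 57, 55, 58, 56, 55, 53, 56, 57, 55]


def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) for the data."""
    # quantiles(..., n=4) returns the three cut points Q1, median, Q3.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return min(data), q1, median, q3, max(data)


def dot_plot(data):
    """Build a text dot plot: one row per whole-number value, one * per data point."""
    tally = Counter(data)
    rows = []
    for value in range(min(data), max(data) + 1):
        rows.append(f"{value:>3} | " + "*" * tally[value])
    return "\n".join(rows)


print(five_number_summary(counts))
print(dot_plot(counts))
```

The dot plot shows every individual value, which works well for small data sets, while the five-number summary condenses the data and hides individual points. Comparing the two is one way to think about the strengths and weaknesses of each display.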

4. Discuss (Interpret the Results): interpret your findings from the analysis of the data, in the context of the original problem. Give an interpretation of how the data answer your original question. The data collected will have a "distribution" which can be described by its center, spread, and overall shape.
 Keep in mind: What conclusions can be drawn? Will conclusions extend to the entire population? Does the statistical data support the conclusions? How will the conclusions be presented?
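
To make "center, spread, and overall shape" concrete, the Python sketch below describes a small data set. The heights are made-up values, and the function name `describe_distribution` is an assumption for the example. The shape label comes from a rough rule of thumb (comparing the mean with the median) chosen for this sketch. It is not a formal test, and a graph of the data should always be checked as well.

```python
import statistics


def describe_distribution(data):
    """Summarize the center, spread, and a rough guess at the shape of the data."""
    mean = statistics.mean(data)
    median = statistics.median(data)
    data_range = max(data) - min(data)
    stdev = statistics.stdev(data)  # sample standard deviation

    # Rough heuristic (an assumption of this sketch): a mean well above the
    # median hints at a longer right tail, well below at a longer left tail.
    tolerance = 0.1 * stdev
    if mean - median > tolerance:
        shape = "possibly skewed right"
    elif median - mean > tolerance:
        shape = "possibly skewed left"
    else:
        shape = "roughly symmetric"

    return {
        "mean": mean,
        "median": median,
        "range": data_range,
        "stdev": stdev,
        "shape": shape,
    }


# Hypothetical data: heights (in cm) of 10 students in a class.
heights_cm = [150, 152, 155, 155, 157, 158, 160, 160, 162, 165]
result = describe_distribution(heights_cm)
for name, value in result.items():
    print(f"{name}: {value}")
```

A summary like this supports the interpretation step, but the conclusion still has to be stated in the context of the original question, for example by reporting a typical height for the class and how much the heights vary.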