Calc Guide 7.1
Chapter 9
Data Analysis
Using Scenarios, Goal Seek, Solver, Statistics, others
This document is Copyright © 2021 by the LibreOffice Documentation Team. Contributors are listed below. You may distribute it and/or modify it under the terms of either the GNU General Public License (http://www.gnu.org/licenses/gpl.html), version 3 or later, or the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), version 4.0 or later.
All trademarks within this guide belong to their legitimate owners.
Felipe Viggiano |
Kees Kriek |
|
Barbara Duprey |
Jean Hollis Weber |
John A Smith |
Kees Kriek |
Zachary Parliman |
Olivier Hallot |
Steve Fanning |
Leo Moons |
|
Please direct any comments or suggestions about this document to the Documentation Team’s mailing list: documentation@global.libreoffice.org.
Note
Everything you send to a mailing list, including your email address and any other personal information that is written in the message, is publicly archived and cannot be deleted.
Published May 2021. Based on LibreOffice 7.1 Community.
Other versions of LibreOffice may differ in appearance and functionality.
Some keystrokes and menu items are different on macOS from those used in Windows and Linux. The table below gives some common substitutions for the instructions in this book. For a more detailed list, see the application Help and Appendix A (Keyboard Shortcuts) to this guide.
Windows or Linux |
macOS equivalent |
Effect |
Tools > Options menu selection |
LibreOffice > Preferences |
Access setup options |
Right-click |
Control+click and/or right-click depending on computer setup |
Open a context menu |
Ctrl (Control) |
⌘ (Command) |
Used with other keys |
F11 |
⌘+T |
Open the Sidebar’s Styles deck |
Once you are familiar with functions and formulas, the next step is to learn how to use Calc’s automated processes to quickly perform useful analysis of your data.
In addition to formulas and functions, Calc includes several tools for processing your data. These tools include features for copying and reusing data, creating subtotals, running what-if analysis, and performing statistical analysis. They can be found under the Tools and Data menus on the Menu bar. Although they are not necessary when using Calc, they can help you save time and effort with handling large data sets or preserving your work for future review.
Note
A related tool, the Pivot Table, is not mentioned here, as it is complex enough to require its own chapter. See Chapter 8, Using Pivot Tables, for more information.
The Consolidate tool allows you to combine and aggregate data spread across one or more sheets. This tool is useful if you need to quickly summarize a large, scattered set of data for review. For example, you could use it to consolidate multiple department budgets from different sheets into a single company-wide budget contained in a master sheet.
To consolidate data:
1) Open the document containing the cell ranges to be consolidated.
2) Select Data > Consolidate on the Menu bar to open the Consolidate dialog (Figure 1).
3) Click on the Source data ranges field, then type a reference to a source data range, a named range, or select it with the mouse. Use the associated Shrink / Expand button if you need to minimize the dialog while you select the range. Alternatively, select a named range from the drop-down list to the left of the field.
4) Click Add. The selected range is added to the Consolidation ranges list.
5) Repeat steps 3 and 4 to add additional source ranges.
6) To delete an entry in the Consolidation ranges list, select it and click Delete. The deletion is carried out without further confirmation.
7) Click on the Copy results to field, then type a reference to the first cell of the target range or select it with your mouse. You can also select a named range in the drop-down list to the left of the field.
8) Select a function to aggregate your data in the Function drop-down list. The default is Sum. Other available functions are Count, Average, Max, Min, Product, Count (numbers only), StdDev (sample), StDevP (population), Var (sample), and VarP (population).
9) Click OK to consolidate the ranges. Calc runs the function from step 8 on your source data ranges and populates the target range with the results.
Tip
If you are consolidating the same cell ranges repeatedly, consider converting them into reusable named ranges to make the process easier. For more information about named ranges, see Chapter 13, Calc as a Database.
Figure 1: Consolidate dialog
In the Consolidate dialog, expand the Options section to access the settings shown in Figure 2.
Figure 2: Consolidate dialog – Options section
Consolidate by
Row labels – Consolidates rows by matching label. If this option is unchecked, the tool will consolidate rows by position instead.
Column labels – Works the same as Row labels, but with columns instead.
Options
Note
If you use the Link to source data option, each source link is inserted into the target range, then ordered and hidden from view. Only the final results of consolidation are displayed by default.
Figures Figure 3, Figure 4, and Figure 5 show a simple example of consolidation using a spreadsheet with the sheets Year 1, Year 2, and Consolidated Sales. Figure 3 shows the contents of the Year 1 sheet, with sales figures by region for each of the four product colors.
Figure 3: Year 1 sales by region
Figure 4 shows the Year 2 sheet, sales figures by region for each of four product colors. Note the different ordering of row and column labels between the two figures.
Figure 4: Year 2 sales by region
Figure 5 shows the consolidated sales data, created by using the Consolidate dialog settings shown in Figure 2. Note that because the Link to source data option was selected, clicking on the plus sign (+) indicators to the left of the data will reveal formula links back to the source ranges.
The source ranges and target range are saved as part of the document. If you later open a document with consolidated ranges, they will still be available in the Consolidation ranges list of the Consolidate dialog.
Figure 5: Consolidated sales by region
Calc offers two methods of creating subtotals: the SUBTOTAL function and the Subtotals tool.
The SUBTOTAL function is listed under the Mathematical category of the Function Wizard, and the Functions deck of the Sidebar, which are described in Chapter 7, Using Formulas and Functions. SUBTOTAL is a relatively limited method for generating a subtotal, and works best if used with only a few categories.
To illustrate how to use the SUBTOTAL function, we use the sales data sheet shown in Figure 6. The AutoFilter function is already applied to the sales data, as indicated by the down arrow buttons at the head of each column. AutoFilters are described in Chapter 2, Entering, Editing, and Formatting Data.
Figure 6: Sales data with AutoFilter applied (only the first few rows are shown)
To create a summation subtotal for the Sales Value field using the Function Wizard:
1) Select the cell to contain a subtotal. Typically, this cell is at the bottom of the column being subtotaled, which, for our example, is the Sales Value column.
2) Use one of the following methods to open the Function Wizard dialog (Figure 7):
Select Insert > Function on the Menu bar
Click the Function Wizard icon on the Formula bar
Press Ctrl+F2
Figure 7: Function Wizard dialog
3) Select SUBTOTAL in the Function list of the Function Wizard dialog and click Next.
4) Enter the numeric code of a function into the Function field on the right side of the dialog. This code must be a value in the range 1 to 11, or 101 to 111, with the meaning of each value shown in Table 1.
Note
Values 1 to 11 include hidden values in the calculated subtotal, while values 101 to 111 do not. Hiding and showing data is described in Chapter 2, Entering, Editing, and Formatting Data. Filtered-out cells are always excluded by the SUBTOTAL function.
Table 1: SUBTOTAL function numbers
Function index |
Function index |
Function |
1 |
101 |
AVERAGE |
2 |
102 |
COUNT |
3 |
103 |
COUNTA |
4 |
104 |
MAX |
5 |
105 |
MIN |
6 |
106 |
PRODUCT |
7 |
107 |
STDEV |
8 |
108 |
STDEVP |
9 |
109 |
SUM |
10 |
110 |
VAR |
11 |
111 |
VARP |
5) Click on the Range field, then type a reference to the Sales Value range or select the cells with your mouse (Figure 7). Use the Shrink / Expand button if you need to temporarily minimize the dialog while selecting the cells.
6) Click OK to close the Function Wizard dialog. The cell you selected in step 1 now contains the total sales value.
7) Click on the AutoFilter down arrow button at the top of the Employee column and remove all marks from the Standard Filter area except those next to Brigitte and (empty). The cell that you selected in step 1 should now reflect the sum of all of Brigitte’s sales (Figure 8).
Note
If the cell range used to calculate a subtotal contains other subtotals, these subtotals will not be counted in the final one. Similarly, if you use this function with AutoFilters, only the data satisfying the current filter selections will be displayed. Any filtered-out data is ignored.
Figure 8: SUBTOTAL result for Brigitte’s sales
Calc offers the Subtotals tool as a more comprehensive alternative to the SUBTOTAL function. In contrast to SUBTOTAL, which only works on a single array, the Subtotals tool can create subtotals for up to three arrays arranged in labeled columns. It also groups subtotals by category and sorts them automatically, thereby eliminating the need to apply AutoFilters and filter categories by hand.
To insert subtotal values into a sheet:
1) Select the cell range for the subtotals that you want to calculate, and remember to include the column heading labels. Alternatively, click on a single cell within your data to allow Calc to automatically identify the range.
2) Select Data > Subtotals on the Menu bar to open the Subtotals dialog (Figure 9).
3) In the Group by drop-down list on the 1st Group tab, select a column by its label. Entries in the cell range from step 1 will be grouped and sorted by matching values in this column.
4) In the Calculate subtotals for box on the 1st Group tab, select a column containing values to be subtotaled. If you later change values in this column, Calc will automatically recalculate the subtotals.
5) In the Use function box on the 1st Group tab, select a function to calculate the subtotals for the column selected in step 4.
6) Repeat steps 4 and 5 to create subtotals for other columns on the 1st Group tab.
7) You can create two more subtotal categories by using the 2nd Group and 3rd Group tabs and repeating steps 3 to 6. If you do not want to add more groups, then leave the Group by list for each page set to “- none -”.
8) Click OK. Calc will add subtotal and grand total rows to your cell range.
Figure 9: Subtotals dialog
For our sales data example, a partial view of the results is shown in Figure 10. The group settings are identified in Table 2.
Table 2: Group settings used on Subtotals dialog for example sales data
Tab |
Group by |
Calculate subtotals for |
Use function |
1st Group |
Employee |
Sales Value |
Sum |
2nd Group |
Category |
Sales Value |
Sum |
3rd Group |
- none - |
- |
- |
When you use the Subtotals tool, Calc inserts an outline to the left of the row number column. This outline represents the hierarchical structure of your subtotals, and can be used to hide or show data at different levels in the hierarchy using the numbered column indicators at the top of the outline or the group indicators, denoted by plus (+) and minus (-) signs.
This feature is useful if you have many subtotals, as you can simply hide low-level details, such as individual entries, to produce a high-level summary of your data. For more information on how to use outlines, see Chapter 2, Entering, Editing, and Formatting Data.
To turn off outlines, select Data > Group and Outline > Remove Outline on the Menu bar. To reinstate them, select Data > Group and Outline > AutoOutline.
Figure 10 shows the outline for our sales data example.
Figure 10: Partial outlined view of sales data example with subtotals
Column 1 represents the highest group level, the grand total over all employees. Columns 2 to 5 show descending group levels as follows:
Column 2 represents the grand total over all categories.
Column 3 represents the total for each employee.
Column 4 represents the total for each category for an individual employee.
Column 5 shows individual entries.
Click on the Options tab of the Subtotals dialog to access the following settings:
Groups
Page break between groups – inserts page breaks between each subtotal group so that each group displays on a separate page when you print the data.
Case sensitive – prevents the tool from grouping entries by data labels that differ by case. In our sales data example, entries with “Brigitte” and “brigitte” under the Employee column will not match if this option is selected.
Pre-sort area according to groups – sorts entries by group before calculating subtotals. Disabling this option prevents the tool from grouping matching entries together. As a result, distinct subtotals will be created for matching entries if they do not appear on consecutive rows. For example, two entries under the “Golf” category will not count towards the same group subtotal if there is an entry for “Tennis” in between them.
Figure 11: Options tab of the Subtotals dialog
Sort
Ascending or Descending – sorts entries by value from lowest to highest and highest to lowest, respectively. You can modify these sort rules by using Data > Sort on the Menu bar. For more detail, see Chapter 2, Entering, Editing, and Formatting Data.
Include formats – carries over formatting, such as the currency format, from the data to the corresponding subtotals.
Custom sort order – sorts your data according to one of the predefined custom sorts defined in Tools > Options > LibreOffice Calc > Sort Lists on the Menu bar. For more details about custom sort lists, see Chapter 2, Entering, Editing, and Formatting Data.
In the Subtotals dialog, use the Reset button to undo any changes made on the current tab. Use the Remove button to remove any subtotals that have already been created using the Subtotals tool. Use these features with care, as no confirmation dialogs will be displayed.
Scenarios are saved, named cell ranges that you can use to answer “what-if” questions about your data. You can create multiple scenarios for the same calculation set, then quickly swap between them to view the outcomes of each. This feature is useful if you need to test the effects of different conditions on your calculations, but do not want to deal with repetitive manual data entry. For example, if you wanted to test different interest rates for an investment, you could create scenarios for each rate, then switch between them to find out which rates work the best for you.
To create a new scenario:
1) Select the cells that contain the values that will change between scenarios. To select multiple ranges, hold down the Ctrl key as you click. You must select at least two cells.
2) Choose Tools > Scenarios on the Menu bar to open the Create Scenario dialog (Figure 12).
Figure 12: Create Scenario dialog
3) Enter a name for the new scenario in the Name of Scenario field.
Tip
For each scenario you create, use a unique name that clearly identifies and distinguishes it. This practice will save you time and headache if you have to work with a large set of scenarios. We do not recommend using the default name suggested by Calc.
4) Optionally, add information to the Comment field. The example in Figure 12 shows the default comment.
5) Click OK to close the dialog. The new scenario is automatically activated upon creation.
6) Repeat steps 1 to 5 to create additional scenarios. Select the same cell range that you used for the first scenario to have multiple scenarios for the same calculations.
Tip
To keep track of what calculations are dependent on your scenarios, use Tools > Detective > Trace Dependents on the Menu bar after highlighting your scenario cells. Arrows will point from your scenario cells to the dependent formula cells. For more information about the Detective tool, see Chapter 7, Using Formulas and Functions.
The Settings section of the Create Scenario dialog contains the following options:
Figure 13: Create Scenario dialog – Settings section
Display border
Figure 14: Scenario cell range with border
Copy back
Caution
When creating a new scenario from the cells of a scenario with Copy back enabled, be careful not to overwrite the old scenario. To avoid this situation, create the new scenario with Copy back enabled first, then change its values only once it is active.
Copy entire sheet
Prevent changes
Scenarios have two aspects that can be altered independently:
Scenario properties (that is, its settings)
Scenario cell values
The extent to which these aspects can be changed depends on the active scenario's properties and the current sheet and cell protections. For more detail about sheet and cell protections, see Chapter 2, Entering, Editing, and Formatting Data.
Table 3 summarizes how sheet protection and the Prevent changes option affect your ability to change scenario properties.
Table 3: Changing scenario properties
Sheet protection |
Prevent changes |
Property changes |
On |
On |
No scenario properties can be changed. |
On |
Off |
Display border and Copy back can be changed. Prevent changes and Copy entire sheet cannot be changed. |
Off |
Any setting |
All scenario parameters except for Copy entire sheet can be changed. In this case, the Prevent changes option has no effect. |
Table 4 summarizes the interaction of various settings in making changes to scenario cell values.
Table 4: Changing scenario cell values
Sheet protection |
Scenario cell protection |
Prevent changes |
Copy back |
Change allowed |
On |
Off |
On |
On |
Scenario cell values cannot be changed. |
On |
Off |
Off |
On |
Scenario cell values can be changed, and the scenario is updated. |
On |
Off |
Any setting |
Off |
Scenario cell values can be changed, but the scenario is not updated due to the Copy back setting. |
On |
On |
Any setting |
Any setting |
Scenario cell values cannot be changed. |
Off |
Any setting |
Any setting |
Any setting |
Scenario cell values can be changed and the scenario is updated or not, depending on the Copy back setting. |
After you add scenarios to a spreadsheet, you can view a particular scenario by using the Navigator. Open it by selecting View > Navigator on the Menu bar, then click on the Scenarios icon in the Navigator and select a scenario in the list (Figure 15). Any defined scenarios are listed along with the comments that were entered when each scenario was created. You can also use the equivalent facilities in the Navigator deck of the Sidebar. For more information about the Navigator, see Chapter 1, Introduction.
To apply a scenario to the current sheet, double-click the scenario name in the Navigator.
To delete a scenario, right-click the name in the Navigator and choose Delete, or press Delete after selecting it. A confirmation dialog will be displayed.
To edit a scenario, right-click the name in the Navigator and choose Properties. Calc displays the Edit Scenario dialog, which is similar to the Create Scenario dialog (Figure 12).
Figure 15: Scenarios in the Navigator
Like scenarios, the Multiple Operations tool performs what-if analysis on your calculations. Unlike scenarios, which represent individual sets of values for multiple formula variables, this tool uses a whole range of values for just one or two variables. It then uses one or more formulas to create a matching range of solutions. Since each solution corresponds to one or two variable values, both the variable and solution ranges can be easily arrayed in tabular format. As a result, the Multiple Operations tool is well-suited to generating data that is easy to read and share or visualize using graphs.
Tip
Exercising good organization can make using this tool relatively painless. For example, we recommend keeping your data together on one sheet and using labels to identify your formulas, variables, and table ranges.
The easiest way to learn how to use the Multiple Operations tool is with one formula and one variable. To learn how to use the tool with multiple formulas or with two variables, see “Calculating with several formulas simultaneously” below and “Multiple operations with two variables” below, respectively.
To use the Multiple Operations tool with one formula and one variable:
1) In the cells of a worksheet, enter a formula and at least one variable that it uses.
2) In the same worksheet, enter values into a cell range that occupies a single column or row. These values will be used for one of the variables of the formula that you defined in step 1.
3) With the mouse, select the range containing both the variable range that you defined in step 2 and the adjacent empty cells that follow it. Depending on how your variable range is arrayed, these empty cells will either be in the column to the right (if the range is in a column) or in the row immediately below (if it is in a row).
4) Select Data > Multiple Operations on the Menu bar to open the Multiple Operations dialog (Figure 16).
Figure 16: Multiple Operations dialog
5) Click on the Formulas field and type a cell reference to the formula you defined in step 1 or select the cell with the mouse. Use the associated Shrink / Expand button if you need to minimize the dialog while selecting the cell.
6) If the range from step 2 is arrayed in a column, then click on the Column input cell field and type a cell reference to the variable that you want to use or select the cell with the mouse. If the range is in a row, then use the Row input cell field instead.
7) Click OK to run the tool. The Multiple Operations tool will generate its results in the empty cells that you selected in step 3. Each result value corresponds to the variable value adjacent to it, and together they form the entries of a results table.
Using the Multiple Operations tool is best explained by example. Suppose that you produce toys that you sell for $10 each (cell B1 of a worksheet). Each toy costs $2 to make (B2), and you have a fixed annual cost of $10,000 (B3). What is the minimum number of toys that you must sell to break even? Suppose that our initial estimate of quantity sold is 2,000 (B4).
To answer this question:
1) Enter the following formula into B5: =B4*(B1-B2)-B3. This formula represents the equation Profit = Quantity * (Selling price – Direct costs) – Fixed costs. With this equation, our initial quantity produces a $6,000 profit, which is higher than the break-even point.
2) In D2:D11, enter a range of alternate quantities from 500 to 5000 in steps of 500.
3) Select the range D2:E11 to define the results table. This range includes the alternate quantity values (column D) and the empty results cells (column E).
4) Select Data > Multiple Operations on the Menu bar to open the Multiple Operations dialog.
5) Using the Formulas field, select the cell B5.
6) Using the Column input cell field, select the cell B4 to set the quantity as the variable for our calculations. Figure 17 shows the worksheet and Multiple Operations dialog at this point.
Figure 17: Inputs to Multiple Operations tool for one formula, one variable
7) Click OK. The profits for the different quantities are now shown in column E (Figure 18). We can see that the break-even point is between 1000 and 1500 toys sold – namely, 1250. Figure 18 shows an XY (Scatter) chart showing the profit as a function of quantity.
Figure 18: Results of Multiple Operations tool for one formula and one variable
Figure 19: XY (Scatter) plot of profit over quantity of toys sold (example of visualization)
Using the Multiple Operations tool with multiple formulas follows nearly the same process as with one formula, but with two important differences:
1) For each formula that you add, you must also add a corresponding column or row to the results table to contain the output of that formula.
2) How you initially arrange your formulas determines how their results will be displayed in the results table. For example, if you arrange the formulas A, B, and C in a single row in that order, then Calc will generate the results of A in the first results table column, the results of B in the second column, and the results of C in the third.
Note
The Multiple Operations tool only accepts formulas arranged in a single row or column, depending on how your results table is oriented. If the table is column-oriented – that is, the way it is in our sales data example – then your formulas must be arranged in a row. If the table is row-oriented, then your formulas must be in a column.
Caution
Be careful not to add empty cells between formulas, as they will create gaps in the results table and may cause some results not to appear if you don't select enough rows or columns for the table.
Using our sales data example, suppose that we want to calculate the annual profit per item sold in addition to the annual overall profit. To calculate the results:
1) In the sheet from the previous example, delete the results in column E.
2) Enter the following formula in C5: =B5/B4. You are now calculating the annual profit per item sold.
3) Select the range D2:F11 for the results table. Column F will contain the results of the annual profit per item formula in C5.
4) Select Data > Multiple Operations on the Menu bar to open the Multiple Operations dialog.
5) Using the Formulas field, select the range B5:C5.
6) Using the Column input cell field, select the cell B4. Figure 20 shows the worksheet and the dialog at this point.
Figure 20: Inputs to Multiple Operations tool for one variable and two formulas
7) Click OK. Now the profits are listed in column E and the annual profit per item in column F.
Figure 21: Results of Multiple Operations tool for one variable and two formulas
When you use the Multiple Operations tool with two variables, it creates a two-dimensional results table. Each variable defines one of the table’s dimensions such that the alternate values for both variables serve as the table’s row and column headings, respectively. Each cell in the table corresponds to a distinct pair of row and column heading values. In turn, the results in each cell are created from these values for both variables.
Since you are using two variables, you must use both the Column input cell and Row input cell dialog fields to define them. The order is important; the Column input cell field corresponds to the row heading values, while the Row input cell field corresponds to the column heading values.
Tip
A good guideline to remember is that since column headings are in a row at the top of the table, they correspond to the Row input cell field. Likewise, row headings are in a column, so they correspond to the Column input cell field.
Note
If you use two variables, the Multiple Operations tool will not work with multiple formulas. It will allow you to enter the extra formulas, but will not generate the expected results for any formula beyond the first.
Using our sales example, suppose that in addition to varying the quantity of toys sold, you also want to vary the unit sale price as well. To calculate the results:
1) Expand the sales data table by entering $8, $10, $15 and $20 in the range E1:H1.
2) Select the range D1:H11 for the results table.
3) Select Data > Multiple Operations on the Menu bar to open the Multiple Operations dialog.
4) Using the Formulas field, select cell B5.
5) Using the Row input cell field, select cell B1. The column headings – $8, $10, $15 and $20 – are now linked to the unit sale price variable defined in cell B1.
6) Using the Column input cell field, select cell B4. The row headings – 500, 1000, ... , 5000 – are now linked to the quantity sold variable defined in cell B4. Figure 22 shows the worksheet and dialog at this point.
7) Click OK. The profits for the different sale prices and quantities are now shown in the range E2:H11 (Figure 22).
Figure 22: Inputs to Multiple Operations tool for two variables
Figure 23: Results of Multiple Operations tool for two variables
In addition to scenarios and the Multiple Operations tool, Calc has a third “what-if” analysis tool: Goal Seek. Usually, you use a formula to calculate a result from existing values. In contrast, with Goal Seek, you work backwards from a result to discover what values produce it. This feature is useful if you already know the outcome you want, but need to answer questions such as how to reach it or how it could be changed if you altered conditions.
Note
Only one argument can be altered at a time in a single goal seek. If you need to test multiple arguments, then you must run a separate goal seek on each one.
To illustrate how to use Goal Seek, suppose that we want to calculate the annual interest return for an account. To calculate annual interest (I), we must create a table with values for the capital (C), the interest period length in years (n), and the interest rate (i). The formula is I = C*n*i.
Suppose that the interest rate i = 7.5% (cell B3 of a worksheet) and the period length n = 1 (B2) remain constant. We want to know how much investment capital C is needed to achieve a return of I = $15,000. Assume that our initial capital estimate is C = $100,000 (B1).
To calculate the return:
1) Enter the return formula (=B1*B2*B3) into B4 and select the cell with the mouse.
2) Select Tools > Goal Seek on the Menu bar to open the Goal Seek dialog (Figure 24).
Figure 24: Goal Seek dialog
3) B4 should already be entered in the Formula cell field. However, if you want to select a different cell, use the associated Shrink / Expand button to minimize the dialog while you select the required cell.
4) Click on the Variable cell field, then type a reference to cell B1 or select it with the mouse to make the capital the variable in the current Goal Seek.
5) Enter the desired formula result in the Target value field. In this example, the value is 15000. Figure 25 shows the cells and dialog fields at this point.
Figure 25: Example setup for goal seek
6) Click OK. A dialog appears informing you that the goal seek was successful (Figure 26).
Figure 26: Goal seek result dialog
7) Click Yes to enter the goal value into the variable cell. The result is shown in Figure 27, indicating that a capital requirement of $200,000 is needed to achieve a $15,000 return.
Figure 27: Result of goal seek in worksheet
Note
Not every goal seek problem succeeds in returning a good result. It depends on the formula used, goal value, and initial value. The goal seek algorithm iterates internally several times converging to the goal.
If the goal seek is unsuccessful, Calc displays an information dialog reporting the failure. This dialog offers the choice of inserting the closest value into the variable cell. Press Yes or No as required.
The Solver amounts to a more elaborate form of goal seek, which allows you to solve mathematical programming or optimization problems. A mathematical programming problem is concerned with minimizing or maximizing a function subject to a set of constraints. Such problems arise in many scientific, engineering, business, and other disciplines. A full discussion of mathematical programming is beyond the scope of this guide and the interested reader is referred to the relevant Wikipedia page at https://en.wikipedia.org/wiki/Mathematical_optimization, which provides high level information and references to more detailed materials.
Currently Calc offers the following selection of solver engines:
DEPS (Differential Evolution & Particle Swarm Optimization) Evolutionary Algorithm.
SCO (Social Cognitive Optimization) Evolutionary Algorithm.
LibreOffice CoinMP Linear Solver.
LibreOffice Linear Solver.
LibreOffice Swarm Non-Linear Solver (experimental).
Caution
As the LibreOffice Swarm Non-Linear Solver is an experimental tool, it may not be supported in future versions of Calc, and we recommend that you do not use it unless you are familiar with non-linear programming concepts.
The DEPS and SCO Evolutionary Algorithms are intended for solving non-linear problems - they are only available if you have a Java runtime environment installed on your computer and have enabled the Tools > Options > LibreOffice > Advanced > Use a Java runtime environment configuration option. The DEPS Evolutionary Algorithm is the initial default if it is available, while LibreOffice CoinMP Linear Solver is the initial default otherwise.
The available options provide flexibility to choose the most suitable algorithm for a given problem, which may be linear or non-linear, and a given performance requirement. The Help system contains much more information about the available algorithms and their configuration options.
In order to use the Solver to solve a mathematical programming problem, you must formulate the problem as follows:
Decision variables – a set of n non-negative variables x1, … , xn,. Decision variables may be real numbers, but generally tend to be integers in many real world problems.
Constraints – a set of linear equalities or inequalities involving the decision variables.
Objective function – a linear expression involving the decision variables.
The goal is usually to find values of the decision variables that satisfy the constraints and maximize or minimize the result of the objective function.
After setting up the data for the problem in your Calc spreadsheet, select Tools > Solver on the Menu bar to open the Solver dialog (Figure 28).
Note
Depending on the configuration of your computer, a message may be displayed the first time that you select Tools > Solver after starting Calc. The nature of this message will change dependent on the existence of a Java runtime environment (JRE) on your system. If no JRE is detected, the message will simply be a warning to that effect. In the case where a JRE is detected but the Tools > Options > LibreOffice > Advanced > Use a Java runtime environment option is disabled, then the message will include a button to enable that option.
Target cell
Optimize result to
By changing cells
Limiting Conditions
Cell reference – enter a cell reference to a decision variable.
Operator – defines a parameter for a constraint. Available options include <= (less than or equal to), = (equal to), => (greater than or equal to), Integer (values without decimals), and Binary (only 0 or 1).
Value – enter a value or a cell reference to a constraint formula.
Remove button – deletes the currently-defined constraint.
Figure 28: Solver dialog
Tip
Remember that for some of these options, you can minimize the Solver dialog using the associated Shrink / Expand buttons if you need to select cells with the mouse.
Once you have finished setting up the Solver, click the Solve button to begin the process of adjusting values and calculating results. Depending on the complexity of the task, this may take some time. If you want to start again, click the Reset All button and the data inserted in the Solver dialog (Figure 28) will be cleared.
On successful completion, Calc presents a Solving Result dialog (Figure 29). This dialog includes buttons to save (Keep Result) or discard (Restore Previous) your results.
Figure 29: Solving Result dialog
The Solver dialog also has an Options button, which opens the Options dialog shown in Figure 30.
Figure 30: Solver Options dialog
Solver engine
DEPS Evolutionary Algorithm
SCO Evolutionary Algorithm
LibreOffice CoinMP Linear Solver
LibreOffice Linear Solver
LibreOffice Swarm Non-Linear Solver (experimental)
Settings
Suppose that you have $10,000 that you want to invest in two mutual funds for one year. Fund X is a low risk fund with an 8% interest rate and Fund Y is a higher risk fund with a 12% interest rate. How much money should be invested in each fund to earn a total interest of $1,000?
To find the answer using the Solver:
1) Enter the following labels and data into a worksheet:
Row labels: Fund X, Fund Y, and Total in cells A2, A3, and A4.
Column labels: Interest earned, Amount invested, Interest rate, and Time period in cells B1 thru E1.
Interest rates: 8% and 12% in cells D2 and D3.
Time period: 1 in cells E2 and E3.
Total amount invested: $10000 in cell C4.
Enter an arbitrary value ($0 or leave blank) in cell C2 as the amount invested in Fund X.
2) Enter the following formulas:
In cell C3, enter the formula =C4–C2 (total amount – amount invested in Fund X) as the amount invested in Fund Y.
In cells B2 and B3, enter the formulas =C2*D2*E2 (B2) and =C3*D3*E3 (B3).
In cell B4, enter the formula =B2+B3 as the total interest earned. Figure 31 shows the worksheet at this point.
Figure 31: Solver example setup
3) Select Tools > Solver on the Menu bar to open the Solver dialog (Figure 28).
4) Using the Target cell field, select the cell that contains the target value. In this example, it is B4, which contains the total interest value.
5) Select Value of and enter 1000 in the field next to it. In this example, the target cell value is 1000 because your target is a total interest earned of $1,000.
6) Using the By changing cells field, select cell C2 in the sheet. In this example, you need to find the amount invested in Fund X (cell C2).
7) Enter the following limiting conditions for the variables by using the Cell reference, Operator, and Value fields:
C2 <= C4 – the amount invested in Fund X cannot exceed the total amount available.
C2 => 0 – the amount invested in Fund X cannot be negative.
C2 is an Integer – specified for convenience.
8) Click Solve. The result is shown in Figure 32.
Figure 32: Solver example result
Under Data > Statistics on the Menu bar, Calc features several tools for quick and easy statistical analysis of your data. These tools include:
Sampling
Descriptive Statistics
Analysis of Variance (ANOVA)
Correlation
Covariance
Exponential Smoothing
Moving Average
Regression
Paired t-test
F-test
Z-test
Test of Independence (Chi-square)
Fourier Analysis
The Sampling tool creates a target table with data sampled from a source table. The sampling tool can pick samples randomly or on a periodic basis. Sampling is done row-wise, with whole rows of the source table copied into rows of the target table. To use this tool, select Data > Statistics > Sampling on the Menu bar to access the Sampling dialog (Figure 33).
Figure 33: Sampling dialog
Input range
Results to
Random
Sample size
With replacement
Keep order
Periodic
Period
Tip
Use the Shrink / Expand buttons next to the Input range and Results to fields if you need to shrink the dialog while selecting cells with the mouse.
Figure 34 shows the source table (below the Source Data heading) and the corresponding target table (under the Target Data heading), sampled using the settings shown in Figure 33.
Figure 34: Example data for the Sampling tool
Given a set of data, the Descriptive Statistics tool creates a tabular report of the data set’s primary statistical properties, such as information about its central tendency and variability. Select Data > Statistics > Descriptive Statistics on the Menu bar to access the Descriptive Statistics dialog (Figure 35).
Figure 35: Descriptive Statistics dialog
Input range
Results to
Columns / Rows
Tip
Use the Shrink / Expand buttons next to the Input range and Results to fields if you need to shrink the dialog while selecting cells with the mouse.
Figure 36 shows a small data set comprising student examination results in three subjects.
Figure 36: Input data for descriptive statistics analysis
Figure 37 shows the statistics report generated for this input data using the settings shown in Figure 35.
Figure 37: Results from Descriptive Statistics tool
Tip
For more information on descriptive statistics, refer to the corresponding Wikipedia article at https://en.wikipedia.org/wiki/Descriptive_statistics.
The Analysis of Variance (ANOVA) tool compares the means of two or more groups in a sample. Select Data > Statistics > Analysis of Variance (ANOVA) on the Menu bar to access the Analysis of Variance (ANOVA) dialog (Figure 38).
Figure 38: Analysis of Variance (ANOVA) dialog
Input range
Results to
Single / Two factor
Columns / Rows
Alpha
Rows per sample
Tip
Use the Shrink / Expand buttons next to the Input range and Results to fields if you need to shrink the dialog while selecting cells with the mouse.
To illustrate how to use this tool, we use the input data set from Figure 36. Figure 39 shows the analysis of variance results generated for this data using the settings shown in Figure 38.
Figure 39: Results from Analysis of Variance (ANOVA) tool
Tip
For more information on analysis of variance, refer to the corresponding Wikipedia article at https://en.wikipedia.org/wiki/Analysis_of_variance.
The Correlation tool calculates the correlation of two sets of numeric data and generates the resulting correlation coefficient. This coefficient is a value between -1 and +1 that indicates how strongly two variables are related to each other. A correlation coefficient of +1 indicates a perfect positive correlation (the data sets match) and a coefficient of -1 indicates a perfect negative correlation (the data sets are inverse to each other). Select Data > Statistics > Correlation on the Menu bar to access the Correlation dialog (Figure 40).
Figure 40: Correlation dialog
Input range
Results to
Columns / Rows
Tip
Use the Shrink / Expand buttons next to the Input range and Results to fields if you need to shrink the dialog while selecting cells with the mouse.
To illustrate how to use this tool, we again use the data set from Figure 36. Figure 41 shows the six correlation coefficients generated for this input data using the settings shown in Figure 40.
Figure 41: Correlation results
Tip
For more information on statistical correlation, refer to the corresponding Wikipedia article at https://en.wikipedia.org/wiki/Correlation_and_dependence.
The Covariance tool measures how much two sets of numeric data vary together. Select Data > Statistics > Covariance on the Menu bar to access the Covariance dialog (Figure 42).
Figure 42: Covariance dialog
Input range
Results to
Columns / Rows
Tip
Use the Shrink / Expand buttons next to the Input range and Results to fields if you need to shrink the dialog while selecting cells with the mouse.
To illustrate how to use this tool, we again use the data set from Figure 36. Figure 43 shows the six covariance values generated for this input data using the settings shown in Figure 42.
Figure 43: Covariance results
Tip
For more information on statistical covariance, refer to the corresponding Wikipedia article at https://en.wikipedia.org/wiki/Covariance.
The Exponential Smoothing tool filters a data set to produce smoothed results. It is used in domains such as stock market analysis and in sampled measurements. Select Data > Statistics > Exponential Smoothing on the Menu bar to access the Exponential Smoothing dialog (Figure 44).
Figure 44: Exponential Smoothing dialog
Input range
Results to
Columns / Rows
Smoothing factor
Tip
Use the Shrink / Expand buttons next to the Input range and Results to fields if you need to shrink the dialog while selecting cells with the mouse.
To illustrate how to use this tool, we use the data set shown in Figure 45. The table has two time series representing impulse functions at times t=0 and t=2.
Figure 45: Input data set for exponential smoothing example
Figure 46 shows the smoothed results for this input data using the settings shown in Figure 44.
Figure 46: Results from Exponential Smoothing tool
Tip
For more information on exponential smoothing, refer to the corresponding Wikipedia article at https://en.wikipedia.org/wiki/Exponential_smoothing.
The Moving Average tool calculates the moving average of a time series data set. Select Data > Statistics > Moving Average on the Menu bar to access the Moving Average dialog (Figure 47).
Figure 47: Moving Average dialog
Input range
Results to
Trim input range to actual data content
Columns / Rows
Interval
Tip
Use the Shrink / Expand buttons next to the Input range and Results to fields if you need to shrink the dialog while selecting cells with the mouse.
To illustrate how to use this tool, we again use the data set from Figure 45. Figure 48 shows the moving averages calculated for this input data using the settings shown in Figure 47.
Figure 48: Calculated moving averages
Tip
For more information on the moving average, refer to the corresponding Wikipedia article at https://en.wikipedia.org/wiki/Moving_average.
The Regression tool analyzes the relationship in a data set between one or more independent variables and a dependent variable. Select Data > Statistics > Regression on the Menu bar to access the Regression dialog (Figure 49).
Independent variable(s) (X) range
Dependent variable (Y) range
Both X and Y ranges have labels
Results to
Columns / Rows
Linear Regression
Logarithmic Regression
Power Regression
Confidence level
Calculate residuals
Force intercept to be zero
Figure 49: Regression dialog
Tip
Use the Shrink / Expand buttons next to the Independent variable(s) (X) range, Dependent variable (Y) range, and Results to fields if you need to shrink the dialog while selecting cells with the mouse.
Tip
Calc utilizes the small, otherwise blank area above the Help, OK, and Cancel buttons to provide feedback on erroneous selections on the dialog. For example, the text “Independent variable(s) range is not valid.” appears if you have not entered a valid cell range in the Independent variable(s) (X) range field, and in this circumstance the OK button is grayed.
To illustrate how to use this tool, we use the data set shown in Figure 50. This table contains measurements taken at 1 second intervals.
Figure 50: Input data set for regression analysis
Figure 51 shows the regression outputs calculated for this input data using the settings shown in Figure 49.
Tip
For more information on regression analysis, refer to the corresponding Wikipedia article at https://en.wikipedia.org/wiki/Regression_analysis.
Figure 51: Linear regression outputs
The Paired t-test tool compares the population means of two related sample sets and determines the difference between them. Select Data > Statistics > Paired t-test on the Menu bar to access the Paired t-test dialog (Figure 52).
Variable 1 range
Variable 2 range
Results to
Columns / Rows
Tip
Use the Shrink / Expand buttons next to the Variable 1 range, Variable 2 range, and Results to fields if you need to shrink the dialog while selecting cells with the mouse.
Figure 52: Paired t-test dialog
To provide an example of using this tool, we make use of the input data set shown in Figure 53. The data sets in columns A and B represent two sets of paired values referred to as Variable 1 and Variable 2.
Figure 53: Input data for paired t-test example
Figure 54 shows the paired t-test results calculated for this input data using the settings shown in Figure 52.
It is possible to insert different values for Alpha and Hypothesized Mean Difference. The t values (Stat, Critical one-tail, and Critical two-tail) will be updated automatically.
Tip
For more information on paired t-tests, refer to the corresponding Wikipedia article at https://en.wikipedia.org/wiki/Student's_t-test.
Figure 54: Results from Paired t-test tool
The F-test tool calculates the F-test of two data samples. The tool is used to test the hypothesis that the variance of two populations are equal. Click Data > Statistics > F-test on the Menu bar to access the F-test dialog shown in Figure 55 and define the required inputs to the tool.
Figure 55: F-test dialog
Variable 1 range
Variable 2 range
Results to
Columns / Rows
Tip
Use the Shrink / Expand buttons next to the Variable 1 range, Variable 2 range, and Results to fields if you need to shrink the dialog while selecting cells with the mouse.
To illustrate how to use this tool, we again use the data set from Figure 53. In this case, the data in columns A and B represent two independent sample sets, referred to as Variable 1 and Variable 2. Figure 56 shows the F-test results calculated for this input data using the settings shown in Figure 55.
Figure 56: Results from F-test tool
It is possible to insert different values for Alpha. The F Critical values (right-tail, left-tail, and two-tail) will be updated automatically.
Tip
For more information on F-tests, refer to the corresponding Wikipedia article at https://en.wikipedia.org/wiki/F-test.
The Z-test tool calculates the Z-test of two data samples. The tool performs a two sample Z-test to test the null hypothesis that there is no difference between the means of the two data sets. The Z-test works better for large samples (n > 30); if you are using a small sample, the Paired t-test tool may be more appropriate. Click Data > Statistics > Z-test on the Menu bar to access the z-test dialog shown in Figure 57 and define the required inputs to the tool.
Variable 1 range
Variable 2 range
Results to
Columns / Rows
Tip
Use the Shrink / Expand buttons next to the Variable 1 range, Variable 2 range, and Results to fields if you need to shrink the dialog while selecting cells with the mouse.
Figure 57: z-test dialog
To provide an example of using this tool, we again make use of the input data set shown in Figure 53. In this case the data in columns A and B represent two data sets, referred to as Variable 1 and Variable 2. Figure 58 shows the Z-test results calculated for this input data using the settings shown in Figure 57.
Figure 58: Results from Z-test tool
For the Z-test tool to work properly, a known variance for each sample must be inserted in the related cell. In the example shown in Figure 58, the variances (125.076923 and 94.435897) were inserted using the formula =VAR(A1:A13) into cell E5 and the formula =VAR(B1:B13) into cell F5. The subsequent z and P values will be updated automatically.
It is also possible to insert different values for Alpha (cell E2 in the example) and Hypothesized Mean Difference (cell E3 in the example) inputs. As with the known variances changes described above, after changing the Alpha and the Hypothesized Mean Difference, the subsequent z and P values will be updated automatically.
Tip
When analyzing the Z-test results, compare the selected Alpha level with the appropriate calculated P value (depending whether a one-tailed or two-tailed test is required). If the calculated P value is smaller than the Alpha level, the hypothesis (which, in the example given, is that the means of the two data sets are the same) should be rejected.
Tip
For more information on z-tests, refer to the corresponding Wikipedia article at https://en.wikipedia.org/wiki/Z-test.
The Test of Independence (Chi-Square) tool calculates the chi-square test of a data sample, which determines how well a set of measured values fit a corresponding set of expected values. Select Data > Statistics > Chi-square Test on the Menu bar to access the Test of Independence (Chi-Square) dialog (Figure 59).
Input range
Results to
Columns / Rows
Tip
Use the Shrink / Expand buttons next to the Input range and Results to fields if you need to shrink the dialog while selecting cells with the mouse.
Figure 59: Test of Independence (Chi-Square) dialog
To provide an example of using this tool, we again make use of the input data set shown in Figure 53. In this case the data in column A is the observed data while the data in column B are the corresponding expected values. Figure 60 shows the chi-square results calculated for this input data using the settings shown in Figure 59.
Figure 60: Results of chi-square test
It is possible to insert different values for Alpha. The Critical Value will be updated automatically.
Tip
For more information on chi-square tests, refer to the corresponding Wikipedia article at https://en.wikipedia.org/wiki/Chi-squared_test.
The Fourier Analysis tool performs the Fourier analysis of a data set by computing the Discrete Fourier Transform (DFT) of an input array of complex numbers, using Fast Fourier Transform (FFT) algorithms. Select Data > Statistics > Fourier Analysis on the Menu bar to access the Fourier Analysis dialog (Figure 61).
Figure 61: Fourier Analysis dialog
Input range
Results to
Input range has label
Columns / Rows
Inverse
Output in polar form
Minimum magnitude for polar form output
Tip
Use the Shrink / Expand buttons next to the Input range and Results to fields if you need to shrink the dialog while selecting cells with the mouse.
Tip
Calc utilizes the small, otherwise blank area above the Help, OK, and Cancel buttons to provide feedback on erroneous selections on the dialog. For example, the text “Output address is not valid.” appears if you have not entered a valid cell range in the Results to field, and in this circumstance the OK button is grayed.
To provide an example of using this tool, we make use of the input data set shown in columns B (real values) and C (imaginary values) of the spreadsheet shown in Figure 62. The data shown in columns E (real values) and F (imaginary values) of the spreadsheet are the Fourier transform results calculated by the tool for this input data, using the settings shown in Figure 61.
Note
For those with a technical interest in the algorithms used by the Fourier Analysis tool, a radix-2 decimation-in-time FFT is used when the length of the input sequence is an even power of 2, while Bluestein’s FFT algorithm is used when the length of the input sequence is not an even power of 2.
Tip
For more information on Fourier analysis, refer to the corresponding Wikipedia article at https://en.wikipedia.org/wiki/Fourier_analysis.
Figure 62: Fourier analysis tool - example input data and results