Save world from EXCEL

  1. Digits
  2. Fix typos

Measure the severity of error.

No spec. Can’t use outlier analysis (no prior knowledge of the distribution).

What is important/wrong. Dependency. Abstraction.

Rank suspiciousness.

Unclear why the fixed number is better.

In a spreadsheet, which is the input? Which is the output?

The tool is general in the sense that it encodes general information about dependencies.

Bootstrap: stat, simulation. Impact analysis. Non-parametric technique.

Use resampling and observation to learn the dependency. Exclude some elements (ablation + stat). Numerical sensitivity. How sampling impacts the result (a prediction or whatever).
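A minimal sketch of the ablation-plus-resampling idea, assuming the output is a single aggregate (here the mean) over a list of input cells. The function names, the sample data, and the choice of the mean are illustrative assumptions, not the tool's actual method.

```python
import random
import statistics

def ablation_impact(values, f=statistics.mean):
    """Leave-one-out impact: how far f moves when each value is excluded."""
    baseline = f(values)
    return [abs(f(values[:i] + values[i + 1:]) - baseline)
            for i in range(len(values))]

def rank_by_impact(values, f=statistics.mean):
    """Indices of the inputs, most influential first."""
    impacts = ablation_impact(values, f)
    return sorted(range(len(values)), key=lambda i: impacts[i], reverse=True)

def bootstrap_interval(values, f=statistics.mean, n_boot=1000, seed=0):
    """Non-parametric 95% percentile interval of f under resampling."""
    rng = random.Random(seed)
    n = len(values)
    outputs = sorted(f([rng.choice(values) for _ in range(n)])
                     for _ in range(n_boot))
    return outputs[int(0.025 * n_boot)], outputs[int(0.975 * n_boot)]

# Hypothetical column: 120.0 looks like 12.0 with an extra digit.
cells = [12.0, 11.5, 12.3, 11.9, 120.0, 12.1]
ranking = rank_by_impact(cells)
low, high = bootstrap_interval(cells)
```

Here `ranking[0]` is the index of the cell whose removal moves the mean the most, and the width of `(low, high)` shows how sensitive the output is to which inputs happen to be sampled.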

Enough information for a human to make a simple judgment.

Root of the dependency is the observation.

Manual auditing, formula wrong. NLP?

Style: Story-telling.

Injecting errors on ground truth to evaluate a tool.
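A rough sketch of error injection for evaluation, assuming the ground truth is a list of positive integers and the injected error is a duplicated digit (one kind of digit typo). The detector here is a deliberately naive stand-in, and all names are hypothetical.

```python
import random
import statistics

def inject_digit_errors(values, rate=0.2, seed=0):
    """Return (corrupted copy, sorted indices of the corrupted cells)."""
    rng = random.Random(seed)
    n_errors = max(1, round(rate * len(values)))
    targets = sorted(rng.sample(range(len(values)), n_errors))
    corrupted = list(values)
    for i in targets:
        digits = str(corrupted[i])
        pos = rng.randrange(len(digits))
        # Duplicate one digit, e.g. 120 -> 1120, 1220, or 1200.
        corrupted[i] = int(digits[:pos] + digits[pos] + digits[pos:])
    return corrupted, targets

def precision_recall(flagged, injected):
    """Score a tool's flagged cells against the known injected errors."""
    flagged, injected = set(flagged), set(injected)
    hits = len(flagged & injected)
    precision = hits / len(flagged) if flagged else 0.0
    recall = hits / len(injected) if injected else 0.0
    return precision, recall

def naive_detector(values):
    """Stand-in for the tool under test: flag cells above 5x the median."""
    threshold = 5 * statistics.median(values)
    return [i for i, v in enumerate(values) if v > threshold]

truth = [120, 95, 130, 110, 105, 99, 125, 101, 117, 108]
corrupted, injected = inject_digit_errors(truth)
score = precision_recall(naive_detector(corrupted), injected)
```

Because the injected positions are known, precision and recall can be computed without any manual labeling.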

MTurk (Amazon Mechanical Turk).

Geo-spatial? Information-theoretic approach to look for errors and fixes.

You click on it, it colors your spreadsheet.

Not the same color, not homogeneous.

boxplot <-> median.

Distribution.

Global view <-> local view. Encode knowledge and ranking as color.
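One possible way to encode a ranking as color, sketched under the assumption that each cell already has a numeric suspiciousness score. The white-to-red scale and the function name are arbitrary illustrative choices.

```python
def score_to_hex(score, lo, hi):
    """Map a score in [lo, hi] to a color from white (low) to red (high)."""
    t = 0.0 if hi == lo else (score - lo) / (hi - lo)
    t = min(1.0, max(0.0, t))        # clamp out-of-range scores
    level = round(255 * (1 - t))     # green and blue channels fade out
    return f"#ff{level:02x}{level:02x}"

# Hypothetical scores for three cells.
scores = {"A1": 0.0, "A2": 0.5, "A3": 1.0}
colors = {cell: score_to_hex(s, 0.0, 1.0) for cell, s in scores.items()}
```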

Copy & Paste error in a more general sense. Why do people (including me) like copying?

Spatial programming in Excel. Rectangles.

Reference errors.

Context. Color things based on their context. Measure how surprising each one is. (Shannon, regularity, the entropy of the spreadsheet).
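A small sketch of "how surprising is it", assuming each cell has been reduced to some discrete description of its context (here a made-up string per formula). Surprisal is the standard Shannon quantity, -log2 p. The descriptions and names are illustrative only.

```python
import math
from collections import Counter

def surprisal(items):
    """Per-item surprise in bits: rare items score high."""
    counts = Counter(items)
    total = len(items)
    return [-math.log2(counts[x] / total) for x in items]

def entropy(items):
    """Shannon entropy in bits: 0 for a perfectly regular region."""
    counts = Counter(items)
    total = len(items)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical column: seven formulas sum three cells to the left, one sums two.
column = ["SUM(left 3)"] * 5 + ["SUM(left 2)"] + ["SUM(left 3)"] * 2
bits = surprisal(column)
most_surprising = max(range(len(column)), key=lambda i: bits[i])
```

The odd formula occurs once in eight cells, so its surprisal is log2(8) = 3 bits, while the regular cells score well under 1 bit each.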

Fingerprint vector. Something you can count. ENTROPY. https://en.wikipedia.org/wiki/Benford%27s_law
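As one example of a countable fingerprint, the sketch below builds the leading-digit frequency vector of a set of numbers and measures its distance from the Benford distribution, P(d) = log10(1 + 1/d). The helper names and the use of total variation distance are assumptions made for illustration. Benford's law only fits some kinds of data (typically values spanning several orders of magnitude), so a large distance is a hint, not proof of an error.

```python
import math
from collections import Counter

# Expected leading-digit frequencies under Benford's law.
BENFORD = [math.log10(1 + 1 / d) for d in range(1, 10)]

def leading_digit(x):
    """First nonzero digit of a nonzero number, read from its repr."""
    for ch in repr(abs(x)):
        if ch in "123456789":
            return int(ch)
    raise ValueError("zero has no leading digit")

def digit_fingerprint(values):
    """Vector of leading-digit frequencies for digits 1..9 (zeros skipped)."""
    counts = Counter(leading_digit(v) for v in values if v != 0)
    total = sum(counts.values())
    return [counts.get(d, 0) / total for d in range(1, 10)]

def benford_distance(values):
    """Total variation distance between the fingerprint and Benford."""
    fp = digit_fingerprint(values)
    return 0.5 * sum(abs(p - b) for p, b in zip(fp, BENFORD))

powers = [2 ** k for k in range(1, 101)]   # known to track Benford closely
uniform = list(range(100, 1000))           # every leading digit equally often
d_powers = benford_distance(powers)
d_uniform = benford_distance(uniform)
```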