The fundamental difference between Engineering and Data Science
I spent 4 years before starting Skyvern working at the intersection of Engineering and Data Science. I had the honour of building the Search systems for 2 companies: Faire and Gopuff.
I learned something very important in this journey: Data Science and Engineering are fundamentally different fields.
How? Engineering problems tend to be clearly defined, and so do their solutions. They are deterministic in nature, and by virtue of that determinism there is a solution that reliably solves the problem. Do you want to look some information up? It's clear how to do it. Do you want to run some computation? Also clear.
Data science problems are a little bit tricky. They are probabilistic in nature -- the opposite of deterministic.
This means that single solutions often don't exist. So you need to think about them in terms of probabilities, using tools such as confidence intervals or p-values, or represent the successes and failures of your solution with quantitative metrics such as F-scores or precision and recall.
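To make those metrics concrete, here is a minimal sketch of how precision, recall, and F1 could be computed from a set of labeled outcomes. The function name and the sample data are made up for illustration; they don't come from any system mentioned in this post.

```python
def precision_recall_f1(predicted, actual):
    """Compute precision, recall and F1 for two equal-length lists of booleans.

    predicted[i] is what the solution said, actual[i] is the ground truth.
    """
    pairs = list(zip(predicted, actual))
    true_pos = sum(1 for p, a in pairs if p and a)
    false_pos = sum(1 for p, a in pairs if p and not a)
    false_neg = sum(1 for p, a in pairs if a and not p)

    # Guard against division by zero when there are no positives at all.
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical example: did the solution flag each item as relevant?
predicted = [True, True, False, True, False]
actual = [True, False, False, True, True]
precision, recall, f1 = precision_recall_f1(predicted, actual)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

The point is that the answer is a set of numbers you track over time, not a single pass or fail.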
This distinction is especially important in the age of LLMs. Engineers are being exposed to really capable probabilistic models, and spend a lot of effort trying to wrangle them into doing a specific task, such as answering support tickets or browsing the web.
If you treat these problems as deterministic problems, you'll find yourself in edge-case hell, solving each issue one at a time.
But if you treat them like data science problems, you can run your solution N times and plot the outputs along a curve to understand what your solution must look like to be accurate enough.
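Here is a rough sketch of what "run it N times" can look like in practice. Everything in it is an assumption for illustration: `flaky_task` is a hypothetical stand-in for a real LLM call plus a correctness check, and the Wilson score interval is just one reasonable way to put error bars on a success rate.

```python
import math
import random


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a success rate."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


def run_eval(task, n, seed=0):
    """Run a non-deterministic task n times and summarize how often it succeeds."""
    rng = random.Random(seed)
    successes = sum(1 for _ in range(n) if task(rng))
    return successes, wilson_interval(successes, n)


def flaky_task(rng):
    # Hypothetical stand-in: replace with a real model call and a grader.
    # Here it simply "succeeds" about 80% of the time.
    return rng.random() < 0.8


if __name__ == "__main__":
    for n in (10, 100, 1000):
        successes, (low, high) = run_eval(flaky_task, n)
        print(f"N={n}: {successes}/{n} succeeded, interval {low:.2f} to {high:.2f}")
```

The interval tightens as N grows, which is what lets you say "this prompt change actually helped" instead of chasing one failing example at a time.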
TL;DR? Evals are important when working on non-deterministic things