{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "XYIE2WkNm1UE"
      },
      "source": [
        "# Range propagation\n",
        "[![View On GitHub](https://img.shields.io/badge/View_in_Github-grey?logo=github)](https://github.com/Qrlew/docs/blob/main/tutorials/range_propagation.ipynb)\n",
        "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Qrlew/pyqrlew/blob/main/examples/range_propagation.ipynb)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "MzGGFbQtm1UG"
      },
      "source": [
        "When one wants to release aggregate statistics with the guarantee that the output will not reveal anything about the individuals in the data, [differential privacy](https://en.wikipedia.org/wiki/Differential_privacy) is the way to go.\n",
        "Many [differentialy private mechanisms](https://en.wikipedia.org/wiki/Differential_privacy) consist of sums where each term is known to be bounded — so that the *sensitivity* is easy to compute — to which some noise is added, usually [Laplace](https://en.wikipedia.org/wiki/Additive_noise_differential_privacy_mechanisms#Laplace_Mechanism) or [Gaussian](https://en.wikipedia.org/wiki/Additive_noise_differential_privacy_mechanisms#Gaussian_Mechanism).\n",
        "For these mechanisms and others, it is crucial to be able to bound some values.\n",
        "\n",
        "*Bounding* can be achieved in many ways.\n",
        "\n",
        "- Bounds can be *forced* by clipping values, but then the computation of the statistics may be biased.\n",
        "- Bounds can be *inferred* by ranges propagation, a range of the values is propagated across successive transforms.\n",
        "\n",
        "A case where the tradeoff between *clipping* and *propagating ranges* is particularly difficult is the case of values with few remote outliers.\n",
        "If ranges are simply propagated, the presence of outliers forces the sensitivities to be large and therefore the noise added reduces drastically the utility of the result.\n",
        "To avoid adding too much noise, the values can be clipped so that the noise added is smaller, but then the outliers are dropped and the statistics are biased.\n",
        "\n",
        "In this notebook, we'll focus on *range propagation* using [`qrlew`](https://qrlew.github.io/)."
      ]
    },
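    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "To make the role of bounds concrete, here is a sketch of the Laplace mechanism applied to a bounded sum. The symbols $a$, $b$, $\\Delta$ and $\\varepsilon$ are our own notation, and we assume that neighboring datasets differ by adding or removing one individual who contributes a single term.\n",
        "\n",
        "$$x_i \\in [a, b] \\;\\Rightarrow\\; \\Delta = \\max(|a|, |b|), \\qquad \\tilde{S} = \\sum_i x_i + \\mathrm{Laplace}\\left(\\frac{\\Delta}{\\varepsilon}\\right)$$\n",
        "\n",
        "The noise scale grows linearly with $\\Delta$, which is why a single remote outlier that widens $[a, b]$ can degrade the accuracy of the whole result."
      ]
    },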
    {
      "cell_type": "code",
      "execution_count": 1,
      "metadata": {
        "id": "swnCxCAXm1UH"
      },
      "outputs": [],
      "source": [
        "%%capture\n",
        "!sudo apt-get -y -qq update\n",
        "!sudo apt-get -y -qq install graphviz\n",
        "!pip install graphviz\n",
        "!pip install pyqrlew"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 2,
      "metadata": {
        "id": "e0wcoOf2m1UG"
      },
      "outputs": [],
      "source": [
        "import logging\n",
        "logging.disable(logging.INFO)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ktdEFvqpm1UI"
      },
      "source": [
        "We load a csv extract of the [Kuzak Dempsy's dataset](https://data.world/kudem):"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 3,
      "metadata": {
        "id": "CwWxexE5m1UI"
      },
      "outputs": [],
      "source": [
        "import pyqrlew as pq\n",
        "from pyqrlew.io.utils import from_csv\n",
        "qdb = from_csv(\n",
        "    table_name=\"heart_data\",\n",
        "    csv_file=\"https://storage.googleapis.com/qrlew-demo-data/heart_data.csv\"\n",
        ")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 4,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 206
        },
        "id": "Bp3E4wwWm1UI",
        "outputId": "9d841bbc-7d41-42c2-d058-946f1ac99988"
      },
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>id</th>\n",
              "      <th>age</th>\n",
              "      <th>gender</th>\n",
              "      <th>height</th>\n",
              "      <th>weight</th>\n",
              "      <th>ap_hi</th>\n",
              "      <th>ap_lo</th>\n",
              "      <th>cholesterol</th>\n",
              "      <th>gluc</th>\n",
              "      <th>smoke</th>\n",
              "      <th>alco</th>\n",
              "      <th>active</th>\n",
              "      <th>cardio</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>0</td>\n",
              "      <td>18393</td>\n",
              "      <td>2</td>\n",
              "      <td>168</td>\n",
              "      <td>62.0</td>\n",
              "      <td>110</td>\n",
              "      <td>80</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>1</td>\n",
              "      <td>20228</td>\n",
              "      <td>1</td>\n",
              "      <td>156</td>\n",
              "      <td>85.0</td>\n",
              "      <td>140</td>\n",
              "      <td>90</td>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>2</td>\n",
              "      <td>18857</td>\n",
              "      <td>1</td>\n",
              "      <td>165</td>\n",
              "      <td>64.0</td>\n",
              "      <td>130</td>\n",
              "      <td>70</td>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>3</td>\n",
              "      <td>17623</td>\n",
              "      <td>2</td>\n",
              "      <td>169</td>\n",
              "      <td>82.0</td>\n",
              "      <td>150</td>\n",
              "      <td>100</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>4</td>\n",
              "      <td>17474</td>\n",
              "      <td>1</td>\n",
              "      <td>156</td>\n",
              "      <td>56.0</td>\n",
              "      <td>100</td>\n",
              "      <td>60</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>"
            ],
            "text/plain": [
              "   id    age  gender  height  weight  ap_hi  ap_lo  cholesterol  gluc  smoke  \\\n",
              "0   0  18393       2     168    62.0    110     80            1     1      0   \n",
              "1   1  20228       1     156    85.0    140     90            3     1      0   \n",
              "2   2  18857       1     165    64.0    130     70            3     1      0   \n",
              "3   3  17623       2     169    82.0    150    100            1     1      0   \n",
              "4   4  17474       1     156    56.0    100     60            1     1      0   \n",
              "\n",
              "   alco  active  cardio  \n",
              "0     0       1       0  \n",
              "1     0       1       1  \n",
              "2     0       0       1  \n",
              "3     0       1       1  \n",
              "4     0       0       0  "
            ]
          },
          "execution_count": 4,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "import pandas as pd\n",
        "pd.DataFrame(qdb.execute(\"SELECT * FROM heart_data\")).head()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "rzRsmbZ_m1UI"
      },
      "source": [
        "Qrlew transforms each SQL notion into a `Relation`, which is an intermediate representation that is well-suited for multiple query rewriting purposes.\n",
        "\n",
        "The `heart_data` table is turned into:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 5,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 274
        },
        "id": "U5N8dQpMm1UJ",
        "outputId": "e7a9ea1d-b7a5-4e3a-ac63-f207aa96a121"
      },
      "outputs": [
        {
          "data": {
            "image/svg+xml": [
              "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?>\n",
              "<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
              " \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
              "<!-- Generated by graphviz version 2.43.0 (0)\n",
              " -->\n",
              "<!-- Title: graph_gtdh Pages: 1 -->\n",
              "<svg width=\"229pt\" height=\"190pt\"\n",
              " viewBox=\"0.00 0.00 229.00 190.00\" xmlns=\"http://www.w3.org/2000/svg\" xmlns:xlink=\"http://www.w3.org/1999/xlink\">\n",
              "<g id=\"graph0\" class=\"graph\" transform=\"scale(1 1) rotate(0) translate(4 186)\">\n",
              "<title>graph_gtdh</title>\n",
              "<polygon fill=\"transparent\" stroke=\"transparent\" points=\"-4,4 -4,-186 225,-186 225,4 -4,4\"/>\n",
              "<!-- graph_gtdh -->\n",
              "<g id=\"node1\" class=\"node\">\n",
              "<title>graph_gtdh</title>\n",
              "<path fill=\"#ff1744\" stroke=\"#000000\" stroke-opacity=\"0.333333\" d=\"M209,-182C209,-182 12,-182 12,-182 6,-182 0,-176 0,-170 0,-170 0,-12 0,-12 0,-6 6,0 12,0 12,0 209,0 209,0 215,0 221,-6 221,-12 221,-12 221,-170 221,-170 221,-176 215,-182 209,-182\"/>\n",
              "<text text-anchor=\"start\" x=\"14\" y=\"-160.2\" font-family=\"Red Hat Display,sans-serif\" font-weight=\"bold\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">HEART_DATA size ∈ int{70000}</text>\n",
              "<text text-anchor=\"start\" x=\"76.5\" y=\"-149.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">id = id ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"66\" y=\"-138.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">age = age ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"48\" y=\"-127.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">gender = gender ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"51.5\" y=\"-116.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">height = height ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"44\" y=\"-105.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">weight = weight ∈ float</text>\n",
              "<text text-anchor=\"start\" x=\"57.5\" y=\"-94.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">ap_hi = ap_hi ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"57.5\" y=\"-83.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">ap_lo = ap_lo ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"25.5\" y=\"-72.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">cholesterol = cholesterol ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"63.5\" y=\"-61.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">gluc = gluc ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"50\" y=\"-50.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">smoke = smoke ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"63.5\" y=\"-39.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">alco = alco ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"52.5\" y=\"-28.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">active = active ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"52.5\" y=\"-17.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">cardio = cardio ∈ int</text>\n",
              "</g>\n",
              "</g>\n",
              "</svg>\n"
            ],
            "text/plain": [
              "<graphviz.sources.Source at 0x7f2f79dd64f0>"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        }
      ],
      "source": [
        "import graphviz\n",
        "\n",
        "ds = qdb.dataset()\n",
        "display(graphviz.Source(ds.relations()[0][1].dot()))\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "aZBu3cETm1UJ"
      },
      "source": [
        "The `Relation` object holds information about columns and their associated Qrlew [data types](https://github.com/Qrlew/qrlew/blob/b4960d57b7ac047b525c36b9cb9eb3395e0f4029/src/data_type/mod.rs#L2207) (including bounds).\n",
        "\n",
        "These Qrlew data types are transformed from the database's original types.\n",
        "Therefore, when importing data from sources like CSV files or pandas DataFrames that lack support for certain types (such as bytes or lists),\n",
        "you may sacrifice the detailed distinctions that Qrlew offers."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "unXS6lr9m1UJ"
      },
      "source": [
        "## Bound the columns of a table"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "2v0keN98m1UJ"
      },
      "source": [
        "We adopt the perspective of the data owner, who aims to safeguard user privacy. We focus on a subset of the heart_data containing four columns:\n",
        "\n",
        "- `id` (integer):  contains unique identifiers, which must remain confidential,\n",
        "- `gender` (integer): 1 (Male) or 2 (Female),\n",
        "- `height` (integer),\n",
        "- `weight` (integer).\n",
        "\n",
        "We need to bounds the `height` and `weight` columns.\n",
        "We can limit the height to a range of 140 to 200 and the weight to a range of 40 to 130 kg.\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "nYhXT6vnm1UJ"
      },
      "source": [
        "The data preparation involves creating a new dataset with specific column transformations:\n",
        "- `id` (integer): remains unchanged\n",
        "- `gender` (string): exclude values other than 1 or 2 then replace 0 with 'M', 1 with 'F',\n",
        "- `height` (float): exclude values outside the range [140.0, 200.0],\n",
        "- `weight` (float): exclude values outside the range [40.0, 130.0].\n",
        "\n",
        "This preparation can be transcripted as an SQL query:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 6,
      "metadata": {
        "id": "4tf4keJxm1UK"
      },
      "outputs": [],
      "source": [
        "query = \"\"\"\n",
        "WITH\n",
        "    bounds_table AS (SELECT\n",
        "        id,\n",
        "        height,\n",
        "        weight,\n",
        "        CASE WHEN id = 1 THEN 'M' ELSE 'F' END AS gender\n",
        "    FROM heart_data\n",
        "    WHERE\n",
        "        height > 140. AND height < 200. AND\n",
        "        weight > 40. AND weight < 130. AND\n",
        "        gender IN (1, 2)\n",
        "    )\n",
        "SELECT * FROM bounds_table\n",
        "\"\"\""
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "IFUxI1jSm1UK"
      },
      "source": [
        "We define a new `Relation` that mirrors the dataset's SQL query operation:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 7,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 588
        },
        "id": "nUxrF4sjm1UK",
        "outputId": "67abaf1e-e7e0-4d85-c655-97d47b6dd7ea"
      },
      "outputs": [
        {
          "data": {
            "image/svg+xml": [
              "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?>\n",
              "<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
              " \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
              "<!-- Generated by graphviz version 2.43.0 (0)\n",
              " -->\n",
              "<!-- Title: graph_663l Pages: 1 -->\n",
              "<svg width=\"641pt\" height=\"425pt\"\n",
              " viewBox=\"0.00 0.00 641.00 425.00\" xmlns=\"http://www.w3.org/2000/svg\" xmlns:xlink=\"http://www.w3.org/1999/xlink\">\n",
              "<g id=\"graph0\" class=\"graph\" transform=\"scale(1 1) rotate(0) translate(4 421)\">\n",
              "<title>graph_663l</title>\n",
              "<polygon fill=\"transparent\" stroke=\"transparent\" points=\"-4,4 -4,-421 637,-421 637,4 -4,4\"/>\n",
              "<!-- graph_663l -->\n",
              "<g id=\"node1\" class=\"node\">\n",
              "<title>graph_663l</title>\n",
              "<path fill=\"#428e92\" stroke=\"#000000\" stroke-opacity=\"0.333333\" d=\"M412,-417C412,-417 221,-417 221,-417 215,-417 209,-411 209,-405 209,-405 209,-346 209,-346 209,-340 215,-334 221,-334 221,-334 412,-334 412,-334 418,-334 424,-340 424,-346 424,-346 424,-405 424,-405 424,-411 418,-417 412,-417\"/>\n",
              "<text text-anchor=\"start\" x=\"223\" y=\"-395.2\" font-family=\"Red Hat Display,sans-serif\" font-weight=\"bold\" font-size=\"11.00\" fill=\"#000000\" fill-opacity=\"0.733333\">MAP_LTEW size ∈ int[0 70000]</text>\n",
              "<text text-anchor=\"start\" x=\"282.5\" y=\"-384.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#000000\" fill-opacity=\"0.733333\">id = id ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"231\" y=\"-373.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#000000\" fill-opacity=\"0.733333\">height = height ∈ int[140 200]</text>\n",
              "<text text-anchor=\"start\" x=\"227\" y=\"-362.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#000000\" fill-opacity=\"0.733333\">weight = weight ∈ float[40 130]</text>\n",
              "<text text-anchor=\"start\" x=\"235\" y=\"-351.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#000000\" fill-opacity=\"0.733333\">gender = gender ∈ str{F, M}</text>\n",
              "</g>\n",
              "<!-- graph_17bh -->\n",
              "<g id=\"node2\" class=\"node\">\n",
              "<title>graph_17bh</title>\n",
              "<path fill=\"#428e92\" stroke=\"#000000\" stroke-opacity=\"0.333333\" d=\"M621,-305C621,-305 12,-305 12,-305 6,-305 0,-299 0,-293 0,-293 0,-223 0,-223 0,-217 6,-211 12,-211 12,-211 621,-211 621,-211 627,-211 633,-217 633,-223 633,-223 633,-293 633,-293 633,-299 627,-305 621,-305\"/>\n",
              "<text text-anchor=\"start\" x=\"220.5\" y=\"-283.2\" font-family=\"Red Hat Display,sans-serif\" font-weight=\"bold\" font-size=\"11.00\" fill=\"#000000\" fill-opacity=\"0.733333\">MAP_GK0W size ∈ int[0 70000]</text>\n",
              "<text text-anchor=\"start\" x=\"282.5\" y=\"-272.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#000000\" fill-opacity=\"0.733333\">id = id ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"231\" y=\"-261.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#000000\" fill-opacity=\"0.733333\">height = height ∈ int[140 200]</text>\n",
              "<text text-anchor=\"start\" x=\"227\" y=\"-250.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#000000\" fill-opacity=\"0.733333\">weight = weight ∈ float[40 130]</text>\n",
              "<text text-anchor=\"start\" x=\"200.5\" y=\"-239.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#000000\" fill-opacity=\"0.733333\">gender = case((id = 1), M, F) ∈ str{F, M}</text>\n",
              "<text text-anchor=\"start\" x=\"14\" y=\"-228.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#000000\" fill-opacity=\"0.733333\">WHERE (((((height &gt; 140) and (height &lt; 200)) and (weight &gt; 40)) and (weight &lt; 130)) and (gender in (1, 2)))</text>\n",
              "</g>\n",
              "<!-- graph_663l&#45;&gt;graph_17bh -->\n",
              "<g id=\"edge2\" class=\"edge\">\n",
              "<title>graph_663l&#45;&gt;graph_17bh</title>\n",
              "<path fill=\"none\" stroke=\"#2b303a\" d=\"M316.5,-333.78C316.5,-327.87 316.5,-321.7 316.5,-315.55\"/>\n",
              "<polygon fill=\"#2b303a\" stroke=\"#2b303a\" points=\"320,-315.26 316.5,-305.26 313,-315.26 320,-315.26\"/>\n",
              "</g>\n",
              "<!-- graph_gtdh -->\n",
              "<g id=\"node3\" class=\"node\">\n",
              "<title>graph_gtdh</title>\n",
              "<path fill=\"#ff1744\" stroke=\"#000000\" stroke-opacity=\"0.333333\" d=\"M415,-182C415,-182 218,-182 218,-182 212,-182 206,-176 206,-170 206,-170 206,-12 206,-12 206,-6 212,0 218,0 218,0 415,0 415,0 421,0 427,-6 427,-12 427,-12 427,-170 427,-170 427,-176 421,-182 415,-182\"/>\n",
              "<text text-anchor=\"start\" x=\"220\" y=\"-160.2\" font-family=\"Red Hat Display,sans-serif\" font-weight=\"bold\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">HEART_DATA size ∈ int{70000}</text>\n",
              "<text text-anchor=\"start\" x=\"282.5\" y=\"-149.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">id = id ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"272\" y=\"-138.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">age = age ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"254\" y=\"-127.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">gender = gender ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"257.5\" y=\"-116.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">height = height ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"250\" y=\"-105.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">weight = weight ∈ float</text>\n",
              "<text text-anchor=\"start\" x=\"263.5\" y=\"-94.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">ap_hi = ap_hi ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"263.5\" y=\"-83.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">ap_lo = ap_lo ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"231.5\" y=\"-72.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">cholesterol = cholesterol ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"269.5\" y=\"-61.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">gluc = gluc ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"256\" y=\"-50.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">smoke = smoke ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"269.5\" y=\"-39.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">alco = alco ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"258.5\" y=\"-28.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">active = active ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"258.5\" y=\"-17.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">cardio = cardio ∈ int</text>\n",
              "</g>\n",
              "<!-- graph_17bh&#45;&gt;graph_gtdh -->\n",
              "<g id=\"edge1\" class=\"edge\">\n",
              "<title>graph_17bh&#45;&gt;graph_gtdh</title>\n",
              "<path fill=\"none\" stroke=\"#2b303a\" d=\"M316.5,-210.6C316.5,-204.72 316.5,-198.53 316.5,-192.2\"/>\n",
              "<polygon fill=\"#2b303a\" stroke=\"#2b303a\" points=\"320,-192.16 316.5,-182.16 313,-192.16 320,-192.16\"/>\n",
              "</g>\n",
              "</g>\n",
              "</svg>\n"
            ],
            "text/plain": [
              "<graphviz.sources.Source at 0x7f2f78e227f0>"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        }
      ],
      "source": [
        "relation = ds.relation(query)\n",
        "display(graphviz.Source(relation.dot()))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "9OuN3xE2m1UK"
      },
      "source": [
        "At the basis in red, we find the original table `heart_data` with all its columns.\n",
        "\n",
        "The first mapping operation involves selecting the `id`, `gender`, `height` and `weight` columns and coercing their types.\n",
        "\n",
        "The output `Relation` contains the four columns with their datatypes propagated:\n",
        "\n",
        "- The `gender` column has `str` type with only two possible values `M` and `F`.\n",
        "- The `height` and `weigh` columns contain bounded floats."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 8,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "09UoZUobm1UK",
        "outputId": "2b9c824b-2cb5-4311-f007-77dd82889542"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "The propagated datatype is: {id: int, height: int[140 200], weight: float[40 130], gender: str{F, M}}\n"
          ]
        }
      ],
      "source": [
        "print(f\"The propagated datatype is: {relation.schema()}\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Z2tOcItcm1UK"
      },
      "source": [
        "Importantly, these data types have been extended **independently, without necessitating any interaction with the database**.\n",
        "\n",
        "At this stage, the computation of aggregation sensitivity becomes feasible, but exclusively when such aggregation is executed on one of the initial columns.\n"
      ]
    },
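    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "As an illustration, assume that each individual contributes a single row and that neighboring datasets differ by adding or removing one row (both are assumptions made for this sketch, not statements about Qrlew's defaults). The propagated types then bound the sensitivity of sums over the initial columns:\n",
        "\n",
        "$$\\Delta_{\\mathrm{SUM(height)}} = \\max(|140|, |200|) = 200, \\qquad \\Delta_{\\mathrm{SUM(weight)}} = \\max(|40|, |130|) = 130$$\n",
        "\n",
        "Without the bounds, neither sensitivity would be finite."
      ]
    },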
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "HDRfsFwkm1UL"
      },
      "source": [
        "## Range propagation"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "AVh7ZIY2m1UL"
      },
      "source": [
        "When confronted with the aggregation of composite columns, one approach to gauge sensitivity involves employing the automatic bounds determination algorithm introduced by [Wilson et al. (2019)](https://arxiv.org/abs/1909.01917).\n",
        "\n",
        "However, a drawback of this method is that it consumes some of the privacy budget allocated for the aggregation.\n",
        "\n",
        "In the next section, we'll delve into how Qrlew extends the boundaries of the initial columns to composite columns without spending any privacy budget."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ebixYutIm1UL"
      },
      "source": [
        "As a first example, we reuse the previous query and compute the BMI (Body Mass Index) using the formula:\n",
        "$$BMI = \\frac{weight(kg)}{height(m)^2}$$"
      ]
    },
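    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The bounds of this composite column can be derived by hand with interval arithmetic. The sketch below assumes the bounds chosen earlier (height between 140 and 200 cm, weight between 40 and 130 kg); it illustrates the idea rather than describing Qrlew's exact algorithm.\n",
        "\n",
        "$$\\text{height (m)} \\in [1.4, 2.0] \\;\\Rightarrow\\; \\text{height (m)}^2 \\in [1.96, 4.0]$$\n",
        "\n",
        "$$\\text{BMI} \\in \\left[\\frac{40}{4.0}, \\frac{130}{1.96}\\right] \\approx [10, 66.33]$$\n",
        "\n",
        "The lower bound pairs the smallest weight with the largest height, and the upper bound pairs the largest weight with the smallest height."
      ]
    },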
    {
      "cell_type": "code",
      "execution_count": 9,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 664
        },
        "id": "3mgb3cLHm1UL",
        "outputId": "135a389b-2a7f-4622-cedc-a9f0939556d2"
      },
      "outputs": [
        {
          "data": {
            "image/svg+xml": [
              "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?>\n",
              "<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
              " \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
              "<!-- Generated by graphviz version 2.43.0 (0)\n",
              " -->\n",
              "<!-- Title: graph_8ic1 Pages: 1 -->\n",
              "<svg width=\"739pt\" height=\"482pt\"\n",
              " viewBox=\"0.00 0.00 739.00 482.00\" xmlns=\"http://www.w3.org/2000/svg\" xmlns:xlink=\"http://www.w3.org/1999/xlink\">\n",
              "<g id=\"graph0\" class=\"graph\" transform=\"scale(1 1) rotate(0) translate(4 478)\">\n",
              "<title>graph_8ic1</title>\n",
              "<polygon fill=\"transparent\" stroke=\"transparent\" points=\"-4,4 -4,-478 735,-478 735,4 -4,4\"/>\n",
              "<!-- graph_8ic1 -->\n",
              "<g id=\"node1\" class=\"node\">\n",
              "<title>graph_8ic1</title>\n",
              "<path fill=\"#428e92\" stroke=\"#000000\" stroke-opacity=\"0.333333\" d=\"M603.5,-474C603.5,-474 127.5,-474 127.5,-474 121.5,-474 115.5,-468 115.5,-462 115.5,-462 115.5,-436 115.5,-436 115.5,-430 121.5,-424 127.5,-424 127.5,-424 603.5,-424 603.5,-424 609.5,-424 615.5,-430 615.5,-436 615.5,-436 615.5,-462 615.5,-462 615.5,-468 609.5,-474 603.5,-474\"/>\n",
              "<text text-anchor=\"start\" x=\"273.5\" y=\"-452.2\" font-family=\"Red Hat Display,sans-serif\" font-weight=\"bold\" font-size=\"11.00\" fill=\"#000000\" fill-opacity=\"0.733333\">MAP_S_LZ size ∈ int[0 70000]</text>\n",
              "<text text-anchor=\"start\" x=\"129.5\" y=\"-441.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#000000\" fill-opacity=\"0.733333\">bmi = (weight / (height_in_meter * height_in_meter)) ∈ float[10 66.32653061224488]</text>\n",
              "</g>\n",
              "<!-- graph_ykng -->\n",
              "<g id=\"node2\" class=\"node\">\n",
              "<title>graph_ykng</title>\n",
              "<path fill=\"#428e92\" stroke=\"#000000\" stroke-opacity=\"0.333333\" d=\"M719,-395C719,-395 12,-395 12,-395 6,-395 0,-389 0,-383 0,-383 0,-346 0,-346 0,-340 6,-334 12,-334 12,-334 719,-334 719,-334 725,-334 731,-340 731,-346 731,-346 731,-383 731,-383 731,-389 725,-395 719,-395\"/>\n",
              "<text text-anchor=\"start\" x=\"272.5\" y=\"-373.2\" font-family=\"Red Hat Display,sans-serif\" font-weight=\"bold\" font-size=\"11.00\" fill=\"#000000\" fill-opacity=\"0.733333\">MAP_88Z0 size ∈ int[0 70000]</text>\n",
              "<text text-anchor=\"start\" x=\"14\" y=\"-362.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#000000\" fill-opacity=\"0.733333\">height_in_meter = (height * 0.01) ∈ float{1.4000000000000001, 1.41, 1.42, 1.43, 1.44, 1.45, 1.46, 1.47, 1.48, 1.49, 1.5, 1.51...</text>\n",
              "<text text-anchor=\"start\" x=\"276\" y=\"-351.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#000000\" fill-opacity=\"0.733333\">weight = weight ∈ float[40 130]</text>\n",
              "</g>\n",
              "<!-- graph_8ic1&#45;&gt;graph_ykng -->\n",
              "<g id=\"edge3\" class=\"edge\">\n",
              "<title>graph_8ic1&#45;&gt;graph_ykng</title>\n",
              "<path fill=\"none\" stroke=\"#2b303a\" d=\"M365.5,-423.69C365.5,-417.97 365.5,-411.72 365.5,-405.5\"/>\n",
              "<polygon fill=\"#2b303a\" stroke=\"#2b303a\" points=\"369,-405.18 365.5,-395.18 362,-405.18 369,-405.18\"/>\n",
              "</g>\n",
              "<!-- graph_17bh -->\n",
              "<g id=\"node3\" class=\"node\">\n",
              "<title>graph_17bh</title>\n",
              "<path fill=\"#428e92\" stroke=\"#000000\" stroke-opacity=\"0.333333\" d=\"M670,-305C670,-305 61,-305 61,-305 55,-305 49,-299 49,-293 49,-293 49,-223 49,-223 49,-217 55,-211 61,-211 61,-211 670,-211 670,-211 676,-211 682,-217 682,-223 682,-223 682,-293 682,-293 682,-299 676,-305 670,-305\"/>\n",
              "<text text-anchor=\"start\" x=\"269.5\" y=\"-283.2\" font-family=\"Red Hat Display,sans-serif\" font-weight=\"bold\" font-size=\"11.00\" fill=\"#000000\" fill-opacity=\"0.733333\">MAP_GK0W size ∈ int[0 70000]</text>\n",
              "<text text-anchor=\"start\" x=\"331.5\" y=\"-272.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#000000\" fill-opacity=\"0.733333\">id = id ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"280\" y=\"-261.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#000000\" fill-opacity=\"0.733333\">height = height ∈ int[140 200]</text>\n",
              "<text text-anchor=\"start\" x=\"276\" y=\"-250.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#000000\" fill-opacity=\"0.733333\">weight = weight ∈ float[40 130]</text>\n",
              "<text text-anchor=\"start\" x=\"249.5\" y=\"-239.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#000000\" fill-opacity=\"0.733333\">gender = case((id = 1), M, F) ∈ str{F, M}</text>\n",
              "<text text-anchor=\"start\" x=\"63\" y=\"-228.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#000000\" fill-opacity=\"0.733333\">WHERE (((((height &gt; 140) and (height &lt; 200)) and (weight &gt; 40)) and (weight &lt; 130)) and (gender in (1, 2)))</text>\n",
              "</g>\n",
              "<!-- graph_ykng&#45;&gt;graph_17bh -->\n",
              "<g id=\"edge2\" class=\"edge\">\n",
              "<title>graph_ykng&#45;&gt;graph_17bh</title>\n",
              "<path fill=\"none\" stroke=\"#2b303a\" d=\"M365.5,-333.83C365.5,-327.97 365.5,-321.64 365.5,-315.23\"/>\n",
              "<polygon fill=\"#2b303a\" stroke=\"#2b303a\" points=\"369,-315.15 365.5,-305.15 362,-315.15 369,-315.15\"/>\n",
              "</g>\n",
              "<!-- graph_gtdh -->\n",
              "<g id=\"node4\" class=\"node\">\n",
              "<title>graph_gtdh</title>\n",
              "<path fill=\"#ff1744\" stroke=\"#000000\" stroke-opacity=\"0.333333\" d=\"M464,-182C464,-182 267,-182 267,-182 261,-182 255,-176 255,-170 255,-170 255,-12 255,-12 255,-6 261,0 267,0 267,0 464,0 464,0 470,0 476,-6 476,-12 476,-12 476,-170 476,-170 476,-176 470,-182 464,-182\"/>\n",
              "<text text-anchor=\"start\" x=\"269\" y=\"-160.2\" font-family=\"Red Hat Display,sans-serif\" font-weight=\"bold\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">HEART_DATA size ∈ int{70000}</text>\n",
              "<text text-anchor=\"start\" x=\"331.5\" y=\"-149.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">id = id ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"321\" y=\"-138.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">age = age ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"303\" y=\"-127.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">gender = gender ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"306.5\" y=\"-116.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">height = height ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"299\" y=\"-105.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">weight = weight ∈ float</text>\n",
              "<text text-anchor=\"start\" x=\"312.5\" y=\"-94.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">ap_hi = ap_hi ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"312.5\" y=\"-83.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">ap_lo = ap_lo ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"280.5\" y=\"-72.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">cholesterol = cholesterol ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"318.5\" y=\"-61.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">gluc = gluc ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"305\" y=\"-50.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">smoke = smoke ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"318.5\" y=\"-39.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">alco = alco ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"307.5\" y=\"-28.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">active = active ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"307.5\" y=\"-17.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">cardio = cardio ∈ int</text>\n",
              "</g>\n",
              "<!-- graph_17bh&#45;&gt;graph_gtdh -->\n",
              "<g id=\"edge1\" class=\"edge\">\n",
              "<title>graph_17bh&#45;&gt;graph_gtdh</title>\n",
              "<path fill=\"none\" stroke=\"#2b303a\" d=\"M365.5,-210.6C365.5,-204.72 365.5,-198.53 365.5,-192.2\"/>\n",
              "<polygon fill=\"#2b303a\" stroke=\"#2b303a\" points=\"369,-192.16 365.5,-182.16 362,-192.16 369,-192.16\"/>\n",
              "</g>\n",
              "</g>\n",
              "</svg>\n"
            ],
            "text/plain": [
              "<graphviz.sources.Source at 0x7f2f79d94a90>"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        }
      ],
      "source": [
        "query = \"\"\"\n",
        "WITH\n",
        "    bounds_table AS (\n",
        "        SELECT\n",
        "            id,\n",
        "            height,\n",
        "            weight,\n",
        "            CASE WHEN id = 1 THEN 'M' ELSE 'F' END AS gender\n",
        "        FROM heart_data\n",
        "        WHERE\n",
        "            height > 140 AND height < 200 AND\n",
        "            weight > 40.00 AND weight < 130. AND\n",
        "            gender IN (1, 2)\n",
        "    ),\n",
        "    convert_table AS (\n",
        "        SELECT\n",
        "            height * 0.01 AS height_in_meter,\n",
        "            weight\n",
        "        FROM bounds_table\n",
        "    )\n",
        "SELECT weight / (height_in_meter * height_in_meter) AS bmi FROM convert_table\n",
        "\"\"\"\n",
        "relation = ds.relation(query)\n",
        "display(graphviz.Source(relation.dot()))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "3HVRBHEvm1UL"
      },
      "source": [
        "The two lower relations are the ones we previously had, which are fed into a new `Relation` that performs the conversion of the `height` column from centimeters to meters.\n",
        "\n",
        "The updated ranges are automatically calculated as `[140, 200] -> [1.4, 2.0]`.\n",
        "\n",
        "The BMI computation takes place in the uppermost `Relation`. Since weight and height are both positive, the corresponding range is:\n",
        "$$\n",
        "\\left[ \\frac{\\min(\\text{weight (kg)})}{\\max(\\text{height (m)}^2)}, \\frac{\\max(\\text{weight (kg)})}{\\min(\\text{height (m)}^2)} \\right]\n",
        "$$"
      ]
    },
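    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "As a quick check, plugging the propagated ranges shown in the graph above (`weight` in `[40, 130]` and `height_in_meter` in `[1.4, 2.0]`) into this formula gives:\n",
        "\n",
        "$$\n",
        "\\left[ \\frac{40}{2.0^2}, \\frac{130}{1.4^2} \\right] = \\left[ \\frac{40}{4}, \\frac{130}{1.96} \\right] \\approx [10, 66.33]\n",
        "$$\n",
        "\n",
        "This hand computation is only a sanity check of the interval arithmetic; it should agree with the range reported by the schema."
      ]
    },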
    {
      "cell_type": "code",
      "execution_count": 10,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 35
        },
        "id": "wayljlD2m1UL",
        "outputId": "207a8802-655d-49a5-87ec-720768a82f63"
      },
      "outputs": [
        {
          "data": {
            "text/plain": [
              "'{bmi: float[10 66.32653061224488]}'"
            ]
          },
          "execution_count": 10,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "relation.schema()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "09U0zzXGm1UM"
      },
      "source": [
        "This can be compared to the actual min and max computed on the data:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 11,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "YUvl7idWm1UM",
        "outputId": "a4d9c76f-3af0-47d2-c872-0fcfb645ddbd"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "bmi: [14.527376033057852, 63.975401706010715]\n"
          ]
        }
      ],
      "source": [
        "df = pd.DataFrame(qdb.eval(relation))\n",
        "print(f\"bmi: [{df['bmi'].min()}, {df['bmi'].max()}]\")\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "76ipghMZm1UM"
      },
      "source": [
        "We observe that the propagated bounds `[10, 66.33]` encompass the actual bounds `[14.53, 63.98]`. Because the propagated bounds are wider than necessary, a mechanism calibrated on them may add more noise than strictly needed. To add less noise, we can tighten the bounds; however, values falling outside the tighter bounds would then have to be clipped, which could introduce bias into the final result."
      ]
    },
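    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "To get a feel for the cost of the wider bounds, here is a rough worked example. It assumes, purely for illustration, that a sum of `bmi` values is released with the Laplace mechanism and that neighboring datasets differ by adding or removing one row, so that the sensitivity $\\Delta$ is the upper bound of the range and the noise scale is $\\Delta / \\varepsilon$. The symbols $\\Delta$ and $\\varepsilon$ are introduced here only for this sketch.\n",
        "\n",
        "$$\n",
        "\\frac{\\Delta_{\\text{propagated}}}{\\Delta_{\\text{actual}}} \\approx \\frac{66.33}{63.98} \\approx 1.04\n",
        "$$\n",
        "\n",
        "Under these assumptions, the propagated bound costs roughly 4% more noise than the actual maximum would. Note that the actual maximum is itself computed from the data, so it cannot be used as a bound without spending privacy budget."
      ]
    },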
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Uwnzvgexm1UM"
      },
      "source": [
        "Let us now consider another example. We want to compute the [Lorentz formula](https://link.springer.com/chapter/10.1007/978-3-211-89836-9_803), which estimates an ideal body weight from the height (in centimeters) and is given by:\n",
        "\n",
        "$$\n",
        "\\left\\{\n",
        "    \\begin{array}{ll}\n",
        "        0.75 \\times \\text{height} - 62.5 & \\text{if gender = 'M'} \\\\\n",
        "        0.60 \\times \\text{height} - 40.0 & \\text{if gender = 'F'} \\\\\n",
        "    \\end{array}\n",
        "\\right.\n",
        "$$\n",
        "\n",
        "**To illustrate the process of joining**, we calculate the Lorentz formula separately for males and females using two distinct common table expressions.\n",
        "Subsequently, we merge these tables through a join operation and retrieve the appropriate formula based on the individual's gender."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 12,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 892
        },
        "id": "l3XGslVXm1UM",
        "outputId": "7b502a5f-c01d-4b53-ebb1-f890b93dd93a"
      },
      "outputs": [
        {
          "data": {
            "image/svg+xml": [
              "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?>\n",
              "<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
              " \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
              "<!-- Generated by graphviz version 2.43.0 (0)\n",
              " -->\n",
              "<!-- Title: graph_r62l Pages: 1 -->\n",
              "<svg width=\"1504pt\" height=\"638pt\"\n",
              " viewBox=\"0.00 0.00 1503.50 638.00\" xmlns=\"http://www.w3.org/2000/svg\" xmlns:xlink=\"http://www.w3.org/1999/xlink\">\n",
              "<g id=\"graph0\" class=\"graph\" transform=\"scale(1 1) rotate(0) translate(4 634)\">\n",
              "<title>graph_r62l</title>\n",
              "<polygon fill=\"transparent\" stroke=\"transparent\" points=\"-4,4 -4,-634 1499.5,-634 1499.5,4 -4,4\"/>\n",
              "<!-- graph_r62l -->\n",
              "<g id=\"node1\" class=\"node\">\n",
              "<title>graph_r62l</title>\n",
              "<path fill=\"#428e92\" stroke=\"#000000\" stroke-opacity=\"0.333333\" d=\"M1110.5,-630C1110.5,-630 409.5,-630 409.5,-630 403.5,-630 397.5,-624 397.5,-618 397.5,-618 397.5,-592 397.5,-592 397.5,-586 403.5,-580 409.5,-580 409.5,-580 1110.5,-580 1110.5,-580 1116.5,-580 1122.5,-586 1122.5,-592 1122.5,-592 1122.5,-618 1122.5,-618 1122.5,-624 1116.5,-630 1110.5,-630\"/>\n",
              "<text text-anchor=\"start\" x=\"647\" y=\"-608.2\" font-family=\"Red Hat Display,sans-serif\" font-weight=\"bold\" font-size=\"11.00\" fill=\"#000000\" fill-opacity=\"0.733333\">MAP_N6O9 size ∈ int[0 4900000000]</text>\n",
              "<text text-anchor=\"start\" x=\"411.5\" y=\"-597.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#000000\" fill-opacity=\"0.733333\">lorentz = case((field_ey3i = M), field_jvtu, field_koth) ∈ option(float{42.5, 43.25, 44, 44.599999999999994, 44.75, 45.2, 45....</text>\n",
              "</g>\n",
              "<!-- graph_o91d -->\n",
              "<g id=\"node2\" class=\"node\">\n",
              "<title>graph_o91d</title>\n",
              "<path fill=\"#ff616f\" stroke=\"#000000\" stroke-opacity=\"0.333333\" d=\"M1134,-551C1134,-551 386,-551 386,-551 380,-551 374,-545 374,-539 374,-539 374,-458 374,-458 374,-452 380,-446 386,-446 386,-446 1134,-446 1134,-446 1140,-446 1146,-452 1146,-458 1146,-458 1146,-539 1146,-539 1146,-545 1140,-551 1134,-551\"/>\n",
              "<text text-anchor=\"start\" x=\"649.5\" y=\"-529.2\" font-family=\"Red Hat Display,sans-serif\" font-weight=\"bold\" font-size=\"11.00\" fill=\"#000000\" fill-opacity=\"0.733333\">JOIN__YSG size ∈ int[0 4900000000]</text>\n",
              "<text text-anchor=\"start\" x=\"683\" y=\"-518.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#000000\" fill-opacity=\"0.733333\">field_p7b0 = _LEFT_.id ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"652\" y=\"-507.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#000000\" fill-opacity=\"0.733333\">field_ey3i = _LEFT_.gender ∈ str{F, M}</text>\n",
              "<text text-anchor=\"start\" x=\"388\" y=\"-496.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#000000\" fill-opacity=\"0.733333\">field_koth = _LEFT_.female_lorentz ∈ float{44, 44.599999999999994, 45.2, 45.8, 46.39999999999999, 47, 47.599999999999994, 48....</text>\n",
              "<text text-anchor=\"start\" x=\"658\" y=\"-485.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#000000\" fill-opacity=\"0.733333\">field_cq9z = _RIGHT_.id ∈ option(int)</text>\n",
              "<text text-anchor=\"start\" x=\"412.5\" y=\"-474.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#000000\" fill-opacity=\"0.733333\">field_jvtu = _RIGHT_.male_lorentz ∈ option(float{42.5, 43.25, 44, 44.75, 45.5, 46.25, 47, 47.75, 48.5, 49.25, 50, 50.75, 51.5...</text>\n",
              "<text text-anchor=\"start\" x=\"668\" y=\"-463.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#000000\" fill-opacity=\"0.733333\">LEFT ON (_LEFT_.id = _RIGHT_.id)</text>\n",
              "</g>\n",
              "<!-- graph_r62l&#45;&gt;graph_o91d -->\n",
              "<g id=\"edge6\" class=\"edge\">\n",
              "<title>graph_r62l&#45;&gt;graph_o91d</title>\n",
              "<path fill=\"none\" stroke=\"#2b303a\" d=\"M760,-579.76C760,-574.14 760,-567.91 760,-561.48\"/>\n",
              "<polygon fill=\"#2b303a\" stroke=\"#2b303a\" points=\"763.5,-561.25 760,-551.25 756.5,-561.25 763.5,-561.25\"/>\n",
              "</g>\n",
              "<!-- graph_aaxr -->\n",
              "<g id=\"node3\" class=\"node\">\n",
              "<title>graph_aaxr</title>\n",
              "<path fill=\"#428e92\" stroke=\"#000000\" stroke-opacity=\"0.333333\" d=\"M754,-417C754,-417 12,-417 12,-417 6,-417 0,-411 0,-405 0,-405 0,-357 0,-357 0,-351 6,-345 12,-345 12,-345 754,-345 754,-345 760,-345 766,-351 766,-357 766,-357 766,-405 766,-405 766,-411 760,-417 754,-417\"/>\n",
              "<text text-anchor=\"start\" x=\"290.5\" y=\"-395.2\" font-family=\"Red Hat Display,sans-serif\" font-weight=\"bold\" font-size=\"11.00\" fill=\"#000000\" fill-opacity=\"0.733333\">MAP_26EE size ∈ int[0 70000]</text>\n",
              "<text text-anchor=\"start\" x=\"349\" y=\"-384.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#000000\" fill-opacity=\"0.733333\">id = id ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"301.5\" y=\"-373.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#000000\" fill-opacity=\"0.733333\">gender = gender ∈ str{F, M}</text>\n",
              "<text text-anchor=\"start\" x=\"14\" y=\"-362.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#000000\" fill-opacity=\"0.733333\">female_lorentz = ((0.6 * height) &#45; 40) ∈ float{44, 44.599999999999994, 45.2, 45.8, 46.39999999999999, 47, 47.599999999999994,...</text>\n",
              "</g>\n",
              "<!-- graph_o91d&#45;&gt;graph_aaxr -->\n",
              "<g id=\"edge4\" class=\"edge\">\n",
              "<title>graph_o91d&#45;&gt;graph_aaxr</title>\n",
              "<path fill=\"none\" stroke=\"#2b303a\" d=\"M591.9,-446C563.39,-437.27 534.26,-428.34 507.14,-420.03\"/>\n",
              "<polygon fill=\"#2b303a\" stroke=\"#2b303a\" points=\"507.84,-416.58 497.25,-417 505.78,-423.28 507.84,-416.58\"/>\n",
              "</g>\n",
              "<!-- graph_eh46 -->\n",
              "<g id=\"node4\" class=\"node\">\n",
              "<title>graph_eh46</title>\n",
              "<path fill=\"#428e92\" stroke=\"#000000\" stroke-opacity=\"0.333333\" d=\"M1483.5,-411.5C1483.5,-411.5 792.5,-411.5 792.5,-411.5 786.5,-411.5 780.5,-405.5 780.5,-399.5 780.5,-399.5 780.5,-362.5 780.5,-362.5 780.5,-356.5 786.5,-350.5 792.5,-350.5 792.5,-350.5 1483.5,-350.5 1483.5,-350.5 1489.5,-350.5 1495.5,-356.5 1495.5,-362.5 1495.5,-362.5 1495.5,-399.5 1495.5,-399.5 1495.5,-405.5 1489.5,-411.5 1483.5,-411.5\"/>\n",
              "<text text-anchor=\"start\" x=\"1044.5\" y=\"-389.7\" font-family=\"Red Hat Display,sans-serif\" font-weight=\"bold\" font-size=\"11.00\" fill=\"#000000\" fill-opacity=\"0.733333\">MAP_4N0T size ∈ int[0 70000]</text>\n",
              "<text text-anchor=\"start\" x=\"1104\" y=\"-378.7\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#000000\" fill-opacity=\"0.733333\">id = id ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"794.5\" y=\"-367.7\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#000000\" fill-opacity=\"0.733333\">male_lorentz = ((0.75 * height) &#45; 62.5) ∈ float{42.5, 43.25, 44, 44.75, 45.5, 46.25, 47, 47.75, 48.5, 49.25, 50, 50.75, 51.5,...</text>\n",
              "</g>\n",
              "<!-- graph_o91d&#45;&gt;graph_eh46 -->\n",
              "<g id=\"edge5\" class=\"edge\">\n",
              "<title>graph_o91d&#45;&gt;graph_eh46</title>\n",
              "<path fill=\"none\" stroke=\"#2b303a\" d=\"M928.55,-446C963.52,-435.31 999.43,-424.34 1031.43,-414.56\"/>\n",
              "<polygon fill=\"#2b303a\" stroke=\"#2b303a\" points=\"1032.69,-417.84 1041.23,-411.57 1030.64,-411.14 1032.69,-417.84\"/>\n",
              "</g>\n",
              "<!-- graph_daib -->\n",
              "<g id=\"node5\" class=\"node\">\n",
              "<title>graph_daib</title>\n",
              "<path fill=\"#428e92\" stroke=\"#000000\" stroke-opacity=\"0.333333\" d=\"M1064.5,-316C1064.5,-316 455.5,-316 455.5,-316 449.5,-316 443.5,-310 443.5,-304 443.5,-304 443.5,-223 443.5,-223 443.5,-217 449.5,-211 455.5,-211 455.5,-211 1064.5,-211 1064.5,-211 1070.5,-211 1076.5,-217 1076.5,-223 1076.5,-223 1076.5,-304 1076.5,-304 1076.5,-310 1070.5,-316 1064.5,-316\"/>\n",
              "<text text-anchor=\"start\" x=\"666.5\" y=\"-294.2\" font-family=\"Red Hat Display,sans-serif\" font-weight=\"bold\" font-size=\"11.00\" fill=\"#000000\" fill-opacity=\"0.733333\">MAP_205O size ∈ int[0 70000]</text>\n",
              "<text text-anchor=\"start\" x=\"726\" y=\"-283.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#000000\" fill-opacity=\"0.733333\">id = id ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"715.5\" y=\"-272.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#000000\" fill-opacity=\"0.733333\">age = age ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"674.5\" y=\"-261.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#000000\" fill-opacity=\"0.733333\">height = height ∈ int[140 200]</text>\n",
              "<text text-anchor=\"start\" x=\"670.5\" y=\"-250.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#000000\" fill-opacity=\"0.733333\">weight = weight ∈ float[40 130]</text>\n",
              "<text text-anchor=\"start\" x=\"630\" y=\"-239.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#000000\" fill-opacity=\"0.733333\">gender = case((gender = 1), M, F) ∈ str{F, M}</text>\n",
              "<text text-anchor=\"start\" x=\"457.5\" y=\"-228.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#000000\" fill-opacity=\"0.733333\">WHERE (((((height &gt; 140) and (height &lt; 200)) and (weight &gt; 40)) and (weight &lt; 130)) and (gender in (1, 2)))</text>\n",
              "</g>\n",
              "<!-- graph_aaxr&#45;&gt;graph_daib -->\n",
              "<g id=\"edge3\" class=\"edge\">\n",
              "<title>graph_aaxr&#45;&gt;graph_daib</title>\n",
              "<path fill=\"none\" stroke=\"#2b303a\" d=\"M497.39,-344.95C524.16,-336.75 553.3,-327.83 582.1,-319\"/>\n",
              "<polygon fill=\"#2b303a\" stroke=\"#2b303a\" points=\"583.32,-322.29 591.86,-316.01 581.27,-315.6 583.32,-322.29\"/>\n",
              "</g>\n",
              "<!-- graph_eh46&#45;&gt;graph_daib -->\n",
              "<g id=\"edge2\" class=\"edge\">\n",
              "<title>graph_eh46&#45;&gt;graph_daib</title>\n",
              "<path fill=\"none\" stroke=\"#2b303a\" d=\"M1041.15,-350.41C1009.86,-340.84 974.1,-329.92 938.82,-319.14\"/>\n",
              "<polygon fill=\"#2b303a\" stroke=\"#2b303a\" points=\"939.48,-315.68 928.89,-316.11 937.43,-322.37 939.48,-315.68\"/>\n",
              "</g>\n",
              "<!-- graph_gtdh -->\n",
              "<g id=\"node6\" class=\"node\">\n",
              "<title>graph_gtdh</title>\n",
              "<path fill=\"#ff1744\" stroke=\"#000000\" stroke-opacity=\"0.333333\" d=\"M858.5,-182C858.5,-182 661.5,-182 661.5,-182 655.5,-182 649.5,-176 649.5,-170 649.5,-170 649.5,-12 649.5,-12 649.5,-6 655.5,0 661.5,0 661.5,0 858.5,0 858.5,0 864.5,0 870.5,-6 870.5,-12 870.5,-12 870.5,-170 870.5,-170 870.5,-176 864.5,-182 858.5,-182\"/>\n",
              "<text text-anchor=\"start\" x=\"663.5\" y=\"-160.2\" font-family=\"Red Hat Display,sans-serif\" font-weight=\"bold\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">HEART_DATA size ∈ int{70000}</text>\n",
              "<text text-anchor=\"start\" x=\"726\" y=\"-149.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">id = id ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"715.5\" y=\"-138.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">age = age ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"697.5\" y=\"-127.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">gender = gender ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"701\" y=\"-116.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">height = height ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"693.5\" y=\"-105.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">weight = weight ∈ float</text>\n",
              "<text text-anchor=\"start\" x=\"707\" y=\"-94.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">ap_hi = ap_hi ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"707\" y=\"-83.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">ap_lo = ap_lo ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"675\" y=\"-72.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">cholesterol = cholesterol ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"713\" y=\"-61.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">gluc = gluc ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"699.5\" y=\"-50.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">smoke = smoke ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"713\" y=\"-39.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">alco = alco ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"702\" y=\"-28.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">active = active ∈ int</text>\n",
              "<text text-anchor=\"start\" x=\"702\" y=\"-17.2\" font-family=\"Red Hat Display,sans-serif\" font-size=\"11.00\" fill=\"#ffffff\" fill-opacity=\"0.733333\">cardio = cardio ∈ int</text>\n",
              "</g>\n",
              "<!-- graph_daib&#45;&gt;graph_gtdh -->\n",
              "<g id=\"edge1\" class=\"edge\">\n",
              "<title>graph_daib&#45;&gt;graph_gtdh</title>\n",
              "<path fill=\"none\" stroke=\"#2b303a\" d=\"M760,-210.88C760,-204.9 760,-198.65 760,-192.29\"/>\n",
              "<polygon fill=\"#2b303a\" stroke=\"#2b303a\" points=\"763.5,-192.23 760,-182.23 756.5,-192.23 763.5,-192.23\"/>\n",
              "</g>\n",
              "</g>\n",
              "</svg>\n"
            ],
            "text/plain": [
              "<graphviz.sources.Source at 0x7f2f79cd78e0>"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        }
      ],
      "source": [
        "query = \"\"\"\n",
        "WITH\n",
        "    bounds_table AS (\n",
        "        SELECT\n",
        "            id,\n",
        "            age,\n",
        "            height,\n",
        "            weight,\n",
        "            CASE WHEN gender = 1 THEN 'M' ELSE 'F' END AS gender\n",
        "        FROM heart_data\n",
        "        WHERE\n",
        "            height > 140 AND height < 200 AND\n",
        "            weight > 40.00 AND weight < 130. AND\n",
        "            gender IN (1, 2)\n",
        "    ),\n",
        "    female_table AS (\n",
        "        SELECT\n",
        "            id,\n",
        "            gender,\n",
        "            0.6 * height - 40.0 AS female_lorentz\n",
        "        FROM bounds_table\n",
        "    ),\n",
        "    male_table AS (\n",
        "        SELECT\n",
        "            id,\n",
        "            0.75 * height - 62.5 AS male_lorentz\n",
        "        FROM bounds_table\n",
        "    )\n",
        "SELECT\n",
        "    CASE WHEN gender = 'M' THEN male_lorentz ELSE female_lorentz END AS lorentz\n",
        "FROM female_table\n",
        "LEFT JOIN male_table ON female_table.id = male_table.id\n",
        "\"\"\"\n",
        "relation = ds.relation(query)\n",
        "\n",
        "display(graphviz.Source(relation.dot()))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "HrDD8tkXm1UM"
      },
      "source": [
        "Once more, in this example, you can follow the ranges as they propagate through all the relations.\n",
        "\n",
        "Finally, the propagated ranges for the Lorentz formula are:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 13,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 107
        },
        "id": "wWRw3r4nm1UN",
        "outputId": "10966293-f7f2-4b6b-beaf-29c2617645e6"
      },
      "outputs": [
        {
          "data": {
            "text/plain": [
              "'{lorentz: option(float{42.5, 43.25, 44, 44.599999999999994, 44.75, 45.2, 45.5, 45.8, 46.25, 46.39999999999999, 47, 47.599999999999994, 47.75, 48.2, 48.5, 48.8, 49.25, 49.39999999999999, 50, 50.599999999999994, 50.75, 51.2, 51.5, 51.8, 52.25, 52.39999999999999, 53, 53.599999999999994, 53.75, 54.2, 54.5, 54.8, 55.25, 55.39999999999999, 56, 56.599999999999994, 56.75, 57.2, 57.5, 57.8, 58.25, 58.39999999999999, 59, 59.599999999999994, 59.75, 60.2, 60.5, 60.8, 61.25, 61.39999999999999, 62, 62.599999999999994, 62.75, 63.2, 63.5, 63.8, 64.25, 64.39999999999999, 65, 65.6, 65.75, 66.2, 66.5, 66.8, 67.25, 67.39999999999999, 68, 68.6, 68.75, 69.2, 69.5, 69.8, 70.25, 70.39999999999999, 71, 71.6, 71.75, 72.2, 72.5, 72.8, 73.25, 73.39999999999999, 74, 74.6, 74.75, 75.19999999999999, 75.5, 75.8, 76.25, 76.39999999999999, 77, 77.6, 77.75, 78.19999999999999, 78.5, 78.8, 79.25, 79.39999999999999, 80, 80.75, 81.5, 82.25, 83, 83.75, 84.5, 85.25, 86, 86.75, 87.5})}'"
            ]
          },
          "execution_count": 13,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "relation.schema()"
      ]
    },
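    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The extreme values of this set can be checked by hand. Both formulas are increasing in `height`, so each one reaches its extremes at the ends of the propagated range `[140, 200]`:\n",
        "\n",
        "$$\n",
        "\\begin{array}{ll}\n",
        "    \\text{female:} & [0.60 \\times 140 - 40.0, 0.60 \\times 200 - 40.0] = [44, 80] \\\\\n",
        "    \\text{male:} & [0.75 \\times 140 - 62.5, 0.75 \\times 200 - 62.5] = [42.5, 87.5] \\\\\n",
        "\\end{array}\n",
        "$$\n",
        "\n",
        "The `CASE` expression can return either branch, so the propagated type is the union of the two sets, whose smallest and largest elements are 42.5 and 87.5. The `option(...)` wrapper appears to reflect the fact that `male_lorentz` comes from the right side of the `LEFT JOIN` and may therefore be missing."
      ]
    },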
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "dOBl5-p_m1UN"
      },
      "source": [
        "The actual range, computed on the data, is:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 14,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "nyGB7fxLm1UN",
        "outputId": "77f70fa4-9c65-47c9-a582-d8f75d67fae8"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "lorentz: [43.25, 86.0]\n"
          ]
        }
      ],
      "source": [
        "# Evaluate the relation and report the observed min and max of the column\n",
        "df = pd.DataFrame(qdb.eval(relation))\n",
        "print(f\"lorentz: [{df['lorentz'].min()}, {df['lorentz'].max()}]\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "bjrul39Rm1UN"
      },
      "source": [
        "When columns are strongly correlated, propagated ranges may not be the best choice.\n",
        "The actual range may depend on strong correlations that cannot be seen from outside the data, so the propagated range can be much wider than the actual one.\n",
        "In such cases, it can be worth spending part of the privacy budget to compute ranges with the automatic boundary determination algorithm ([Wilson et al. 2019](https://arxiv.org/abs/1909.01917))."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "3RgBQGUKm1UN"
      },
      "source": [
        "However, how can an analyst without unrestricted access to the database decide whether to spend budget on computing the ranges?\n",
        "\n",
        "This is where Sarus helps.\n",
        "\n",
        "With [Sarus](https://www.sarus.tech/), the analyst gets **access to a synthetic dataset** that replicates the correlations between columns.\n",
        "\n",
        "The analyst can then **devise a strategy without spending any privacy budget**, by inspecting the synthetic dataset and making informed decisions."
      ]
    }
  ],
  "metadata": {
    "colab": {
      "provenance": []
    },
    "kernelspec": {
      "display_name": "myenv",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.8.18"
    },
    "orig_nbformat": 4
  },
  "nbformat": 4,
  "nbformat_minor": 0
}